(schema_documentation)=

# XML Schema Reference

This page provides automatically generated documentation for the NexusLIMS XML schema used to represent experimental records.

## Introduction

NexusLIMS generates output experimental records as XML documents that conform to the [**Nexus Experiment Schema**](https://doi.org/10.18434/M32245) (XSD). These records can then be curated in the [NexusLIMS CDCS](https://github.com/datasophos/nexuslims-cdcs/) frontend application, and capture comprehensive metadata about microscopy sessions, including:

- **Session information**: Who performed the experiment, when, and why
- **Instrument details**: Which microscope(s) were used and their configurations
- **Acquisition activities**: Temporal grouping of related data files
- **Dataset metadata**: Comprehensive technical parameters for each file

### From Internal to Output XML Representations

```{versionadded} 2.2.0
Starting with v2.2.0, NexusLIMS uses a **two-stage metadata pipeline**. This validation and standardization system is still evolving as EM Glossary coverage expands and community standards develop.
```

1. **Extraction & Validation (Internal)**: NexusLIMS extractors produce metadata dictionaries validated against internal Pydantic schemas ({py:class}`~nexusLIMS.schemas.metadata.NexusMetadata`, {py:class}`~nexusLIMS.schemas.metadata.ImageMetadata`, {py:class}`~nexusLIMS.schemas.metadata.SpectrumMetadata`, etc.) with:
   - EM Glossary standardized fields (where available; the vocabulary is still developing)
   - Physical quantities with units (via Pint)
   - Basic validation and unit normalization
2. **Serialization to XML**: Validated metadata is transformed into XML following the [*Nexus Experiment Schema*](https://doi.org/10.18434/M32245) with:
   - Quantity fields → XML elements with `unit` attributes
   - Extension fields → child elements in `<extensions>` blocks
   - EM Glossary metadata → `emg:display_name` and `emg:id` attributes on quantity elements (where EMG terms are available)

#### Example Transformation

```python
# Pydantic metadata (Python objects)
{
    "acceleration_voltage": Quantity(200, "kV"),
    "magnification": Quantity(50, "kiloX"),
    "extensions": {
        "detector_brightness": 50.0,
    }
}
```

#### Metadata Serialized to XML (within a Dataset element)

```xml
micrograph_001.tif /path/to/data/micrograph_001.tif image/tiff 200.0 50 50.0
```

This two-stage approach provides:

- **Basic type safety** during extraction (Pydantic validation)
- **Emerging standardization** via EM Glossary (where terms are available)
- **Interoperability** through standard XML format
- **Semantic enrichment** with EMG metadata attributes (where EMG IDs are available)

See {doc}`nexuslims_internal_schema` for complete details on the Pydantic schema system and {doc}`em_glossary_reference` for EM Glossary integration.

## Nexus Experiment Schema

The Nexus Experiment schema defines the final output structure of XML records generated by NexusLIMS. Each record describes a microscopy session, including metadata about the experiment, instruments used, and data files acquired.

```{eval-rst}
.. xsddoc:: ../nexusLIMS/schemas/nexus-experiment.xsd
```

## Example Record

Here is a simplified example of a complete NexusLIMS XML record showing the key structural elements:

```xml
STEM imaging session - nanoparticle characterization session_20240115_jsmith Dr. Jane Smith John Doe FEI Titan TEM 2024-01-15T10:00:00-05:00 2024-01-15T12:00:00-05:00 Characterize particle size distribution and elemental composition of Au-TiO2 nanoparticles Gold-Titania Nanoparticles Batch 5 Au nanoparticles supported on TiO2, synthesized January 2024 Nanoparticle Catalysis Study NP-2024-001 2024-01-15T10:30:00-05:00 Au-TiO2-batch5 STEM HAADF overview_001.dm3 2024/01/15/overview_001.dm3 application/gatan-dm3 Overview STEM image showing particle distribution 2024/01/15/overview_001.dm3.thumb.png STEM_Imaging 2024-01-15T10:30:45-05:00 200.0 500 50000 5.2 10.0 1.25 -0.75 0.0 BM-Ceta
```

**Key Elements:**

- **`<Experiment>`**: Root element containing the entire record
- **`<summary>`**: Session-level metadata (who, what, why)
- **`<instrument>`**: Instrument identification and location
- **`<AcquisitionActivity>`**: Temporal grouping of related files
- **`<dataset>`**: Individual file metadata
- **`<acquisition>`**: Technical parameters with units and EM Glossary attributes
- **`<extensions>`**: Vendor-specific metadata not in the standard schema
- **`<preview>`**: Thumbnail image location

## Navigating the XSD Documentation

The auto-generated XSD documentation above provides comprehensive details on all schema elements and types. Here's how to navigate it effectively:

### Schema Structure

The Nexus Experiment Schema is organized hierarchically:

1. **Root Element**: `Experiment` - Top-level container for all record data
2. **Session Metadata**: `summary` complex type - Who, when, what, why
3. **Instrument Information**: `instrument` complex type - Microscope details
4. **Activities**: `AcquisitionActivity` complex type - Temporal file groupings
5. **Datasets**: `dataset` complex type - Individual file metadata
6. **Acquisition Parameters**: `acquisition` complex type - Technical metadata with units

### Finding Specific Information

**To find required vs.
optional elements:**

- Look for `minOccurs="0"` (optional) or `minOccurs="1"` (required)
- Elements without `minOccurs` specified default to `minOccurs="1"` (required)

**To understand element types:**

- Simple types (string, integer, etc.) are defined inline
- Complex types (nested elements) reference type definitions (e.g., `type="datasetType"`)
- Click type references to see the full structure

**To see allowed values:**

- Enumerated types use `<xs:restriction>` with `<xs:enumeration>` values
- Example: `DatasetType` allows only: Image, Spectrum, SpectrumImage, Diffraction, Misc, Unknown

**To understand attributes:**

- Many quantity elements include a `unit` attribute for physical units
- EM Glossary elements include optional `emg:display_name` and `emg:id` attributes

### Common Elements Reference

| **Element Path** | **Purpose** | **Required** |
|------------------|-------------|--------------|
| `Experiment/summary/title` | Session title | Yes |
| `Experiment/summary/experimenter` | User information | Yes |
| `Experiment/instrument` | Microscope used | Yes |
| `Experiment/AcquisitionActivity` | File grouping | Yes (1+) |
| `AcquisitionActivity/dataset` | Individual file | Yes (1+) |
| `dataset/acquisition` | Technical metadata | No |
| `dataset/acquisition/extensions` | Vendor-specific data | No |

### Physical Quantities with Units

Metadata elements representing physical quantities use the `meta` element with a `unit` attribute:

```xml
<meta name="…" unit="…">200.0</meta>
<meta name="…" unit="…">5.2</meta>
<meta name="…" unit="…">10.0</meta>
```

- **`name`** attribute: Human-readable parameter name (often from EM Glossary)
- **`unit`** attribute: Unit symbol (e.g., `"kV"`, `"nm"`, `"s"`)
- **Element value**: Numeric value in the specified unit

For dimensionless quantities, the `unit` attribute is omitted:

```xml
<meta name="…">50000</meta>
```

Common units:

- Voltages: `kV` (kilovolts)
- Distances: `nm` (nanometers), `mm` (millimeters), `um` (micrometers)
- Times: `s` (seconds), `ms` (milliseconds), `us` (microseconds)
- Energies: `eV` (electron volts), `keV` (kiloelectron volts)
- Angles: `deg` (degrees), `rad` (radians), `mrad` (milliradians)

See {doc}`nexuslims_internal_schema` for the complete list of preferred units and unit handling details.

## See Also

- {doc}`nexuslims_internal_schema` - Internal backend metadata schemas and validation
- {doc}`em_glossary_reference` - EM Glossary field mappings
- {doc}`../user_guide/record_building` - How records are generated
- {doc}`../user_guide/taxonomy` - Data type classification
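
The `meta`/`unit` convention described under *Physical Quantities with Units* can be read back with Python's standard library. The following is a minimal sketch and not part of NexusLIMS: the sample fragment, its parameter names, and the `read_quantities` helper are hypothetical, and it assumes un-namespaced element names, which may not hold for real records.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the parameter names and values are illustrative only
# and are not taken from a real NexusLIMS record.
SAMPLE = """
<dataset>
  <meta name="Acceleration Voltage" unit="kV">200.0</meta>
  <meta name="Dwell Time" unit="us">5.2</meta>
  <meta name="Magnification">50000</meta>
</dataset>
""".strip()


def read_quantities(xml_text):
    """Return {name: (value, unit)} for every numeric <meta> element.

    The unit is None when the `unit` attribute is omitted, which the
    schema documentation uses for dimensionless quantities.
    """
    root = ET.fromstring(xml_text)
    quantities = {}
    for meta in root.iter("meta"):
        text = (meta.text or "").strip()
        try:
            value = float(text)
        except ValueError:
            continue  # skip <meta> elements whose value is not numeric
        quantities[meta.get("name")] = (value, meta.get("unit"))
    return quantities


quantities = read_quantities(SAMPLE)
for name, (value, unit) in quantities.items():
    print(name, value, unit or "(dimensionless)")
```

Returning the unit as `None` rather than an empty string keeps dimensionless quantities easy to tell apart from quantities whose unit is merely unknown to the caller.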