nexusLIMS.schemas.metadata#
Type-specific metadata schemas for NexusLIMS extractor plugins.
This module defines Pydantic models for validating metadata extracted from different types of microscopy data (Image, Spectrum, SpectrumImage, Diffraction). Each schema uses Pint Quantity objects for physical measurements, EM Glossary field names, and supports flexible extension fields.
The schemas follow a hierarchical structure:
NexusMetadata - Base schema with common fields
ImageMetadata - SEM/TEM/STEM image data
SpectrumMetadata - EDS/EELS spectral data
SpectrumImageMetadata - Hyperspectral data (inherits from both ImageMetadata and SpectrumMetadata)
DiffractionMetadata - Diffraction pattern data
Key Features:
Pint Quantity fields for machine-actionable units
EM Glossary v2.0.0 terminology (see EM Glossary Field Reference)
Automatic unit normalization to preferred units
Flexible extensions section for instrument-specific metadata
Strict validation of core fields
Examples:
Validate SEM image metadata:
>>> from nexusLIMS.schemas.metadata import ImageMetadata
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> meta = ImageMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="SEM_Imaging",
...     dataset_type="Image",
...     acceleration_voltage=ureg.Quantity(10, "kilovolt"),
...     working_distance=ureg.Quantity(10, "millimeter"),
...     beam_current=ureg.Quantity(100, "picoampere"),
... )
>>> print(meta.acceleration_voltage)
10.0 kilovolt
Validate spectrum metadata:
>>> from nexusLIMS.schemas.metadata import SpectrumMetadata
>>>
>>> spec_meta = SpectrumMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="EDS_Spectrum",
...     dataset_type="Spectrum",
...     acquisition_time=ureg.Quantity(30, "second"),
...     live_time=ureg.Quantity(28.5, "second"),
... )
Use extensions for instrument-specific fields:
>>> meta_with_ext = ImageMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="SEM_Imaging",
...     dataset_type="Image",
...     extensions={
...         "facility": "Nexus Microscopy Center",
...         "detector_brightness": 50.0,
...         "scan_speed": 6,
...     },
... )
For detailed documentation on the metadata schema system, see:
Internal Metadata Schema System - Schema architecture, migration guide, and usage examples
EM Glossary Field Reference - Complete EM Glossary field reference and mapping tables
Module Contents#
Classes#
ExtractionDetails | Metadata about the NexusLIMS extraction process.
StagePosition | Stage position with coordinates and tilt angles.
NexusMetadata | Base schema for all NexusLIMS metadata.
ImageMetadata | Schema for image dataset metadata (SEM, TEM, STEM, FIB, HIM).
SpectrumMetadata | Schema for spectrum dataset metadata (EDS, EELS, etc.).
SpectrumImageMetadata | Schema for spectrum image (hyperspectral) dataset metadata.
DiffractionMetadata | Schema for diffraction pattern dataset metadata (TEM, EBSD, etc.).
Functions#
emg_field | Create a Pydantic Field with EM Glossary metadata.
API#
- nexusLIMS.schemas.metadata.emg_field(field_name: str, default: Any = None, *, description: str | None = None, **kwargs: Any) Any[source]#
Create a Pydantic Field with EM Glossary metadata.
This helper automatically adds EM Glossary semantic annotations to field definitions, including EMG ID, URI, and label. It pulls metadata from the em_glossary module to maintain a single source of truth.
- Parameters:
field_name (str) – Internal field name (e.g., “acceleration_voltage”). Used to look up EMG metadata from the em_glossary module.
default (Any, optional) – Default value for the field. Use ... for required fields, None for optional fields.
description (str, optional) – Field description. If not provided, uses the description from the em_glossary module.
**kwargs (Any) – Additional keyword arguments passed to pydantic.fields.Field(), such as:
alias: Override display name (default from em_glossary)
gt, ge, lt, le: Numeric constraints
examples: Example values for documentation
json_schema_extra: Additional JSON schema metadata (merged with EMG data)
- Returns:
Configured Pydantic field with EMG metadata
- Return type:
Examples:
Create a field with automatic EMG metadata:
>>> from pydantic import BaseModel
>>> from nexusLIMS.schemas.metadata import emg_field
>>> from nexusLIMS.schemas.pint_types import PintQuantity
>>>
>>> class MySchema(BaseModel):
...     dwell_time: PintQuantity | None = emg_field("dwell_time")
The field automatically gets:
alias: “Dwell Time” (display name)
description: “Time period during which the beam remains at one position.”
json_schema_extra: {"emg_id": "EMG_00000015", "emg_uri": "...", ...}
Override description:
>>> dwell_time: PintQuantity | None = emg_field(
...     "dwell_time",
...     description="Custom description",
... )
Add additional JSON schema metadata:
>>> acquisition_time: PintQuantity | None = emg_field(
...     "acquisition_time",
...     json_schema_extra={"units": "second", "typical_range": "1e-3 to 1"},
... )
Notes:
Fields without EMG mappings still get display names and descriptions
EMG metadata is only added if the field has a valid EMG ID
The alias (display name) comes from em_glossary for consistency
All EMG metadata is stored in json_schema_extra for JSON schema export
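The lookup-by-name pattern described above can be illustrated without Pydantic. The following is a minimal standard-library sketch, not the NexusLIMS implementation: `GLOSSARY`, `glossary_field`, and the `detector_type` entry are hypothetical stand-ins, and only the `dwell_time` values are taken from the example above. It uses `dataclasses.field(metadata=...)` in place of `pydantic.fields.Field()`.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional

# Hypothetical stand-in for the em_glossary lookup table. Only the
# "dwell_time" values come from the documentation; "detector_type" is an
# invented entry showing a field that has no EMG mapping.
GLOSSARY = {
    "dwell_time": {
        "alias": "Dwell Time",
        "description": "Time period during which the beam remains at one position.",
        "emg_id": "EMG_00000015",
    },
    "detector_type": {
        "alias": "Detector Type",
        "description": "Type of detector.",
    },
}


def glossary_field(
    field_name: str,
    default: Any = None,
    *,
    description: Optional[str] = None,
    extra: Optional[dict] = None,
) -> Any:
    """Build a dataclass field whose metadata is looked up by field name."""
    entry = GLOSSARY.get(field_name, {})
    metadata = dict(extra or {})  # keep caller-supplied metadata
    metadata["alias"] = entry.get("alias", field_name)
    metadata["description"] = description or entry.get("description", "")
    if "emg_id" in entry:  # glossary ID is added only when a mapping exists
        metadata["emg_id"] = entry["emg_id"]
    return field(default=default, metadata=metadata)


@dataclass
class MySchema:
    dwell_time: Optional[float] = glossary_field("dwell_time")
    detector_type: Optional[str] = glossary_field(
        "detector_type", description="Custom description"
    )


# Collect the metadata attached to each field for inspection.
meta = {f.name: dict(f.metadata) for f in fields(MySchema)}
print(meta["dwell_time"]["emg_id"])
print(meta["detector_type"]["description"])
```

Keeping the glossary in one table means a display name or description changes in a single place, which is the single-source-of-truth idea the helper is described as following.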
- class nexusLIMS.schemas.metadata.ExtractionDetails(/, **data: Any)[source]#
Bases: pydantic.BaseModel
Metadata about the NexusLIMS extraction process.
Records when metadata was extracted, which extractor module was used, and the NexusLIMS version.
- date#
Type: str
ISO-8601 formatted timestamp with timezone indicating when the metadata extraction occurred.
- module#
Type: str
Fully qualified Python module name of the extractor that processed this file. Examples:
'nexusLIMS.extractors.plugins.digital_micrograph','nexusLIMS.extractors.plugins.quanta_tif'
- version#
Type: str
NexusLIMS version string used for extraction. Example:
'1.2.3'
- extractor_warnings#
Type: str | None
Warning or error messages from the extraction process.
- model_config#
Type: dict
Pydantic model configuration:
populate_by_name: True – Accept both Python field names and JSON aliases
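As a rough illustration of the three required values, the sketch below assembles them as a plain dictionary using only the standard library. The helper name `build_extraction_details` is hypothetical and not part of NexusLIMS; the module and version strings are the examples given above.

```python
from datetime import datetime, timezone


def build_extraction_details(module: str, version: str) -> dict:
    """Return a dict shaped like the ExtractionDetails fields (illustrative)."""
    return {
        # An aware datetime serializes with an explicit offset (+00:00 here).
        "date": datetime.now(timezone.utc).isoformat(),
        "module": module,
        "version": version,
        "extractor_warnings": None,
    }


details = build_extraction_details(
    "nexusLIMS.extractors.plugins.quanta_tif", "1.2.3"
)
print(details["date"])
```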
- class nexusLIMS.schemas.metadata.StagePosition(/, **data: Any)[source]#
Bases: pydantic.BaseModel
Stage position with coordinates and tilt angles.
Represents the physical position and orientation of the microscope stage. All fields use Pint Quantity objects with appropriate units and are optional to accommodate different stage configurations.
Examples:
>>> from nexusLIMS.schemas.metadata import StagePosition
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> pos = StagePosition(
...     x=ureg.Quantity(100, "um"),
...     y=ureg.Quantity(200, "um"),
...     z=ureg.Quantity(5, "mm"),
...     tilt_alpha=ureg.Quantity(10, "degree"),
... )
>>> print(pos.x)
100 micrometer
Notes:
Some microscopes may not have all degrees of freedom. Single-tilt stages will only have tilt_alpha, while dual-tilt stages (e.g., tomography holders) will have both tilt_alpha and tilt_beta.
- x#
Type: PintQuantity | None
Stage X coordinate. Preferred unit: micrometer (µm)
- y#
Type: PintQuantity | None
Stage Y coordinate. Preferred unit: micrometer (µm)
- z#
Type: PintQuantity | None
Stage Z coordinate (height). Preferred unit: millimeter (mm)
- rotation#
Type: PintQuantity | None
Stage rotation angle around Z axis. Preferred unit: degree (°)
- tilt_alpha#
Type: PintQuantity | None
Tilt angle along the stage’s primary tilt axis (alpha). Preferred unit: degree (°)
- tilt_beta#
Type: PintQuantity | None
Tilt angle along the stage’s secondary tilt axis (beta), if the stage is capable of dual-axis tilting. Preferred unit: degree (°)
- model_config#
Pydantic model configuration:
extra: "allow" – Allow additional vendor-specific stage positions
- class nexusLIMS.schemas.metadata.NexusMetadata(/, **data: Any)[source]#
Bases: pydantic.BaseModel
Base schema for all NexusLIMS metadata.
This is the foundation schema that all type-specific schemas inherit from. It defines the required fields common to all dataset types and provides the extension mechanism for instrument-specific metadata.
Notes:
The extensions section allows arbitrary metadata while maintaining strict validation on core fields. This hybrid approach ensures:
Core fields are consistent and validated
Instrument-specific metadata is preserved
No data loss during extraction
Extensions should use descriptive key names and avoid conflicts with core field names.
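One way an extractor might prepare its output for this hybrid layout is to route every key that is not a core field into the extensions dictionary before validation. The sketch below is an assumption about usage, not NexusLIMS code: `CORE_FIELDS` is a hand-written subset of the base-schema field names and `split_core_and_extensions` is a hypothetical helper.

```python
from typing import Any, Dict, Set

# Hand-written subset of core field names from the base schema (assumption:
# a real extractor would derive this from the schema class itself).
CORE_FIELDS: Set[str] = {
    "creation_time",
    "data_type",
    "dataset_type",
    "data_dimensions",
    "instrument_id",
}


def split_core_and_extensions(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Keep core keys at the top level and move everything else to extensions."""
    core = {k: v for k, v in raw.items() if k in CORE_FIELDS}
    extensions = {k: v for k, v in raw.items() if k not in CORE_FIELDS}
    if extensions:
        core["extensions"] = extensions
    return core


raw = {
    "creation_time": "2024-01-15T10:30:00-05:00",
    "data_type": "SEM_Imaging",
    "dataset_type": "Image",
    "facility": "Nexus Microscopy Center",  # not a core field
    "scan_speed": 6,  # not a core field
}
payload = split_core_and_extensions(raw)
print(payload["extensions"])
```

The resulting dictionary has the same shape as the keyword arguments in the extensions example for `ImageMetadata`, so no vendor-specific value is dropped and no unknown key reaches the strictly validated top level.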
- creation_time#
Type: str
ISO-8601 formatted timestamp with timezone indicating when the data was acquired. Must include timezone offset (+00:00, -05:00) or ‘Z’. Examples: “2024-01-15T10:30:00-05:00”, “2024-01-15T15:30:00Z”
- data_type#
Type: str
Human-readable description of the data type using underscore-separated components. Examples: “STEM_Imaging”, “TEM_EDS”, “SEM_Imaging”
- dataset_type#
Type: typing.Literal[Image, Spectrum, SpectrumImage, Diffraction, Misc, Unknown]
Schema-defined category matching the Nexus Experiment XML schema type attribute.
- data_dimensions#
Type: str | None
String representation of data shape as a tuple. Examples: “(1024, 1024)”, “(2048,)”, “(12, 1024, 1024)”
- instrument_id#
Type: str | None
NexusLIMS persistent identifier for the instrument. Examples: “FEI-Titan-TEM-635816”, “Quanta-FEG-650-SEM-555555”
- warnings#
Type: list[str | list[str]]
Field names flagged as unreliable. These are marked with warning="true" in the XML output.
- nexuslims_extraction#
Type: nexusLIMS.schemas.metadata.ExtractionDetails | None
NexusLIMS extraction metadata containing date, module, and version information about when and how the metadata was extracted.
- extensions#
Type: typing.Dict[str, typing.Any]
Flexible container for instrument-specific metadata that doesn’t fit the core schema. Use this for vendor-specific fields, facility metadata, or experimental parameters not covered by EM Glossary.
- model_config#
Pydantic model configuration:
populate_by_name: True – Accept both Python field names and JSON aliases
extra: "forbid" – Forbid extra fields (forces use of extensions dict for additional data)
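Two of the string-typed fields above, creation_time and data_dimensions, have formats that can be checked before a model is built. The following standard-library sketch shows one possible pre-check; `has_timezone` and `parse_dimensions` are hypothetical helpers and do not necessarily mirror how the schema validates these fields internally.

```python
import ast
from datetime import datetime
from typing import Tuple


def has_timezone(timestamp: str) -> bool:
    """Return True if the ISO-8601 string parses and carries a UTC offset."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so rewrite it as an explicit offset first.
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(timestamp)
    except ValueError:
        return False
    return parsed.tzinfo is not None


def parse_dimensions(text: str) -> Tuple[int, ...]:
    """Turn a data_dimensions string such as "(1024, 1024)" back into a tuple."""
    value = ast.literal_eval(text)  # evaluates literals only, never code
    if not isinstance(value, tuple) or not all(isinstance(n, int) for n in value):
        raise ValueError(f"not a tuple of integers: {text!r}")
    return value


print(has_timezone("2024-01-15T15:30:00Z"))
print(has_timezone("2024-01-15T10:30:00"))
print(parse_dimensions("(12, 1024, 1024)"))
```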
- class nexusLIMS.schemas.metadata.ImageMetadata(/, **data: Any)[source]#
Bases: nexusLIMS.schemas.metadata.NexusMetadata
Schema for image dataset metadata (SEM, TEM, STEM, FIB, HIM).
Extends NexusMetadata with fields specific to 2D image acquisition. Uses Pint Quantity objects for all physical measurements.
Examples:
>>> from nexusLIMS.schemas.metadata import ImageMetadata
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> meta = ImageMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="SEM_Imaging",
...     dataset_type="Image",
...     acceleration_voltage=ureg.Quantity(15, "kV"),
...     working_distance=ureg.Quantity(10.5, "mm"),
...     beam_current=ureg.Quantity(50, "pA"),
...     magnification=5000.0,
... )
- dataset_type#
Type: typing.Literal[Image]
- acceleration_voltage#
Type: PintQuantity | None
Accelerating voltage of the electron/ion beam. Preferred unit: kilovolt (kV). EM Glossary: EMG_00000004
- working_distance#
Type: PintQuantity | None
Distance between final lens and sample surface. Preferred unit: millimeter (mm). EM Glossary: EMG_00000050
- beam_current#
Type: PintQuantity | None
Electron beam current. Preferred unit: picoampere (pA). EM Glossary: EMG_00000006
- emission_current#
Type: PintQuantity | None
Emission current from electron source. Preferred unit: microampere (µA). EM Glossary: EMG_00000025
- dwell_time#
Type: PintQuantity | None
Time the beam dwells on each pixel during scanning. Preferred unit: microsecond (µs). EM Glossary: EMG_00000015
- magnification#
Type: float | None
Nominal magnification (dimensionless).
- horizontal_field_width#
Type: PintQuantity | None
Width of the scanned area. Preferred unit: micrometer (µm)
- vertical_field_width#
Type: PintQuantity | None
Height of the scanned area. Preferred unit: micrometer (µm)
- pixel_width#
Type: PintQuantity | None
Physical width of a single pixel. Preferred unit: nanometer (nm)
- pixel_height#
Type: PintQuantity | None
Physical height of a single pixel. Preferred unit: nanometer (nm)
- scan_rotation#
Type: PintQuantity | None
Rotation angle of the scan frame. Preferred unit: degree (°)
- detector_type#
Type: str | None
Type or name of detector used. Examples: “ETD”, “InLens”, “HAADF”, “BF”
- acquisition_device#
Type: str | None
Name of the acquisition device or camera. Examples: “BM-UltraScan”, “K2 Summit”
- stage_position#
Type: nexusLIMS.schemas.metadata.StagePosition | None
Stage coordinates and tilt angles. See StagePosition for details. Preferred units: x/y in µm, z in mm, angles in degrees
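The field-width and pixel-size fields are listed with different preferred units (µm and nm), so a consistency check between them needs a unit conversion. The sketch below works on plain floats and assumes evenly sized pixels that span the full field width; `pixel_size_nm` is a hypothetical helper, and nothing in this reference says the schema computes one field from the other.

```python
def pixel_size_nm(field_width_um: float, n_pixels: int) -> float:
    """Pixel size in nanometers for a field width given in micrometers."""
    if n_pixels <= 0:
        raise ValueError("n_pixels must be positive")
    return field_width_um * 1000.0 / n_pixels  # 1 µm = 1000 nm


# A 10 µm horizontal field width sampled with 1024 pixels per line.
print(pixel_size_nm(10.0, 1024))  # 9.765625
```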
- class nexusLIMS.schemas.metadata.SpectrumMetadata(/, **data: Any)[source]#
Bases: nexusLIMS.schemas.metadata.NexusMetadata
Schema for spectrum dataset metadata (EDS, EELS, etc.).
Extends NexusMetadata with fields specific to spectral data acquisition.
Examples:
>>> from nexusLIMS.schemas.metadata import SpectrumMetadata
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> meta = SpectrumMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="EDS_Spectrum",
...     dataset_type="Spectrum",
...     acquisition_time=ureg.Quantity(30, "s"),
...     live_time=ureg.Quantity(28.5, "s"),
...     channel_size=ureg.Quantity(10, "eV"),
... )
- dataset_type#
Type: typing.Literal[Spectrum]
- acquisition_time#
Type: PintQuantity | None
Total time for spectrum acquisition. Preferred unit: second (s). EM Glossary: EMG_00000055
- live_time#
Type: PintQuantity | None
Live time excluding dead time. Preferred unit: second (s)
- detector_energy_resolution#
Type: PintQuantity | None
Energy resolution of the detector. Preferred unit: electronvolt (eV)
- channel_size#
Type: PintQuantity | None
Energy width of each channel. Preferred unit: electronvolt (eV)
- starting_energy#
Type: PintQuantity | None
Starting energy of the spectrum. Preferred unit: kiloelectronvolt (keV)
- azimuthal_angle#
Type: PintQuantity | None
Azimuthal angle of the detector. Preferred unit: degree (°)
- elevation_angle#
Type: PintQuantity | None
Elevation angle of the detector. Preferred unit: degree (°)
- takeoff_angle#
Type: PintQuantity | None
X-ray takeoff angle. Preferred unit: degree (°)
- elements#
Type: list[str] | None
Detected elements (e.g., [“Fe”, “Cr”, “Ni”])
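Since live_time is described as excluding dead time, the two timing fields together give the detector dead-time fraction. The sketch below uses plain floats in seconds and assumes acquisition_time is the total elapsed time; `dead_time_fraction` is a hypothetical helper, not part of the schema.

```python
def dead_time_fraction(acquisition_time_s: float, live_time_s: float) -> float:
    """Fraction of the acquisition during which the detector was not counting."""
    if acquisition_time_s <= 0:
        raise ValueError("acquisition_time_s must be positive")
    if not 0 <= live_time_s <= acquisition_time_s:
        raise ValueError("live_time_s must lie between 0 and acquisition_time_s")
    return 1.0 - live_time_s / acquisition_time_s


# Values from the SpectrumMetadata example: 30 s total, 28.5 s live.
fraction = dead_time_fraction(30.0, 28.5)
print(f"{fraction:.1%}")  # 5.0%
```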
- class nexusLIMS.schemas.metadata.SpectrumImageMetadata(/, **data: Any)[source]#
Bases: nexusLIMS.schemas.metadata.ImageMetadata, nexusLIMS.schemas.metadata.SpectrumMetadata
Schema for spectrum image (hyperspectral) dataset metadata.
Combines fields from both ImageMetadata and SpectrumMetadata since spectrum images have both spatial and spectral dimensions. Inherits all fields from both parent classes.
Examples:
>>> from nexusLIMS.schemas.metadata import SpectrumImageMetadata
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> meta = SpectrumImageMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="STEM_EDS_SpectrumImage",
...     dataset_type="SpectrumImage",
...     acceleration_voltage=ureg.Quantity(200, "kV"),  # Image field
...     acquisition_time=ureg.Quantity(1200, "s"),  # Spectrum field
...     pixel_time=ureg.Quantity(0.5, "s"),  # SpectrumImage specific
... )
- dataset_type#
Type: typing.Literal[SpectrumImage]
- pixel_time#
Type: PintQuantity | None
Time spent acquiring spectrum at each pixel. Preferred unit: second (s)
- scan_mode#
Type: str | None
Scanning mode used for acquisition. Examples: “raster”, “serpentine”, “fly-back”
- validate_spectrum_image_fields() SpectrumImageMetadata[source]#
Ensure SpectrumImage has both image and spectrum metadata.
- class nexusLIMS.schemas.metadata.DiffractionMetadata(/, **data: Any)[source]#
Bases: nexusLIMS.schemas.metadata.NexusMetadata
Schema for diffraction pattern dataset metadata (TEM, EBSD, etc.).
Extends NexusMetadata with fields specific to diffraction data.
Examples:
>>> from nexusLIMS.schemas.metadata import DiffractionMetadata
>>> from nexusLIMS.schemas.units import ureg
>>>
>>> meta = DiffractionMetadata(
...     creation_time="2024-01-15T10:30:00-05:00",
...     data_type="TEM_Diffraction",
...     dataset_type="Diffraction",
...     camera_length=ureg.Quantity(200, "mm"),
...     convergence_angle=ureg.Quantity(0.5, "mrad"),
...     acceleration_voltage=ureg.Quantity(200, "kV"),
... )
- dataset_type#
Type: typing.Literal[Diffraction]
- camera_length#
Type: PintQuantity | None
Camera length for diffraction pattern. Preferred unit: millimeter (mm). EM Glossary: EMG_00000008
- convergence_angle#
Type: PintQuantity | None
Convergence angle of the electron beam. Preferred unit: milliradian (mrad). EM Glossary: EMG_00000010
- acceleration_voltage#
Type: PintQuantity | None
Accelerating voltage (also relevant for diffraction). Preferred unit: kilovolt (kV). EM Glossary: EMG_00000004
- acquisition_device#
Type: str | None
Name of the detector/camera used.