NexusLIMS Database#

Overview#

NexusLIMS uses a SQLite database to track experimental sessions and instrument configuration. The database file location is specified by NX_DB_PATH.

Key features:

Single-file database for easy backup (just copy the file)
Created from SQLModel ORM Definitions
Inspectable with database tools such as DB Browser for SQLite

SQLModel ORM Migration#

Added in version 2.2.0: NexusLIMS now uses SQLModel for database operations instead of raw SQL queries. This provides type safety, automatic datetime handling, and cleaner code.

Notable Changes#

The db_query() function for manual database queries has been completely removed from nexusLIMS.db.session_handler
EventType and RecordStatus are now enums (EventType, RecordStatus) instead of string literals
SessionLog and Instrument are now SQLModel classes with real object-relational mapping
The Instrument dataclass has been replaced with a SQLModel class - use instrument_pid field instead of name

Migration Examples#

Creating session log entries:

# New approach: has not changed significantly other than using enums
from nexusLIMS.db.models import SessionLog
from nexusLIMS.db.enums import EventType, RecordStatus
from datetime import datetime as dt

log = SessionLog(
    session_identifier="abc",
    instrument="FEI-Titan-TEM",
    timestamp=dt.fromisoformat("2025-01-15T10:00:00"),
    event_type=EventType.START,
    record_status=RecordStatus.TO_BE_BUILT,
    user="alice"
)
log.insert_log()

Querying the database:

# Old approach (deprecated)
_, results = db_query("SELECT * FROM session_log WHERE record_status = ?", ("TO_BE_BUILT",))

# New approach
from sqlmodel import Session, select
from nexusLIMS.db.engine import get_engine
from nexusLIMS.db.models import SessionLog
from nexusLIMS.db.enums import RecordStatus

with Session(get_engine()) as session:
    results = session.exec(
        select(SessionLog).where(SessionLog.record_status == RecordStatus.TO_BE_BUILT)
    ).all()

Benefits of SQLModel:

Type-safe database operations with IDE autocomplete and type checking
Automatic datetime serialization/deserialization
Relationship navigation (e.g., session_log.instrument_obj)
~50% less boilerplate code
Foundation for Alembic migrations (database schema version control)

Database Migrations with Alembic#

Added in version 2.2.0: NexusLIMS now uses Alembic for database schema version control and migrations.

Alembic provides a way to track and manage changes to the database schema over time, making it safe to upgrade existing installations and ensuring schema consistency across deployments.

For Existing Installations#

If you have an existing NexusLIMS database (created before version 2.2.0), you need to mark it as migrated to the baseline schema:

# Mark existing database as consistent with the baseline pre-2.0 schema
nexuslims db alembic stamp v1_4_3

This tells Alembic that your database already has the baseline schema structure and doesn’t need to be created from scratch.

Added in version 2.5.0: The nexuslims db command provides simple, user-friendly commands for common database operations. It automatically locates the migrations directory inside the installed package, making migrations work correctly after pip/uv installation. Advanced users can access the full Alembic CLI via nexuslims db alembic [COMMAND]. The uv run alembic command still works for development from source checkouts.

Common Migration Commands#

# Check current migration status
nexuslims db current

# Check if database has pending migrations
nexuslims db check

# View migration history
nexuslims db history

# Upgrade to latest schema version
nexuslims db upgrade

# Upgrade to specific revision
nexuslims db upgrade v2_5_0a

# Downgrade one migration
nexuslims db downgrade

# Generate a new migration (development only, requires source checkout)
nexuslims db alembic revision --autogenerate -m "Description"

For advanced Alembic operations, use nexuslims db alembic [COMMAND] to access all Alembic sub-commands. From a source checkout, you can also use uv run alembic directly.

Creating New Migrations#

When you modify the database schema (by changing SessionLog or Instrument, etc.), you should create a migration:

Modify the SQLModel classes in nexusLIMS/db/models.py or add new enums in nexusLIMS/db/enums.py
Determine the revision ID following the naming convention:
- Format: v[MAJOR]_[MINOR]_[PATCH][letter]_d[escription] (e.g., v2_5_0a_add_external_user_identifiers)
- Version number: Match the target release version where this schema change will land
- Letter suffix: Use sequential letters (a, b, c, …) for multiple migrations within the same version
- Description: Brief, snake_case description of the change

Generate migration script (requires source checkout):

nexuslims db alembic revision --autogenerate \
    -m "add external user identifiers" \
    --rev-id "v2_5_0a"

The migration will be created at:

nexusLIMS/db/migrations/versions/v2_5_0a_add_external_user_identifiers.py

Note: The filename combines the revision ID with the sanitized message for readability

Review and edit the generated script in nexusLIMS/db/migrations/versions/:
- Verify the upgrade() function creates the expected schema changes
- Verify the downgrade() function properly reverses those changes
- For CHECK constraints or enums, you should use hardcoded values in the migration (preserves historical accuracy, migration won’t break if enum changes later)
- Add docstring explaining the migration’s purpose if needed

Test the migration thoroughly:

# Apply migration
nexuslims db upgrade

# Test downgrade
nexuslims db downgrade

# Re-apply
nexuslims db upgrade head

Add integration tests in tests/integration/test_migrations.py:
- Test that upgrade creates expected tables/columns
- Test that downgrade removes them
- Test offline mode if migration has conditional logic

Create changelog fragment in docs/changes/:

# Format: {issue_number}.{change_type}.md
# Change types: feature, bugfix, doc, removal, misc
docs/changes/48.feature.md

Commit the migration script and tests to version control

Note

Revision ID Format

NexusLIMS uses version-based revision IDs with letter suffixes:

Format: vMAJOR_MINOR_PATCHletter (e.g., v2_5_0a, v2_5_0b, v2_4_0a)
Benefits: Clear association with release versions, sequential ordering within a version
Multiple migrations per version: Use sequential letters (a, b, c, …)
Manual specification: Use --rev-id flag when generating migrations

Migration Configuration#

Alembic configuration is stored in:

pyproject.toml under [tool.alembic] - Source code configuration (migration paths, etc.)
nexusLIMS/migrations/env.py - Migration environment setup (automatically reads NX_DB_PATH)
nexusLIMS/migrations/versions/ - Migration scripts directory (shipped in the package)

The database URL is automatically set from the NX_DB_PATH environment variable in env.py, so you don’t need to configure it separately. All Alembic configuration lives in pyproject.toml, eliminating the need for a separate alembic.ini file.

The nexuslims db CLI command automatically locates the migrations directory using importlib.resources, so migrations work correctly whether NexusLIMS is installed via pip, uv, or run from source.

Important Notes#

Always backup your database before running migrations on production data
Test migrations thoroughly in a development environment first
Never edit applied migrations - create a new migration to fix issues
The initial migration (v1_4_3_initial_schema_baseline.py) creates the basic database structure that serves as a basis for later migrations

Database Structure#

NexusLIMS Database Schema Diagram — NexusLIMS Database Schema (auto-generated from SQLModel metadata)#

Note

This diagram is automatically regenerated when you build the documentation, ensuring it always reflects the current database schema. Field descriptions are extracted from the SQLModel class docstrings.

For an interactive Mermaid diagram with detailed field descriptions and relationship documentation, see the Database Schema Diagram (also auto-generated).

The database contains four primary tables:

session_log - Tracks experimental sessions and record building status
- Records when users start/end experiments
- Tracks record building attempts and completion
- Populated by harvesters (e.g., nemo)
instruments - Stores authoritative instrument configuration
- Instrument names and PIDs
- Reservation system URLs
- Data storage paths
- Harvester configuration
upload_log - Tracks record uploads to CDCS and other export destinations
- Session identifier and instrument linkage
- CDCS record ID and PID
- Upload timestamps and success status
- Error tracking for failed uploads
external_user_identifiers - Maps NexusLIMS usernames to external system IDs
- Star-topology design with nexuslims_username as canonical identifier
- Supports NEMO, LabArchives ELN, LabArchives Scheduler, CDCS
- Bidirectional lookup between internal and external identities
- Tracks creation and verification timestamps

The `session_log` Table#

Purpose#

The session_log table tracks experimental sessions from start to finish. Harvesters (like nemo) populate this table by parsing reservation system APIs and creating timestamped event logs.

How It Works#

Session Events - Each row represents a timestamped event:
- START - User begins experiment
- END - User completes experiment
- RECORD_GENERATION - Record building attempted
Session Linking - Events are linked by session_identifier to represent a complete experimental session
Record Building Workflow:
- Harvester creates START and END events with status TO_BE_BUILT
- Back-end polls for sessions with TO_BE_BUILT status
- Record builder finds files created between start/end timestamps
- Status updated to COMPLETED or ERROR to prevent duplicates
- See record building for details

Table Schema#

The following columns define the session_log table structure:

Column	Data type	Description
`id_session_log`	INTEGER	The auto-incrementing primary key identifier for this table (just a generic number). Checks: must not be `NULL`
`session_identifier`	VARCHAR(36)	A unique string (could be a UUID) that is consistent among a single record’s `"START"`, `"END"`, and `"RECORD_GENERATION"` events. Checks: must not be `NULL`
`instrument`	VARCHAR(100)	The instrument PID associated with this session (this value is a foreign key reference to the `instruments` table). Checks: value must be one of those from the `instrument_pid` column of the `instruments` table.
`timestamp`	DATETIME	The date and time of the logged event in ISO timestamp format. Default: `strftime('%Y-%m-%dT%H:%M:%f', 'now', 'localtime')` Checks: must not be `NULL`
`event_type`	TEXT	The type of log for this session. Checks: must be one of `"START"`, `"END"`, or `"RECORD_GENERATION"`.
`record_status`	TEXT	The status of the record associated with this session. This value will be updated after a record is built for a given session. Default: `"WAITING_FOR_END"` Checks: must be one of: • `"WAITING_FOR_END"` - session has a start event, but no end event • `"TO_BE_BUILT"` - session has ended, but record not yet built • `"COMPLETED"` - record has been built successfully • `"ERROR"` - some error happened during record generation • `"NO_FILES_FOUND"` - record generation occurred, but no files matched time span • `"NO_CONSENT"` - user did not consent to data harvesting • `"NO_RESERVATION"` - no matching reservation found for this session
`user`	VARCHAR(50)	A username associated with this session (if known) – this value is not currently used by the back-end since it is not reliable across different instruments.

The `instruments` Table#

This table serves as the authoritative data source for the NexusLIMS back-end regarding information about the instruments in the Nexus Facility. By locating this information in an external database, changes to instrument configuration (or addition of a new instrument) requires making adjustments to just one location, simplifying maintenance of the system.

Back-end implementation details

When the nexusLIMS module is imported, one of the “setup” tasks performed is to perform a basic object-relational mapping between rows of the instruments table from the database into Instrument objects. These objects are stored in a dictionary attribute named nexusLIMS.instruments.instrument_db. This is done by querying the database specified in the environment variable NX_DB_PATH and creating a dictionary of Instrument objects that contain information about all of the instruments specified in the database. These objects are used widely throughout the code so that the database is only queried once at initial import, rather than every time information is needed.

Column	Data type	Description
`instrument_pid`	VARCHAR(100)	The unique identifier for an instrument in the facility, typically built from the make, model, and type of instrument, plus a unique numeric code (e.g. `Vendor-Model-Type-12345` )
`api_url`	TEXT	The calendar API endpoint url for this instrument’s scheduler. For NEMO, should be of the format `https://<nemo_address>/api/tools/?id=<tool_id>`
`calendar_url`	TEXT	The URL to this instrument’s web-accessible calendar
`location`	VARCHAR(100)	The physical location of this instrument
`display_name`	TEXT	The human-readable instrument name displayed in NexusLIMS records
`property_tag`	VARCHAR(20)	A unique numeric identifier for this instrument (not used by NexusLIMS, but for reference and potential future use)
`filestore_path`	TEXT	The path (relative to central storage location specified in NX_INSTRUMENT_DATA_PATH) where this instrument stores its data (e.g. `./instrument_name`)
`harvester`	TEXT	The specific submodule within `nexusLIMS.harvesters` that should be used to harvest reservation information for this instrument. Possible values: `nemo`.
`timezone`	TEXT	The timezone in which this instrument is located, in the format of the IANA timezone database (e.g. `America/New_York`). This is used to properly localize dates and times when communicating with the harvester APIs.

The `upload_log` Table#

Added in version 2.4.0: The upload_log table was added to track record exports to external repositories.

Purpose#

The upload_log table tracks export attempts to destination repositories like CDCS and LabArchives. It enables multi-destination export with granular success/failure tracking per destination and session.

How It Works#

Per-Destination Tracking - Each export attempt creates a new row, allowing one session to be exported to multiple destinations
Success/Failure Recording - Stores whether export succeeded and captures error messages if it failed
Record Linkage - Stores destination-specific record IDs and URLs for successful exports
Retry Logic - Failed exports can be identified and retried based on success status

Table Schema#

Column	Data type	Description
`id`	INTEGER	The auto-incrementing primary key identifier for this table. Checks: must not be `NULL`
`session_identifier`	VARCHAR(36)	Foreign key reference to `session_log.session_identifier`. Checks: must not be `NULL` Indexed: for efficient session lookup
`destination_name`	VARCHAR(100)	Name of the export destination (e.g., `"cdcs"`, `"labarchives"`). Checks: must not be `NULL` Indexed: for destination-specific queries
`success`	BOOLEAN	Whether the export succeeded (`TRUE`) or failed (`FALSE`). Checks: must not be `NULL`
`timestamp`	DATETIME	When the export attempt occurred (timezone-aware). Checks: must not be `NULL`
`record_id`	VARCHAR(255)	Destination-specific record identifier (e.g., CDCS document ID). Optional: `NULL` if export failed
`record_url`	VARCHAR(500)	Direct URL to view the exported record in the destination system. Optional: `NULL` if export failed
`error_message`	TEXT	Error message if export failed. Optional: `NULL` if export succeeded
`metadata_json`	TEXT	JSON-serialized dict with destination-specific metadata. Optional: for storing additional export context

The `external_user_identifiers` Table#

Added in version 2.5.0: The external_user_identifiers table was added to map NexusLIMS usernames to external system IDs.

Purpose#

The external_user_identifiers table maintains mappings between NexusLIMS usernames and user identifiers in external systems (NEMO, LabArchives ELN, LabArchives Scheduler, CDCS). This enables bidirectional user identity lookup and supports integration with external APIs.

How It Works#

Star Topology Design - Uses nexuslims_username (from session_log.user) as the canonical identifier, with mappings to each external system
Bidirectional Lookup - Supports both forward lookup (NexusLIMS → external ID) and reverse lookup (external ID → NexusLIMS)
Uniqueness Constraints:
- One external ID per system per NexusLIMS user
- One NexusLIMS user per external ID per system
Verification Tracking - Records when mappings are created and last verified for auditing

Helper Functions#

The nexusLIMS.db.models module provides convenience functions:

get_external_id() - Get external ID for a NexusLIMS username
get_nexuslims_username() - Reverse lookup from external ID
store_external_id() - Store or update a mapping (upsert)
get_all_external_ids() - Get all external IDs for a username

Table Schema#

Column	Data type	Description
`id`	INTEGER	The auto-incrementing primary key identifier for this table. Checks: must not be `NULL`
`nexuslims_username`	VARCHAR	Canonical username in NexusLIMS (from `session_log.user`). Checks: must not be `NULL` Indexed: for efficient lookup Unique Constraint: with `external_system`
`external_system`	TEXT	External system identifier. Checks: must be one of `"nemo"`, `"labarchives_eln"`, `"labarchives_scheduler"`, `"cdcs"` Unique Constraints: with `nexuslims_username` and with `external_id`
`external_id`	TEXT	User ID/username in the external system. Checks: must not be `NULL` Unique Constraint: with `external_system`
`email`	VARCHAR	User’s email for verification/matching purposes. Optional: may be `NULL`
`created_at`	DATETIME	When this mapping was created (timezone-aware, UTC). Checks: must not be `NULL` Default: current UTC timestamp
`last_verified_at`	DATETIME	Last time this mapping was verified (timezone-aware, UTC). Optional: `NULL` until first verification
`notes`	TEXT	Additional notes about this mapping (e.g., `"OAuth registration"`, `"From NEMO harvester"`). Optional: may be `NULL`

NexusLIMS Database#

Overview#

SQLModel ORM Migration#

Notable Changes#

Migration Examples#

Database Migrations with Alembic#

For Existing Installations#

Common Migration Commands#

Creating New Migrations#

Migration Configuration#

Important Notes#

Database Structure#

The session_log Table#

Purpose#

How It Works#

Table Schema#

The instruments Table#

The upload_log Table#

Purpose#

How It Works#

Table Schema#

The external_user_identifiers Table#

Purpose#

How It Works#

Helper Functions#

Table Schema#

This Page

The `session_log` Table#

The `instruments` Table#

The `upload_log` Table#

The `external_user_identifiers` Table#