Integration Testing Guide#
This guide provides comprehensive documentation for NexusLIMS integration tests, which validate end-to-end workflows using real Docker services instead of mocks.
Overview#
NexusLIMS integration tests verify that the complete system works together correctly, from NEMO reservation harvesting through record building to CDCS upload. These tests use Docker Compose to orchestrate a complete service stack that mirrors the production environment.
Why Integration Testing?#
Integration tests provide:
End-to-End Validation: Verify workflows work across multiple components
Real Service Integration: Test actual NEMO API, CDCS REST API, and file operations
Regression Detection: Catch breaking changes in component interactions
Production Confidence: High assurance that deployments will work
When to Use Integration Tests vs Unit Tests#
| Aspect | Unit Tests | Integration Tests |
|---|---|---|
| Speed | Very fast (seconds) | Slower (potentially minutes) |
| Isolation | Mocked dependencies | Real services |
| Coverage | Internal logic | External interactions |
| Frequency | Run on every commit | Run nightly or before merge |
| Environment | Local without Docker | Requires Docker |
Rule of Thumb: Unit tests for logic, integration tests for interactions.
Architecture#
Service Stack#
The integration test environment includes:
graph TB
subgraph "Test Runner"
A["pytest<br/>Integration Tests"]
end
subgraph "Reverse Proxy"
B["Caddy<br/>port 80<br/>nemo.localhost<br/>cdcs.localhost<br/>mailpit.localhost<br/>fileserver.localhost"]
end
subgraph "NEMO"
C["NEMO Service<br/>port 8000<br/>Django + SQLite"]
end
subgraph "CDCS"
D["CDCS<br/>port 8080<br/>Django + uWSGI"]
E["PostgreSQL<br/>Django DB"]
F["MongoDB<br/>Record Storage"]
G["Redis<br/>Celery Queue"]
end
subgraph "File Serving"
H["Fileserver<br/>port 8081<br/>pytest HTTP server fixture"]
end
subgraph "Email Capture"
I["MailPit SMTP<br/>port 1025<br/>Web UI: 8025"]
end
A --> B
B --> C
B --> D
B --> H
B --> I
D --> E
D --> F
D --> G
style A fill:#e3f2fd
style B fill:#fff9c4
style C fill:#f3e5f5
style D fill:#f3e5f5
style E fill:#ede7f6
style F fill:#ede7f6
style G fill:#ede7f6
style H fill:#e8f5e9
style I fill:#fce4ec
Data Flow#
graph TD
A["NEMO Reservation"] --> B["NEMO Harvester"]
B --> C["Session Log"]
C --> D["Record Builder"]
D --> E["File Discovery"]
F["Microscopy Files<br/>via Fileserver"] --> E
E --> G["Metadata Extraction"]
G --> H["XML Generation"]
H --> I["CDCS Upload"]
I --> J["Queryable Records"]
style A fill:#e1f5ff
style D fill:#fff3e0
style I fill:#f3e5f5
style J fill:#e8f5e9
Setup and Configuration#
Prerequisites#
Docker and Docker Compose 2.0+
uv package manager or Python 3.11+
At least 4GB available RAM for Docker
Ports 8000, 8025, 8080, 8081, 1025 available
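To confirm those ports are actually free before starting the stack, a quick check like this sketch works on any machine with Python (the port list is taken from the prerequisites above):
import socket

# Ports required by the integration stack
for port in (8000, 8025, 8080, 8081, 1025):
    with socket.socket() as s:
        if s.connect_ex(("127.0.0.1", port)) == 0:
            print(f"port {port} is already in use")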
First Time Setup#
Clone the repository:
git clone https://github.com/datasophos/NexusLIMS.git
cd NexusLIMS
Install dependencies:
uv sync
Start Docker services:
cd tests/integration/docker
docker compose up -d   # Wait for services to be healthy
docker compose ps      # Check STATUS column
Verify service connectivity:
# NEMO (via Caddy reverse proxy)
curl http://nemo.localhost/        # Should return HTML

# CDCS (via Caddy reverse proxy)
curl http://cdcs.localhost/        # Should return HTML

# Mailpit (via Caddy reverse proxy)
curl http://mailpit.localhost/     # Should return directory listing

# Fileserver (via Caddy reverse proxy)
# The fileserver only runs while the tests are actually running,
# so the URL below will not be available unless tests are running
curl http://fileserver.localhost/
Environment Configuration#
Fixtures automatically patch configuration variables through nexusLIMS.config, so no environment configuration is necessary.
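For reference, the patching the fixtures perform amounts to something like this (a hedged sketch; the setting name nemo_address is illustrative, not a confirmed nexusLIMS.config attribute):
import pytest

import nexusLIMS.config

@pytest.fixture
def patched_config(monkeypatch):
    # "nemo_address" is a hypothetical setting name for illustration
    monkeypatch.setattr(
        nexusLIMS.config, "nemo_address", "http://nemo.localhost/api/", raising=False
    )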
Running Integration Tests#
Docker Service Management#
Integration tests automatically manage Docker services through pytest fixtures. The docker_services fixture handles the complete lifecycle:
Startup: Services start automatically when first integration test runs
Health Checks: Waits for services to be healthy (NEMO, CDCS, MailPit, etc.)
Teardown: Services automatically stop and cleanup after all tests complete
By default, Docker services are:
Started once per test session (session-scoped fixture)
Automatically cleaned up after tests finish
Stripped of their volumes to prevent state carryover
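In outline, the lifecycle looks roughly like the sketch below (a simplified sketch assuming docker compose v2 and the NX_TESTS_KEEP_DOCKER_RUNNING variable described in the next subsection; the real fixture in conftest.py also performs per-service health checks and data seeding):
import os
import subprocess

import pytest

COMPOSE = ["docker", "compose", "-f", "tests/integration/docker/docker-compose.yml"]

@pytest.fixture(scope="session")
def docker_services():
    # Start the stack and wait for container health checks to pass
    subprocess.run([*COMPOSE, "up", "-d", "--wait"], check=True)
    yield
    # Tear down, removing volumes, unless asked to keep services running
    if not os.environ.get("NX_TESTS_KEEP_DOCKER_RUNNING"):
        subprocess.run([*COMPOSE, "down", "-v"], check=True)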
Keeping Docker Services Running (Development)#
For development and debugging, you can keep Docker services running between test runs by setting an environment variable. This speeds up the setup phase of the tests so you don’t have to wait for the Docker stack to start and stop for every test run:
# Set this before running tests to keep services up after test completion
export NX_TESTS_KEEP_DOCKER_RUNNING=1
# Run tests - services will stay running after completion
uv run pytest tests/integration/ -v
# Services now available for manual testing/inspection
docker compose -f tests/integration/docker/docker-compose.yml ps
# Manually stop when done
docker compose -f tests/integration/docker/docker-compose.yml down -v
Benefits of keeping services running:
Faster iteration during development (no startup overhead)
Inspect service logs and state between runs
Manually test APIs with curl or Postman
Reproduce issues without full test overhead
Configure via .env.test (Optional)#
You can optionally configure integration test behavior via a
.env.test file in the repository root. See .env.test.example
for available configuration options. Currently the only option
is the NX_TESTS_KEEP_DOCKER_RUNNING setting.
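For example, a .env.test that keeps the Docker stack alive between runs contains just the single line:
NX_TESTS_KEEP_DOCKER_RUNNING=1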
Quick Start#
# From repository root
uv run pytest tests/integration/ -v
Common Commands#
# Run all integration tests with coverage
uv run pytest tests/integration/ -v --cov=nexusLIMS
# Run specific test file
uv run pytest tests/integration/test_nemo_integration.py -v
# Run with print statements visible
uv run pytest tests/integration/ -v -s
Running Without Docker#
If you only want to run unit tests (which don’t require Docker):
# Unit tests only (default)
uv run pytest tests/unit/ -v
Test Organization#
Test Files#
| File | Purpose | Test Count |
|---|---|---|
| test_nemo_integration.py | NEMO API and harvester | 35+ |
|  | CDCS upload and retrieval | 20+ |
|  | Complete workflows | 3+ |
|  | Error handling | 6+ |
|  | CLI script testing | 8+ |
|  | Multi-NEMO support | 16+ |
|  | Fixture validation | 20+ |
|  | File serving | 2+ |
Key Integration Test Patterns#
1. NEMO Integration Tests#
The nemo_client fixture provides connection information for the NEMO Docker instance:
nemo_client["url"]: NEMO API base URL (e.g.,http://nemo.localhost/api/)nemo_client["token"]: Authentication token for API requestsnemo_client["timezone"]: Timezone string for datetime handling (e.g.,"America/New_York")
@pytest.mark.integration
def test_nemo_connector_fetches_users(nemo_client):
    """Test fetching users from NEMO API."""
    from nexusLIMS.harvesters.nemo.connector import NemoConnector

    connector = NemoConnector(
        base_url=nemo_client["url"],
        token=nemo_client["token"],
        timezone=nemo_client["timezone"],
    )
    users = connector.get_all_users()
    assert len(users) > 0
    assert any(u["username"] == "captain" for u in users)
Testing NEMO Usage Event Questions#
NEMO usage events can contain experiment metadata in two JSON-encoded fields:
run_data: Questions answered at the end of instrument usage (highest priority)
pre_run_data: Questions answered at the start of instrument usage (medium priority)
The harvester implements a three-tier fallback strategy (run_data → pre_run_data → reservation matching) to obtain the most accurate metadata. Integration tests verify this behavior using test usage events (IDs 100-106) seeded in the NEMO Docker instance.
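In outline, the fallback amounts to the following (an illustrative sketch; apart from has_valid_question_data, which the test suite exercises, the function names here are hypothetical):
import json

def has_valid_question_data(data):
    # Simplified stand-in for the real helper: the tests imply it rejects
    # missing structures, empty user_input strings, and withheld consent
    return bool(data)

def _parse_questions(raw):
    """Decode a JSON-encoded question field, tolerating malformed input."""
    try:
        return json.loads(raw) if raw else None
    except json.JSONDecodeError:
        return None

def choose_metadata_source(usage_event, reservation):
    # Priority 1: questions answered at the end of the usage event
    run_data = _parse_questions(usage_event.get("run_data"))
    if has_valid_question_data(run_data):
        return run_data
    # Priority 2: questions answered at the start of the usage event
    pre_run_data = _parse_questions(usage_event.get("pre_run_data"))
    if has_valid_question_data(pre_run_data):
        return pre_run_data
    # Priority 3: fall back to matching the session against a reservation
    return reservation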
Test usage events in seed_data.json:
| Event ID | run_data | pre_run_data | Test Purpose |
|---|---|---|---|
| 100 | Valid questions | Empty | Tests Priority 1: run_data |
| 101 | Empty | Valid questions | Tests Priority 2: pre_run_data |
| 102 | Valid questions | Valid questions | Tests run_data priority over pre_run_data |
| 103 | Empty | “Disagree” consent | Tests consent validation and fallback |
| 104 | Missing user_input fields | Empty | Tests graceful handling of incomplete data |
| 105 | Empty strings | Empty strings | Tests fallback to reservation matching |
| 106 | Malformed JSON | Malformed JSON | Tests JSON parsing error handling |
Example test:
@pytest.mark.integration
def test_usage_event_with_run_data(test_instrument, nemo_connector):
    """Verify run_data is used when populated."""
    from datetime import datetime, timezone

    from nexusLIMS.db.session_handler import Session
    from nexusLIMS.harvesters.nemo import res_event_from_session

    # Create a session for usage event 100 (has run_data)
    session = Session(
        instrument=test_instrument,
        session_identifier="http://nemo.localhost/api/usage_events/100/",
        dt_from=datetime(2024, 7, 1, 10, 0, tzinfo=timezone.utc),
        dt_to=datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc),
        user="captain",
    )
    res_event = res_event_from_session(session, nemo_connector)

    # Verify metadata came from run_data (not the reservation)
    assert res_event.experiment_title == "Au-TiO2 characterization"
    assert res_event.experiment_purpose == "Measuring particle size distribution"
    assert "http://nemo.localhost/event_details/usage/100/" in res_event.url
Test coverage includes:
Three-tier priority ordering (run_data > pre_run_data > reservation)
Data consent validation and rejection
JSON parsing error handling
Empty/missing field fallback behavior
Operator vs. user field handling
Helper function validation (has_valid_question_data())
See the TestNemoUsageEventQuestions class in tests/integration/test_nemo_integration.py for the complete test suite.
2. CDCS Integration Tests#
The cdcs_client fixture provides connection information and utilities for the CDCS Docker instance:
cdcs_client["url"]: CDCS base URL (e.g.,http://cdcs.localhost/)cdcs_client["username"]: Authentication username for CDCS APIcdcs_client["password"]: Authentication password for CDCS APIcdcs_client["register_record"](record_id): Register a record ID for automatic cleanup after testcdcs_client["created_records"]: List of all registered record IDs
@pytest.mark.integration
def test_cdcs_record_upload(cdcs_client):
    """Test uploading and retrieving records from CDCS."""
    import nexusLIMS.cdcs as cdcs

    xml_content = '''<?xml version="1.0" encoding="UTF-8"?>
<Experiment>...</Experiment>'''
    record_id = cdcs.upload_record_content(xml_content, "Test Record")
    cdcs_client["register_record"](record_id)  # Auto-cleanup after test
    assert record_id is not None
3. End-to-End Workflow Tests#
The test_environment_setup fixture configures a complete end-to-end test environment with all services and test data:
test_environment_setup["instrument_pid"]: Test instrument ID (e.g.,"FEI-Titan-TEM")test_environment_setup["dt_from"]: Expected session start datetimetest_environment_setup["dt_to"]: Expected session end datetimetest_environment_setup["user"]: Expected username for test sessiontest_environment_setup["instrument_db"]: Configured test instrument databasetest_environment_setup["cdcs_client"]: CDCS client configuration
This fixture automatically:
Starts all Docker services (NEMO, CDCS, MailPit, fileserver)
Configures NEMO harvester with test data
Sets up test database with instruments
Extracts test microscopy files
Configures CDCS client for uploads
@pytest.mark.integration
def test_complete_record_building(test_environment_setup):
    """Test the complete NEMO → Record Builder → CDCS workflow."""
    from nexusLIMS.builder.record_builder import process_new_records
    from nexusLIMS.harvesters.nemo.utils import add_all_usage_events_to_db

    # Harvest from NEMO
    add_all_usage_events_to_db()
    # Build and upload records
    process_new_records()
    # Verify records in CDCS
    # ... verification ...
4. Error Handling Tests#
The nemo_connector fixture provides a pre-configured NemoConnector instance for testing. It differs from nemo_client in that:
nemo_client: Returns a dict with connection information (URL, token, timezone). Use it when you need to manually create a connector or test connection parameters.
nemo_connector: Returns a ready-to-use NemoConnector instance configured with the test database and NEMO client settings. Use it when you just need a working connector.
@pytest.mark.integration
def test_nemo_connection_failure(nemo_connector, monkeypatch):
    """Test graceful handling of NEMO connection failures."""
    import requests

    from nexusLIMS.harvesters.nemo.utils import add_all_usage_events_to_db

    # Simulate a network error on every HTTP GET
    def _raise_connection_error(*args, **kwargs):
        raise requests.ConnectionError("Network error")

    monkeypatch.setattr("requests.get", _raise_connection_error)

    # The failure should surface as a ConnectionError
    with pytest.raises(requests.ConnectionError):
        add_all_usage_events_to_db()
Debugging Integration Tests#
View Service Logs#
cd tests/integration/docker
# View logs from all services
docker compose logs
# View logs from specific service
docker compose logs nemo
docker compose logs cdcs
docker compose logs mailpit
# Follow logs in real-time
docker compose logs -f nemo
# Show last 100 lines
docker compose logs --tail=100
Access Service Web UIs#
NEMO: http://nemo.localhost (or http://localhost:8000)
CDCS: http://cdcs.localhost (or http://localhost:8080) – this can be useful to inspect records during/after tests
MailPit: http://mailpit.localhost (or http://localhost:8025)
Fileserver: http://fileserver.localhost/data (or http://localhost:8081/data)
Use Standalone Fileserver#
For debugging file serving issues:
python tests/integration/debug_fileserver.py
This starts the same fileserver used in tests on port 8081 for manual testing.
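Under the hood it is essentially a directory-serving HTTP server; a minimal stand-in using only the standard library (not the actual debug_fileserver.py) would be:
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Serve the current directory on port 8081, mirroring the test fileserver's role
handler = partial(SimpleHTTPRequestHandler, directory=".")
HTTPServer(("", 8081), handler).serve_forever()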
Troubleshooting#
Services Fail to Start#
Check Docker daemon:
docker ps # Should list running containers
Check service logs:
cd tests/integration/docker
docker compose logs nemo
docker compose logs cdcs
Common causes:
Ports already in use (check with lsof -i :8000)
Insufficient Docker resources (Docker Desktop settings)
Previous containers not cleaned (run docker compose down -v)
Health Checks Timeout#
Increase timeout in conftest.py:
# Change this value (in seconds)
HEALTH_CHECK_TIMEOUT = 300 # Increased from 180
Or, in development only, start services without waiting for health checks:
docker compose up -d  # Not recommended for CI
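To watch what the health checks are waiting on, you can poll a container directly (a small sketch using docker inspect; substitute the container name shown by docker compose ps):
import subprocess
import time

def wait_for_healthy(container, timeout=180):
    """Poll a container's Docker health status until healthy or timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = subprocess.run(
            ["docker", "inspect", "--format", "{{.State.Health.Status}}", container],
            capture_output=True,
            text=True,
        ).stdout.strip()
        if status == "healthy":
            return
        time.sleep(5)
    raise TimeoutError(f"{container} was not healthy after {timeout}s")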
Tests Fail with “Connection Refused”#
Ensure services are running:
docker compose ps
# STATUS should show "healthy" or "running"
If not healthy, restart:
docker compose down -v
docker compose up -d
# Wait for health checks to pass
Database Locks#
If tests hang on database operations:
Stop all tests: Ctrl+C
Clean up: rm /tmp/nexuslims-test.db*
Restart services: docker compose down -v && docker compose up -d
CDCS Upload Failures#
Check credentials:
# Should return 200 status with some workspace data
curl -u admin:admin http://cdcs.localhost/rest/workspace/
Check XML validity:
Use xmllint: xmllint --schema schema.xsd record.xml
Validate in CDCS web UI
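Schema validation can also be scripted in Python with lxml if you prefer that over xmllint (file paths here are illustrative):
from lxml import etree

# Load the schema and the record to validate (illustrative paths)
schema = etree.XMLSchema(etree.parse("schema.xsd"))
record = etree.parse("record.xml")

# Raises etree.DocumentInvalid with details if the record does not conform
schema.assertValid(record)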
Cleanup Issues#
Manual cleanup:
# Stop all services
docker compose down
# Remove volumes
docker volume prune -f
# Remove test data
rm -rf /tmp/nexuslims-test-*
# Clean Docker system
docker system prune -a --volumes
Best Practices#
1. Always Use Fixtures#
# Good - uses fixtures
def test_something(nemo_client, cdcs_client):
    # ...

# Bad - hardcoded URLs
def test_something():
    requests.get("http://localhost:8000/")  # Don't do this
2. Mark Tests Properly#
# Good
@pytest.mark.integration
def test_complete_workflow(test_environment_setup):
    # ...

# Bad - missing integration marker
def test_something():
    # ...
3. Use Descriptive Names#
# Good
def test_nemo_harvester_creates_session_for_usage_event():
    # ...

# Bad
def test_harvester():
    # ...
4. Clean Up Resources#
# Good - use the cdcs_client fixture
def test_upload(cdcs_client):
    record_id = cdcs.upload_record_content(xml, "Test")
    cdcs_client["register_record"](record_id)  # Auto-cleanup

# Bad - manual cleanup required
def test_upload():
    record_id = cdcs.upload_record_content(xml, "Test")
    # No cleanup = test pollution
5. Test One Thing Per Test#
# Good - tests a single behavior
def test_nemo_connector_retrieves_users():
    # Only test user retrieval

# Bad - tests multiple behaviors
def test_nemo_connector_everything():
    # Tests users, tools, projects, and reservations
Performance Optimization#
Session-Scoped Fixtures#
Services start once per test session (not per test):
# conftest.py
@pytest.fixture(scope="session")
def docker_services():
    # Starts once, runs for the entire session
    # ...
This means services stay running across all tests, greatly improving performance.
Selective Service Startup#
If only testing specific components:
cd tests/integration/docker
docker compose up -d nemo # Only start NEMO
CI/CD Integration#
Integration tests run automatically in GitHub Actions:
Trigger: Every push to main or feature branches
Schedule: Nightly at 3 AM UTC
Environment: Ubuntu latest with Docker
Timeout: 600 seconds per test
Coverage: Reported to Codecov with the integration flag
Running in GitHub Actions#
Tests use pre-built images from GitHub Container Registry when available, falling back to local builds.
Workflow file: .github/workflows/integration-tests.yml
Adding New Integration Tests#
Template#
"""
Integration tests for [feature].
This module tests [what functionality] by interacting with real
Docker services instead of mocks.
"""
import pytest
@pytest.mark.integration
class Test[FeatureName]:
"""Integration tests for [feature]."""
def test_[specific_behavior](self, [required_fixtures]):
"""
Test [what you're testing].
This test verifies that:
1. [Behavior one]
2. [Behavior two]
3. [Expected outcome]
Parameters
----------
[fixture_name] : [type]
Description of fixture
"""
# Arrange
# ... setup ...
# Act
# ... execute feature ...
# Assert
# ... verify results ...
Checklist#
☐ Module docstring explains what’s being tested
☐ Class docstring summarizes test scope
☐ Each test has a clear docstring with a Parameters section
☐ Test is marked with @pytest.mark.integration
☐ Test name is descriptive (not just “test_something”)
☐ Test follows the Arrange-Act-Assert pattern
☐ Test cleans up resources (use fixtures for this)
☐ Test is independent (no order dependencies)
☐ Test uses fixtures instead of hardcoded values
Further Reading#
Integration Tests README: Quick reference guide in tests/integration/README.md
Docker Services Documentation: Service details in tests/integration/docker/README.md
Shared Test Fixtures: Available fixtures (see tests/fixtures/shared_data.py)
Support#
For issues or questions:
Check the README in tests/integration/README.md
Review test logs: docker compose logs
Search GitHub Issues
Open a new issue with logs and reproduction steps