Troubleshooting#

This guide covers common issues and their solutions for NexusLIMS-CDCS deployments.

Quick Diagnostics#

Before diving into specific issues, gather diagnostic information:

cd /path/to/NexusLIMS-CDCS/deployment
source admin-commands.sh

# Check service status
dc-prod ps

# View recent logs
dc-prod logs --tail=100

# Check system stats
admin-stats
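
When the log output is long, it can help to narrow it to lines that look like failures. The helper below is a minimal sketch: `filter_errors` is a hypothetical name, and the keyword list is an assumption that may need tuning for your services.

```shell
# Keep only log lines that look like errors; reads log text on stdin.
# The keyword list is an assumption, not something defined by NexusLIMS-CDCS.
filter_errors() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -iE 'error|fatal|traceback|exception' || true
}

# Usage (after sourcing admin-commands.sh):
#   dc-prod logs --tail=100 | filter_errors
```

An empty result only means none of the keywords appeared; it does not prove the services are healthy.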

Certificate Issues#

Problem: “Certificate not trusted” or SSL warnings#

In Development:

The development environment uses a local CA. Trust its root certificate on your workstation:

# macOS
sudo security add-trusted-cert -d -r trustRoot \
  -k /Library/Keychains/System.keychain caddy/certs/ca.crt

# Linux (Debian/Ubuntu)
sudo cp caddy/certs/ca.crt /usr/local/share/ca-certificates/nexuslims-dev-ca.crt
sudo update-ca-certificates

In Production (ACME/Let’s Encrypt):

  1. Verify DNS points to server:

    dig nexuslims.example.com
    
  2. Check Caddy logs for certificate errors:

    dc-prod logs caddy | grep -i cert
    
  3. Verify ports 80 and 443 are accessible:

    sudo netstat -tlnp | grep -E ':(80|443)'
    
  4. Test ACME challenge endpoint:

    curl http://nexuslims.example.com/.well-known/acme-challenge/test
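
The DNS check in step 1 can be turned into a pass/fail test. This is a sketch only: `dns_matches` is a hypothetical helper, and `203.0.113.10` is a placeholder address that you would replace with your server's public IP.

```shell
# Return 0 if HOST has an A record equal to EXPECTED_IP.
dns_matches() {
    host=$1
    expected_ip=$2
    # +short prints one address per line; -x requires a whole-line match.
    dig +short "$host" A | grep -qxF "$expected_ip"
}

# Usage (203.0.113.10 is a placeholder):
#   dns_matches nexuslims.example.com 203.0.113.10 && echo "DNS OK"
```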
    

Problem: Certificate renewal failing#

ACME certificates auto-renew. If renewal fails:

  1. Check Caddy data volume:

    docker volume inspect nexuslims_prod_caddy_data
    
  2. Restart Caddy to trigger a new renewal attempt:

    dc-prod restart caddy
    
  3. Verify ACME email is set:

    grep CADDY_ACME_EMAIL .env
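
Several checks in this guide come down to reading a single setting from `.env`. The sketch below shows one way to do that in a script; `env_get` and `env_require` are hypothetical helper names, and the parsing assumes simple `KEY=value` lines.

```shell
# Print the value of KEY from an env file (default: .env).
# Commented-out lines are ignored; one layer of surrounding quotes is stripped.
env_get() {
    key=$1
    file=${2:-.env}
    sed -n "s/^${key}=//p" "$file" | tail -n 1 |
        sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}

# Return non-zero (with a message on stderr) if KEY is missing or empty.
env_require() {
    if [ -z "$(env_get "$1" "${2:-.env}")" ]; then
        echo "Missing or empty: $1" >&2
        return 1
    fi
}

# Usage:
#   env_require CADDY_ACME_EMAIL || echo "Set CADDY_ACME_EMAIL in .env"
```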
    

Database Issues#

Problem: “FATAL: password authentication failed”#

  1. Verify password in .env:

    grep POSTGRES_PASS .env
    
  2. Restart services:

    dc-prod restart
    
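  3. If the password was changed after the database was first initialized, the new value in .env is typically not applied to the existing PostgreSQL data volume. As a last resort, back up your data and reset the volumes:

    dc-prod down -v  # WARNING: Removes all volumes of the stack and deletes all data!
    dc-prod up -d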
    

Problem: Database connection timeout#

  1. Check PostgreSQL is running:

    dc-prod ps postgres
    
  2. View PostgreSQL logs:

    dc-prod logs postgres
    
  3. Check container health:

    docker inspect nexuslims_prod_cdcs_postgres | grep -A 10 Health
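
As an alternative to grepping the full inspect output in step 3, `docker inspect --format` can print just the health status. The following is a sketch; `container_health` and `wait_healthy` are hypothetical helpers, and the 2-second polling interval is an arbitrary choice.

```shell
# Print the health status of a container, or "no-healthcheck" if none is defined.
container_health() {
    docker inspect --format \
        '{{if .State.Health}}{{.State.Health.Status}}{{else}}no-healthcheck{{end}}' "$1"
}

# Poll until the container reports "healthy"; give up after TRIES attempts.
wait_healthy() {
    name=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        [ "$(container_health "$name")" = "healthy" ] && return 0
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Usage:
#   wait_healthy nexuslims_prod_cdcs_postgres || dc-prod logs --tail=50 postgres
```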
    

Problem: Database disk full#

  1. Check disk usage:

    docker system df
    df -h
    
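  2. Vacuum the database (note that --full rewrites each table, so it needs free disk space for the copy and locks the table while it runs):

    docker exec nexuslims_prod_cdcs_postgres vacuumdb -U nexuslims --all --full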
    
  3. Clean up Docker resources:

    docker system prune -f
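
To catch a filling disk before PostgreSQL starts failing, the `df` check from step 1 can be wrapped in a threshold test. This is a sketch under assumptions: `disk_used_pct` and `disk_check` are hypothetical helpers, and the 90% default limit is arbitrary.

```shell
# Print the used-space percentage (integer, no % sign) of the filesystem holding PATH.
disk_used_pct() {
    # -P forces the POSIX single-line format; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return non-zero when usage is at or above LIMIT percent (default 90).
disk_check() {
    path=${1:-/}
    limit=${2:-90}
    used=$(disk_used_pct "$path")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $path is ${used}% full"
}

# Usage:
#   disk_check /var/lib/docker 85
```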
    

File Serving Issues#

Problem: 404 errors on file server#

  1. Verify file paths in .env:

    grep -E '^NX_(INSTRUMENT_)?DATA_HOST_PATH=' .env
    
  2. Check directories exist and have correct permissions (this assumes both variables are set in your current shell; otherwise substitute the paths printed in step 1):

    ls -la "$NX_DATA_HOST_PATH"
    ls -la "$NX_INSTRUMENT_DATA_HOST_PATH"
    
  3. Verify mounts in container:

    docker exec nexuslims_prod_cdcs ls -la /srv/nx-data
    docker exec nexuslims_prod_cdcs ls -la /srv/nx-instrument-data
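
To see which host directory is actually mounted at each container path, the mount table can be read from `docker inspect`. The helpers below are a sketch; `container_mounts` and `has_mount` are hypothetical names, and the destination match assumes paths without spaces.

```shell
# List "host path -> container path" pairs for a container's mounts.
container_mounts() {
    docker inspect --format \
        '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' "$1"
}

# Return 0 if DEST is a mount destination inside CONTAINER.
has_mount() {
    container_mounts "$1" |
        awk -v dest="$2" '$NF == dest { found = 1 } END { exit !found }'
}

# Usage:
#   has_mount nexuslims_prod_cdcs /srv/nx-data || echo "nx-data is not mounted"
```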
    

Problem: Permission denied errors#

  1. Check directory ownership:

    ls -la /mnt/nexuslims/
    
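  2. Fix ownership and permissions (if needed). The capital X grants execute only on directories and on files that are already executable:

    sudo chown -R "$USER:$USER" /mnt/nexuslims/
    sudo chmod -R u=rwX,go=rX /mnt/nexuslims/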
    

Problem: Files not appearing after mount change#

Restart services after changing mount paths:

dc-prod down
dc-prod up -d

Application Issues#

Problem: Container exits immediately#

  1. Check logs:

    dc-prod logs cdcs
    
  2. Verify environment variables:

    dc-prod config
    
  3. Check health status:

    docker inspect nexuslims_prod_cdcs | grep -A 10 Health
    

Problem: Application returns 500 errors#

  1. Check application logs:

    dc-prod logs -f cdcs
    
  2. Enable Django debug mode temporarily:

    # In .env
    DJANGO_DEBUG=True
    
    # Then restart
    dc-prod restart cdcs
    

    Warning

    Remember to disable debug mode after troubleshooting!

  3. Check database connectivity (without --database, Django skips the checks that need a database connection):

    docker exec nexuslims_prod_cdcs python manage.py check --database default
    

Problem: Slow response times#

  1. Check resource usage:

    docker stats
    
  2. Check database performance:

    docker exec nexuslims_prod_cdcs_postgres psql -U nexuslims -d nexuslims \
      -c "SELECT pid, query, state FROM pg_stat_activity WHERE state != 'idle';"
    
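  3. Optimize the database, preferably during a maintenance window (--full takes an exclusive lock on each table while it is rewritten):

    docker exec nexuslims_prod_cdcs_postgres vacuumdb -U nexuslims --all --full --analyze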
    
  4. Consider increasing Gunicorn workers:

    # In .env
    GUNICORN_WORKERS=8
    GUNICORN_THREADS=4
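
For choosing a worker count, the Gunicorn documentation suggests (2 x cores) + 1 as a starting point. The sketch below computes that figure; `suggest_workers` is a hypothetical helper, and the result is only a first guess to adjust against `docker stats` under real load.

```shell
# Print a starting value for GUNICORN_WORKERS: (2 x cores) + 1.
# With no argument, the number of online CPU cores is detected via getconf.
suggest_workers() {
    cores=${1:-$(getconf _NPROCESSORS_ONLN)}
    echo $(( 2 * $cores + 1 ))
}

# Usage:
#   echo "GUNICORN_WORKERS=$(suggest_workers)"
```

For example, `suggest_workers 4` prints `9`.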
    

XSLT Issues#

Problem: XSLT changes not appearing#

XSLT stylesheets are stored in the database. After editing .xsl files:

Development:

source dev-commands.sh
dev-update-xslt

Production:

docker exec nexuslims_prod_cdcs bash /srv/scripts/update-xslt.sh

Then refresh your browser (clear cache if needed).

Problem: Wrong URLs in rendered HTML#

  1. Check XSLT URL configuration:

    grep XSLT_ .env
    
  2. Re-upload XSLT with correct URLs:

    docker exec nexuslims_prod_cdcs bash /srv/scripts/update-xslt.sh
    

Problem: XSLT parsing errors#

Check that each XSL file is well-formed XML:

xmllint --noout xslt/detail_stylesheet.xsl
xmllint --noout xslt/list_stylesheet.xsl

Backup Issues#

Problem: Permission denied when creating backup#

The backup directory on the host must be owned by the user that runs the Docker commands:

sudo chown $USER:$USER /opt/nexuslims/backups

On macOS, use $USER:staff:

sudo chown $USER:staff /opt/nexuslims/backups

Problem: Backup directory not found in container#

Ensure the backup path is mounted. Check .env:

grep NX_CDCS_BACKUPS_HOST_PATH .env

Restart services after setting it:

dc-prod down
dc-prod up -d

Problem: Restore fails with “file not found”#

The restore command expects the container path, not the host path. Use admin-restore, which handles the path conversion:

source admin-commands.sh
admin-restore /opt/nexuslims/backups/backup_20260115_143022
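
Because backup names embed a sortable timestamp (`backup_YYYYMMDD_HHMMSS`, as in the example above), the newest backup can be found by name alone. This is a minimal sketch; `latest_backup` is a hypothetical helper, and it assumes every backup follows that naming pattern.

```shell
# Print the newest backup_* entry under DIR (a host path); return 1 if none exists.
latest_backup() {
    dir=${1:-/opt/nexuslims/backups}
    latest=
    # Glob results are sorted by name, so the last match has the latest timestamp.
    for entry in "$dir"/backup_*; do
        [ -e "$entry" ] && latest=$entry
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Usage:
#   admin-restore "$(latest_backup)"
```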

Network Issues#

Problem: Port already in use#

  1. Find what’s using the port:

    sudo lsof -i :80
    sudo lsof -i :443
    
  2. Stop the conflicting service or change ports in .env:

    # Use alternative ports
    HTTP_PORT=8080
    HTTPS_PORT=8443
    

Problem: Cannot access from other machines#

  1. Check firewall rules:

    sudo ufw status  # Ubuntu
    sudo firewall-cmd --list-all  # RHEL
    
  2. Verify ALLOWED_HOSTS includes the hostname/IP:

    grep ALLOWED_HOSTS .env
    

Common Error Messages#

“OperationalError: FATAL: database does not exist”#

The database hasn’t been created. Either:

  1. Start fresh: dc-prod up -d (creates database on first run)

  2. Or restore from backup: admin-db-restore <backup.sql>

“DisallowedHost at /”#

Add the hostname to ALLOWED_HOSTS in .env:

ALLOWED_HOSTS=nexuslims.example.com,www.nexuslims.example.com
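
To confirm that a given hostname is covered by the current value, a quick membership test can help. This sketch assumes the comma-separated format shown above and performs exact matches only, so it does not account for Django's wildcard entries (a leading dot or `*`); `host_allowed` is a hypothetical helper.

```shell
# Return 0 if HOST appears in a comma-separated ALLOWED_HOSTS value.
host_allowed() {
    host=$1
    list=$2
    # Surrounding commas make every entry match as a whole item.
    case ",$list," in
        *",$host,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   host_allowed nexuslims.example.com "$(grep '^ALLOWED_HOSTS=' .env | cut -d= -f2-)"
```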

“CSRF verification failed”#

Add the URL to CSRF_TRUSTED_ORIGINS in .env:

CSRF_TRUSTED_ORIGINS=https://nexuslims.example.com
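
Each trusted origin must include the scheme, so the value can be derived from the host list. The sketch below assumes HTTPS on the default port; if you serve on a non-default port such as 8443, the port would need to be appended to each origin. `csrf_origins` is a hypothetical helper.

```shell
# Turn "a.example.com,b.example.com" into "https://a.example.com,https://b.example.com".
csrf_origins() {
    printf '%s\n' "$1" | sed -e 's|^|https://|' -e 's|,|,https://|g'
}

# Usage:
#   echo "CSRF_TRUSTED_ORIGINS=$(csrf_origins nexuslims.example.com,www.nexuslims.example.com)"
```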

“No module named ‘config.settings.xxx’”#

Verify DJANGO_SETTINGS_MODULE in .env points to a valid settings file:

DJANGO_SETTINGS_MODULE=config.settings.prod_settings
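
A dotted module name maps to a file path, which makes it possible to check that the settings file exists. This is a sketch: `settings_module_path` is a hypothetical helper, and it assumes the module is a single `.py` file located relative to the application's working directory in the container (a settings package with `__init__.py` would need a different check).

```shell
# Convert a dotted Python module name into a relative .py file path.
settings_module_path() {
    printf '%s.py\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Usage (assumes the container's working directory is the project root):
#   docker exec nexuslims_prod_cdcs test -f "$(settings_module_path config.settings.prod_settings)" \
#     || echo "settings file not found"
```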

Getting Help#

If you can’t resolve an issue:

  1. Gather diagnostic information:

    dc-prod ps
    dc-prod logs --tail=500 > logs.txt
    admin-stats
    
  2. Check for similar issues: datasophos/NexusLIMS-CDCS#issues

  3. Open a new issue with:

    • Description of the problem

    • Steps to reproduce

    • Relevant log output

    • Environment details (OS, Docker version, etc.)

  4. Professional support is available from Datasophos
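
The environment details requested in step 3 can be collected with a small helper and pasted into the issue. This is a minimal sketch; `env_details` is a hypothetical name, and the fields shown are only a suggestion.

```shell
# Print basic environment details for a bug report.
env_details() {
    echo "OS: $(uname -srm)"
    echo "Docker: $(docker --version 2>/dev/null || echo 'not found')"
    echo "Compose: $(docker compose version 2>/dev/null || echo 'not found')"
    echo "Collected: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Usage:
#   env_details > environment.txt
```

Review the output and the collected logs for passwords or other secrets before attaching them to a public issue.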