Skip to content

Deployment and Release Guide

Audience: maintainers operating deployments and release rollouts.

Complete reference for CI/CD pipelines and NERSC Spin deployments.

Table of Contents

Overview

SimBoard uses GitHub Actions to automatically build and publish container images to the NERSC container registry (registry.nersc.gov/e3sm/simboard/).

Key Features:

  • ✅ Automated dev builds from main branch
  • ✅ Component-level production releases via GitHub Releases
  • ✅ Independent frontend and backend versioning
  • ✅ linux/amd64 architecture support
  • ✅ Semantic versioning for production
  • ✅ Docker Buildx with layer caching
  • ✅ Separation via image tags and K8s namespaces

Environment Architecture

Development

Component Hosting Image Pull Policy
Backend NERSC Spin (dev) backend:dev Always
Frontend NERSC Spin (dev) frontend:dev Always

Trigger: Automatic on push to main

Production

Component Hosting Image Pull Policy
Backend NERSC Spin (prod) backend:X.Y.Z IfNotPresent
Frontend NERSC Spin (prod) frontend:X.Y.Z IfNotPresent

Trigger: Component-scoped GitHub Release tag (for example, backend-vX.Y.Z, frontend-vX.Y.Z)

Note: Frontend and backend are versioned independently. Each component can be released on its own schedule without affecting the other.

CI/CD Workflows

Dev Builds (push to main)

Dev workflows build and push images tagged with :dev and :sha-<commit> whenever changes are pushed to main. These do not affect production images.

Backend Dev Workflow

Triggers: Push to main (backend changes) or manual dispatch

Tags: :dev, :sha-<commit>

Registry: registry.nersc.gov/e3sm/simboard/backend

Frontend Dev Workflow

Triggers: Push to main (frontend changes) or manual dispatch

Tags: :dev, :sha-<commit>

Build args:

  • VITE_API_BASE_URL: https://simboard-dev-api.e3sm.org (default)

Registry: registry.nersc.gov/e3sm/simboard/frontend

Release Builds (component-scoped tags)

Release workflows are triggered by component-scoped Git tags created through GitHub Releases. Each component has its own workflow and tag namespace. Release builds do not modify the :dev image.

Backend Prod Workflow

Triggers: Tag push matching backend-v*

Tags: :X.Y.Z, :sha-<commit>, :latest

Registry: registry.nersc.gov/e3sm/simboard/backend

Frontend Prod Workflow

Triggers: Tag push matching frontend-v*

Tags: :X.Y.Z, :sha-<commit>, :latest

Build args:

  • VITE_API_BASE_URL: https://simboard-api.e3sm.org (default, override in manual dispatch)

Registry: registry.nersc.gov/e3sm/simboard/frontend

Build Flow Summary

Dev builds:     push to main     → :dev, :sha-<short>
Release builds: component tag    → :X.Y.Z, :sha-<short>, :latest

GitHub Secrets Setup

Required secrets: Configure in repository settings

  1. NERSC_REGISTRY_USERNAME
  2. Your NERSC username
  3. Used for docker login registry.nersc.gov

  4. NERSC_REGISTRY_PASSWORD

  5. Your NERSC password or access token
  6. Used for docker login registry.nersc.gov

Test locally:

docker login registry.nersc.gov
# Use the same credentials

Security:

  • Use service account tokens when available
  • Rotate credentials on a schedule that matches current NERSC and project policy
  • Never commit credentials to source code

Image Tagging Strategy

Development Images

Tag Description Use Case
:dev Latest from main Primary dev deployment
:sha-a1b2c3d Specific commit Debugging, rollback

Production Images

Tag Description Use Case
:X.Y.Z Full version Production (recommended)
:latest Latest release Reference only

Best practice: Use full semantic versions (:X.Y.Z) in production for reproducibility.

Tag Convention

Git Tag Component Docker Image Tag
backend-vX.Y.Z Backend registry.nersc.gov/e3sm/simboard/backend:X.Y.Z
frontend-vX.Y.Z Frontend registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z

Development Deployment

Update Dev Environment

Development images are automatically built and pushed when you push to main. To deploy the updated images on NERSC Spin, use the Rancher UI:

  1. Navigate to Workloads → Deployments in the dev namespace
  2. Find the backend or frontend deployment
  3. Click ⋮ → Redeploy to pull the latest :dev image
  4. Verify pods restart successfully in the Pods tab

Image Configuration

When creating or editing a workload in Rancher, set these values:

Dev backend:

  • Image: registry.nersc.gov/e3sm/simboard/backend:dev
  • Pull Policy: Always

Dev frontend:

  • Image: registry.nersc.gov/e3sm/simboard/frontend:dev
  • Pull Policy: Always

Production Release Process

Frontend and backend are released independently using component-scoped tags. Creating a GitHub Release with the appropriate tag triggers the corresponding CI workflow.

Step 1: Prepare Release

# Ensure main is up to date
git checkout main && git pull

# Run tests
make backend-test
make frontend-lint

Step 2a: Create GitHub Release (Frontend)

  1. Navigate to Releases
  2. Click Draft a new release
  3. In Choose a tag, enter a new tag following the convention:
frontend-vX.Y.Z
  1. Ensure the Target branch is main
  2. Set the release title (for example, Frontend vX.Y.Z)
  3. Add release notes summarizing the changes
  4. Click Publish release

Publishing the release creates the Git tag, which:

  • Triggers the frontend release workflow for frontend-v* tags
  • Builds the Docker image
  • Pushes versioned tags (:X.Y.Z, :sha-<short>, :latest) to the registry
  • Does not modify the :dev image

Step 2b: Create GitHub Release (Backend)

Follow the same steps as above, but use a backend-scoped tag:

backend-vX.Y.Z

This triggers the backend release workflow and pushes backend-specific versioned tags.

Step 3: Monitor Builds

Check the Actions tab and confirm that only the workflow matching the component tag triggers.

Step 4: Deploy to Production

Update the image tags in the Rancher UI:

  1. Navigate to Workloads → Deployments in the prod namespace
  2. Click the target deployment → ⋮ → Edit Config
  3. Update the Image field to the new versioned image, e.g.:
  4. Backend: registry.nersc.gov/e3sm/simboard/backend:X.Y.Z
  5. Frontend: registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z
  6. Set Pull Policy to IfNotPresent
  7. Click Save — Rancher will roll out the new version

For backend releases, migrations run automatically in a backend initContainer during rollout. See Database Migrations.

Step 5: Verify Production

  1. In Rancher, check that pods are Running under Workloads → Pods in the prod namespace
  2. Review pod logs via the ⋮ → View Logs action in Rancher
  3. Test endpoints:
  4. https://simboard-api.e3sm.org/api/v1/health
  5. https://simboard.e3sm.org/health

Database Migrations

Database migrations are executed by a backend Deployment initContainer during rollout, not on backend app startup.

Runtime Behavior

  • Backend container starts the API directly and does not run migrations at startup.
  • InitContainer runs before backend container start and executes /app/migrate.sh.
  • migrate.sh validates DATABASE_URL, waits for DB readiness, then runs alembic upgrade head by default.

Spin Workloads

Reference runbook:

  • docs/deploy/nersc-spin-runbook.md

  • Backend service/deployment baseline is defined for in-cluster API routing (backend on 8000).

  • Backend Deployment uses the image entrypoint directly (no app args required).
  • Backend Deployment includes initContainer migrate using the same backend image tag to run Alembic before app start.
  • Frontend service/deployment baseline is defined for UI routing (frontend on 80).
  • Frontend Deployment uses the frontend image default CMD (no explicit args).
  • DB service/deployment baseline is defined for in-cluster Postgres (db).
  • Ingress baseline (lb) terminates TLS via simboard-tls-cert and routes frontend/backend hosts.
  • Backend and migration initContainer env values are sourced via envFrom from secret simboard-backend-env.
  • DB container env values are sourced via envFrom from secret simboard-db.

Deployment Order (Required)

  1. Roll out backend deployment with the target image tag.
  2. Wait for initContainer migration step to succeed.
  3. Confirm backend pods become Running and Ready.

If initContainer migration fails, backend pods will not become ready and rollout should be treated as failed.

Concurrency Note

InitContainers run per pod. If more than one backend pod is created simultaneously, migrations may execute concurrently.

Use an explicit rollout strategy that guarantees only one new pod (and therefore one migration initContainer) is created at a time:

spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1

Why this is required: with default RollingUpdate settings, Kubernetes may create a surge pod during updates, which can run a second migration initContainer even when the steady-state replica count is 1.

If you need replicas > 1, use a DB-level migration lock so only one initContainer can run Alembic at a time. For PostgreSQL, wrap migration execution with a single transaction-scoped advisory lock (for example, SELECT pg_advisory_lock(<fixed_key>); ... alembic upgrade head ...; SELECT pg_advisory_unlock(<fixed_key>);).

Production-safe recommendation: apply both controls (serialized rollout strategy plus DB-level lock) for defense in depth.

Rollback Caveat

Rolling back the backend container image does not roll back database schema automatically. Use backward-compatible migrations (expand/contract pattern), and use a separate, explicit rollback migration only when needed.

Rollback Procedure

Version-tagged images are immutable — once published, a version tag (for example, :X.Y.Z) always refers to the same image. This makes rollbacks safe and predictable.

Rolling Back via Rancher

  1. Open the Rancher UI
  2. Navigate to Workloads → Deployments in the prod namespace
  3. Click the deployment to roll back → ⋮ → Edit Config
  4. Change the Image tag to the previous known-good version, e.g.:
  5. registry.nersc.gov/e3sm/simboard/backend:X.Y.Z
  6. registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z
  7. Click Save to trigger the rollout

Alternatively, use the built-in Rancher rollback:

  1. Navigate to the deployment → ⋮ → Rollback
  2. Select the previous revision and confirm

Key Rollback Principles

  • Version tags are immutable: A published :X.Y.Z tag always points to the same image digest. You can safely redeploy any previously released version.
  • Components are independent: Rolling back the frontend does not require rolling back the backend, and vice versa.
  • :dev is unaffected: Release rollbacks have no impact on the dev environment.
  • Use commit-based tags for precision: If you need to deploy a specific build, use the :sha-<short> tag from the GitHub Actions build log.

Manual Builds

For testing or emergency builds, you can manually build and push images using Docker Buildx. This is not recommended for regular use, as it bypasses CI checks and versioning conventions.

First login to the NERSC registry:

docker login registry.nersc.gov

Backend

cd backend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg ENV=production \
  -t registry.nersc.gov/e3sm/simboard/backend:dev-manual \
  --push \
  .

Frontend (with API URL override)

# Development
cd frontend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg VITE_API_BASE_URL=https://simboard-dev-api.e3sm.org \
  -t registry.nersc.gov/e3sm/simboard/frontend:dev-manual \
  --push \
  .
# Production
cd frontend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg VITE_API_BASE_URL=https://simboard-api.e3sm.org \
  -t registry.nersc.gov/e3sm/simboard/frontend:prod-manual \
  --push \
  .

Troubleshooting

Authentication Failures

Issue: denied: requested access to the resource is denied

Solutions:

  1. Verify GitHub Secrets are configured
  2. Test credentials: docker login registry.nersc.gov
  3. Check NERSC account has push permissions to e3sm/simboard/ namespace

Build Failures

Issue: Workflow fails during build

Solutions:

  1. Check workflow logs in Actions tab
  2. Test Dockerfile locally:
cd backend && docker build .
cd frontend && docker build --build-arg VITE_API_BASE_URL=https://example.com .
  1. Verify all dependencies are pinned

Dev Image Not Updating

Issue: NERSC Spin not pulling latest :dev

Solutions:

  1. Verify image was built (check GitHub Actions)
  2. In Rancher, redeploy the workload: Workloads → Deployments → ⋮ → Redeploy
  3. Check that Pull Policy is set to Always for :dev tags

Wrong API URL in Frontend

Issue: Frontend connecting to wrong backend

Solutions:

  1. Check VITE_API_BASE_URL in workflow file
  2. Rebuild with manual dispatch and correct URL
  3. Verify environment-specific URLs:
  4. Dev: https://simboard-dev-api.e3sm.org
  5. Prod: https://simboard-api.e3sm.org

Workflow Not Triggering

Issue: Push to main doesn't trigger build

Solutions:

  1. Verify changes are in watched paths:
  2. Backend: backend/**
  3. Frontend: frontend/**
  4. Check workflow files exist and are on main branch
  5. Verify Actions are enabled in repository settings

Issue: Release tag doesn't trigger prod build

Solutions:

  1. Verify the tag follows the component convention:
  2. Backend: backend-vX.Y.Z
  3. Frontend: frontend-vX.Y.Z
  4. Ensure the tag was created via a published GitHub Release (draft releases do not create tags)
  5. Check the Actions tab for the corresponding workflow

Additional Resources

Support