Deployment and Release Guide
Audience: maintainers operating deployments and release rollouts.
Complete reference for CI/CD pipelines and NERSC Spin deployments.
Table of Contents
- Overview
- Quick Links
- Environment Architecture
- CI/CD Workflows
- GitHub Secrets Setup
- Image Tagging Strategy
- Development Deployment
- Production Release Process
- Database Migrations
- Rollback Procedure
- Manual Builds
- Troubleshooting
Overview
SimBoard uses GitHub Actions to automatically build and publish container images to the NERSC container registry (registry.nersc.gov/e3sm/simboard/).
Key Features:
- ✅ Automated dev builds from the main branch
- ✅ Component-level production releases via GitHub Releases
- ✅ Independent frontend and backend versioning
- ✅ linux/amd64 architecture support
- ✅ Semantic versioning for production
- ✅ Docker Buildx with layer caching
- ✅ Separation via image tags and K8s namespaces
Quick Links
- Harbor Registry: https://registry.nersc.gov/harbor/projects
- Rancher Dashboard: https://rancher2.spin.nersc.gov/dashboard/home
- GitHub Actions: https://github.com/E3SM-Project/simboard/actions
- NERSC Spin Runbook (Rancher UI): docs/deploy/nersc-spin-runbook.md
Environment Architecture
Development
| Component | Hosting | Image | Pull Policy |
|---|---|---|---|
| Backend | NERSC Spin (dev) | backend:dev | Always |
| Frontend | NERSC Spin (dev) | frontend:dev | Always |
Trigger: Automatic on push to main
Production
| Component | Hosting | Image | Pull Policy |
|---|---|---|---|
| Backend | NERSC Spin (prod) | backend:X.Y.Z | IfNotPresent |
| Frontend | NERSC Spin (prod) | frontend:X.Y.Z | IfNotPresent |
Trigger: Component-scoped GitHub Release tag (for example, backend-vX.Y.Z, frontend-vX.Y.Z)
Note: Frontend and backend are versioned independently. Each component can be released on its own schedule without affecting the other.
CI/CD Workflows
Dev Builds (push to main)
Dev workflows build and push images tagged with :dev and :sha-<commit> whenever changes are pushed to main. These do not affect production images.
Backend Dev Workflow
Triggers: Push to main (backend changes) or manual dispatch
Tags: :dev, :sha-<commit>
Registry: registry.nersc.gov/e3sm/simboard/backend
Frontend Dev Workflow
Triggers: Push to main (frontend changes) or manual dispatch
Tags: :dev, :sha-<commit>
Build args:
- VITE_API_BASE_URL: https://simboard-dev-api.e3sm.org (default)
Registry: registry.nersc.gov/e3sm/simboard/frontend
Release Builds (component-scoped tags)
Release workflows are triggered by component-scoped Git tags created through GitHub Releases. Each component has its own workflow and tag namespace. Release builds do not modify the :dev image.
Backend Prod Workflow
Triggers: Tag push matching backend-v*
Tags: :X.Y.Z, :sha-<commit>, :latest
Registry: registry.nersc.gov/e3sm/simboard/backend
Frontend Prod Workflow
Triggers: Tag push matching frontend-v*
Tags: :X.Y.Z, :sha-<commit>, :latest
Build args:
- VITE_API_BASE_URL: https://simboard-api.e3sm.org (default, override in manual dispatch)
Registry: registry.nersc.gov/e3sm/simboard/frontend
Build Flow Summary
Dev builds: push to main → :dev, :sha-<short>
Release builds: component tag → :X.Y.Z, :sha-<short>, :latest
GitHub Secrets Setup
Required secrets: Configure in repository settings
- NERSC_REGISTRY_USERNAME
  - Your NERSC username
  - Used for docker login registry.nersc.gov
- NERSC_REGISTRY_PASSWORD
  - Your NERSC password or access token
  - Used for docker login registry.nersc.gov
Test locally:
docker login registry.nersc.gov
# Use the same credentials
Security:
- Use service account tokens when available
- Rotate credentials on a schedule that matches current NERSC and project policy
- Never commit credentials to source code
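For scripted checks of the same credentials, the login can be done non-interactively. The sketch below assumes the two values are exported as environment variables named after the GitHub secrets; the function name `registry_login` is hypothetical and not part of the repository.

```shell
# Sketch: log in without putting the password on the command line.
# Assumes NERSC_REGISTRY_USERNAME and NERSC_REGISTRY_PASSWORD are exported in
# the current shell, mirroring the GitHub secret names above.
registry_login() {
  if [ -z "${NERSC_REGISTRY_USERNAME:-}" ] || [ -z "${NERSC_REGISTRY_PASSWORD:-}" ]; then
    echo "NERSC_REGISTRY_USERNAME and NERSC_REGISTRY_PASSWORD must be set" >&2
    return 1
  fi
  # --password-stdin keeps the secret out of shell history and process listings.
  printf '%s' "$NERSC_REGISTRY_PASSWORD" |
    docker login registry.nersc.gov \
      --username "$NERSC_REGISTRY_USERNAME" \
      --password-stdin
}
```

Call `registry_login` after exporting both variables; it returns non-zero without contacting the registry if either one is missing.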
Image Tagging Strategy
Development Images
| Tag | Description | Use Case |
|---|---|---|
| :dev | Latest from main | Primary dev deployment |
| :sha-a1b2c3d | Specific commit | Debugging, rollback |
Production Images
| Tag | Description | Use Case |
|---|---|---|
| :X.Y.Z | Full version | Production (recommended) |
| :latest | Latest release | Reference only |
Best practice: Use full semantic versions (:X.Y.Z) in production for reproducibility.
Tag Convention
| Git Tag | Component | Docker Image Tag |
|---|---|---|
| backend-vX.Y.Z | Backend | registry.nersc.gov/e3sm/simboard/backend:X.Y.Z |
| frontend-vX.Y.Z | Frontend | registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z |
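The mapping in the table above can be sketched as a small shell function. The function name `image_ref_for_tag` is hypothetical and is not part of the SimBoard repository; only the registry path and the tag convention come from this guide.

```shell
# Hypothetical helper: map a component-scoped Git tag to its image reference.
# Only the registry path and the <component>-vX.Y.Z convention come from this guide.
REGISTRY="registry.nersc.gov/e3sm/simboard"

image_ref_for_tag() (
  tag="$1"
  component="${tag%%-v*}"   # text before the first "-v", e.g. "backend"
  version="${tag#*-v}"      # text after the first "-v", e.g. "1.4.2"

  case "$component" in
    backend|frontend) ;;
    *) echo "unknown component in tag: $tag" >&2; exit 1 ;;
  esac

  # Require a full X.Y.Z version made of digits only.
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "version is not X.Y.Z: $tag" >&2
    exit 1
  fi

  printf '%s/%s:%s\n' "$REGISTRY" "$component" "$version"
)
```

For example, `image_ref_for_tag backend-v1.4.2` prints `registry.nersc.gov/e3sm/simboard/backend:1.4.2`, while a tag without a component prefix or with a partial version is rejected.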
Development Deployment
Update Dev Environment
Development images are automatically built and pushed when you push to main. To deploy the updated images on NERSC Spin, use the Rancher UI:
- Navigate to Workloads → Deployments in the dev namespace
- Find the backend or frontend deployment
- Click ⋮ → Redeploy to pull the latest :dev image
- Verify pods restart successfully in the Pods tab
Image Configuration
When creating or editing a workload in Rancher, set these values:
Dev backend:
- Image: registry.nersc.gov/e3sm/simboard/backend:dev
- Pull Policy: Always
Dev frontend:
- Image: registry.nersc.gov/e3sm/simboard/frontend:dev
- Pull Policy: Always
Production Release Process
Frontend and backend are released independently using component-scoped tags. Creating a GitHub Release with the appropriate tag triggers the corresponding CI workflow.
Step 1: Prepare Release
# Ensure main is up to date
git checkout main && git pull
# Run tests
make backend-test
make frontend-lint
Step 2a: Create GitHub Release (Frontend)
- Navigate to Releases
- Click Draft a new release
- In Choose a tag, enter a new tag following the convention: frontend-vX.Y.Z
- Ensure the Target branch is main
- Set the release title (for example, Frontend vX.Y.Z)
- Add release notes summarizing the changes
- Click Publish release
Publishing the release creates the Git tag, which:
- Triggers the frontend release workflow for frontend-v* tags
- Builds the Docker image
- Pushes versioned tags (:X.Y.Z, :sha-<short>, :latest) to the registry
- Does not modify the :dev image
Step 2b: Create GitHub Release (Backend)
Follow the same steps as above, but use a backend-scoped tag:
backend-vX.Y.Z
This triggers the backend release workflow and pushes backend-specific versioned tags.
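The same release can be published from a terminal with the GitHub CLI. This is a sketch under assumptions: `gh` is installed and authenticated, the function name `publish_component_release` is hypothetical, the backend title mirrors the frontend example, and `--generate-notes` stands in for hand-written release notes.

```shell
# Sketch: publish a component release from the command line with the GitHub CLI.
# Assumes `gh` is installed and authenticated; the function name is hypothetical.
publish_component_release() (
  component="$1"   # "backend" or "frontend"
  version="$2"     # "X.Y.Z"

  case "$component" in
    backend)  title="Backend v$version" ;;
    frontend) title="Frontend v$version" ;;
    *) echo "component must be backend or frontend" >&2; exit 1 ;;
  esac

  # Publishing (not drafting) the release creates the tag that triggers CI.
  gh release create "${component}-v${version}" \
    --repo E3SM-Project/simboard \
    --target main \
    --title "$title" \
    --generate-notes
)
```

Replace `--generate-notes` with `--notes "..."` or `--notes-file <path>` to supply a written summary of the changes.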
Step 3: Monitor Builds
Check the Actions tab and confirm that only the workflow matching the component tag triggers.
Step 4: Deploy to Production
Update the image tags in the Rancher UI:
- Navigate to Workloads → Deployments in the prod namespace
- Click the target deployment → ⋮ → Edit Config
- Update the Image field to the new versioned image, for example:
  - Backend: registry.nersc.gov/e3sm/simboard/backend:X.Y.Z
  - Frontend: registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z
- Set Pull Policy to IfNotPresent
- Click Save; Rancher then rolls out the new version
For backend releases, migrations run automatically in a backend initContainer during rollout. See Database Migrations.
Step 5: Verify Production
- In Rancher, check that pods are Running under Workloads → Pods in the prod namespace
- Review pod logs via the ⋮ → View Logs action in Rancher
- Test endpoints:
  - https://simboard-api.e3sm.org/api/v1/health
  - https://simboard.e3sm.org/health
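The endpoint check can be scripted so that a rollout that is still starting gets a few retries. This is a minimal sketch: the function name `wait_healthy` and the default retry count and delay are illustrative assumptions, not project policy.

```shell
# Sketch: poll a health endpoint until curl reports a non-error response.
# The retry count and delay are illustrative defaults, not project policy.
wait_healthy() (
  url="$1"
  attempts="${2:-10}"
  delay="${3:-6}"

  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP status codes of 400 and above.
    if curl --fail --silent --show-error --max-time 10 --output /dev/null "$url"; then
      echo "healthy: $url"
      exit 0
    fi
    echo "attempt $i/$attempts failed for $url" >&2
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  exit 1
)
```

Run `wait_healthy https://simboard-api.e3sm.org/api/v1/health` and `wait_healthy https://simboard.e3sm.org/health` after the rollout; a non-zero exit means the endpoint never answered successfully.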
Database Migrations
Database migrations are executed by a backend Deployment initContainer during rollout, not on backend app startup.
Runtime Behavior
- The backend container starts the API directly and does not run migrations at startup.
- The initContainer runs before the backend container starts and executes /app/migrate.sh.
- migrate.sh validates DATABASE_URL, waits for DB readiness, then runs alembic upgrade head by default.
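The three steps attributed to migrate.sh can be illustrated as follows. This is a hypothetical sketch, not the contents of /app/migrate.sh: it assumes `pg_isready` (from the PostgreSQL client tools) and `alembic` are on PATH, that DATABASE_URL is a libpq-style URI that `pg_isready` accepts, and the retry limits are invented for illustration.

```shell
# Hypothetical sketch of the validate / wait / upgrade sequence.
# This is NOT the project's actual migrate.sh.
run_migrations() (
  # Step 1: fail fast when DATABASE_URL is unset or empty.
  if [ -z "${DATABASE_URL:-}" ]; then
    echo "DATABASE_URL is not set" >&2
    exit 1
  fi

  # Step 2: wait about a minute for the database to accept connections.
  tries=0
  until pg_isready --dbname="$DATABASE_URL" --quiet; do
    tries=$((tries + 1))
    if [ "$tries" -ge 30 ]; then
      echo "database not ready after $tries checks" >&2
      exit 1
    fi
    sleep 2
  done

  # Step 3: apply all pending revisions.
  alembic upgrade head
)
```

Because the function exits non-zero on any failed step, an initContainer built around this pattern would block the backend container from starting when a migration fails.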
Spin Workloads
Reference runbook: docs/deploy/nersc-spin-runbook.md
- Backend service/deployment baseline is defined for in-cluster API routing (backend on 8000).
- Backend Deployment uses the image entrypoint directly (no app args required).
- Backend Deployment includes initContainer migrate using the same backend image tag to run Alembic before app start.
- Frontend service/deployment baseline is defined for UI routing (frontend on 80).
- Frontend Deployment uses the frontend image default CMD (no explicit args).
- DB service/deployment baseline is defined for in-cluster Postgres (db).
- Ingress baseline (lb) terminates TLS via simboard-tls-cert and routes frontend/backend hosts.
- Backend and migration initContainer env values are sourced via envFrom from secret simboard-backend-env.
- DB container env values are sourced via envFrom from secret simboard-db.
Deployment Order (Required)
- Roll out backend deployment with the target image tag.
- Wait for initContainer migration step to succeed.
- Confirm backend pods become Running and Ready.
If initContainer migration fails, backend pods will not become ready and rollout should be treated as failed.
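The order above can also be followed from a terminal instead of the Rancher UI. This sketch assumes `kubectl` is configured for the Spin cluster; the deployment name `backend` and initContainer name `migrate` follow the Spin Workloads list, while the function name, the namespace argument, and the timeout are placeholders.

```shell
# Sketch: follow a backend rollout and surface migration logs on failure.
# Assumes kubectl access to the cluster; the namespace is supplied by the caller.
watch_backend_rollout() (
  namespace="$1"

  # Blocks until the rollout completes, fails, or the timeout expires.
  if kubectl rollout status deployment/backend \
       --namespace "$namespace" --timeout=300s; then
    exit 0
  fi

  # On failure, print the migration initContainer logs for diagnosis.
  echo "rollout failed; migrate initContainer logs follow" >&2
  kubectl logs deployment/backend --container migrate --namespace "$namespace" >&2
  exit 1
)
```

Call it with the prod namespace name, for example `watch_backend_rollout <prod-namespace>`, right after saving the new image tag.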
Concurrency Note
InitContainers run per pod. If more than one backend pod is created simultaneously, migrations may execute concurrently.
Use an explicit rollout strategy that guarantees only one new pod (and therefore one migration initContainer) is created at a time:
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
Why this is required: with default RollingUpdate settings, Kubernetes may create a surge pod during updates, which can run a second migration initContainer even when the steady-state replica count is 1.
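If you need replicas > 1, use a DB-level migration lock so only one initContainer can run Alembic at a time. For PostgreSQL, wrap migration execution in a session-level advisory lock (for example, SELECT pg_advisory_lock(<fixed_key>); ... alembic upgrade head ...; SELECT pg_advisory_unlock(<fixed_key>);). The lock and unlock calls must run on the same database session, and that session has to stay open while Alembic runs; a lock taken on a separate short-lived connection is released as soon as that connection closes. The transaction-scoped variant, pg_advisory_xact_lock, has no explicit unlock and is released automatically when the transaction ends.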
Production-safe recommendation: apply both controls (serialized rollout strategy plus DB-level lock) for defense in depth.
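One way to hold a session-level advisory lock (`pg_advisory_lock`) for the whole migration from a shell script is to run Alembic from inside the `psql` session that took the lock. This is a sketch, not the project's migrate.sh: the lock key `727001`, the marker-file path, and the function name are hypothetical, and it assumes `psql` and `alembic` are on PATH and that DATABASE_URL is a URI `psql` accepts.

```shell
# Sketch: serialize Alembic runs with a session-level PostgreSQL advisory lock.
# The lock key and marker-file path are arbitrary, hypothetical values.
migrate_with_lock() (
  if [ -z "${DATABASE_URL:-}" ]; then
    echo "DATABASE_URL is not set" >&2
    exit 1
  fi

  # psql does not treat a non-zero exit from a "\!" command as a script error,
  # so Alembic success is recorded in a marker file that is checked afterwards.
  MIGRATION_OK_MARKER="${TMPDIR:-/tmp}/simboard-migrate-ok.$$"
  export MIGRATION_OK_MARKER
  rm -f "$MIGRATION_OK_MARKER"

  # The lock is held by this one psql session; "\!" runs Alembic while the
  # session is still open. A second initContainer blocks on pg_advisory_lock
  # until the first session unlocks or disconnects.
  psql "$DATABASE_URL" --set ON_ERROR_STOP=1 --quiet <<'SQL'
SELECT pg_advisory_lock(727001);
\! alembic upgrade head && touch "$MIGRATION_OK_MARKER"
SELECT pg_advisory_unlock(727001);
SQL

  if [ -f "$MIGRATION_OK_MARKER" ]; then
    rm -f "$MIGRATION_OK_MARKER"
    exit 0
  fi
  echo "alembic upgrade head did not complete" >&2
  exit 1
)
```

A pod that acquires the lock second finds the schema already at head, so its `alembic upgrade head` is a no-op. Taking the lock inside Alembic's own connection is an alternative design that avoids the marker file.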
Rollback Caveat
Rolling back the backend container image does not roll back database schema automatically. Use backward-compatible migrations (expand/contract pattern), and use a separate, explicit rollback migration only when needed.
Rollback Procedure
Version-tagged images are immutable — once published, a version tag (for example, :X.Y.Z) always refers to the same image. This makes rollbacks safe and predictable.
Rolling Back via Rancher
- Open the Rancher UI
- Navigate to Workloads → Deployments in the prod namespace
- Click the deployment to roll back → ⋮ → Edit Config
- Change the Image tag to the previous known-good version, for example:
  - registry.nersc.gov/e3sm/simboard/backend:X.Y.Z
  - registry.nersc.gov/e3sm/simboard/frontend:X.Y.Z
- Click Save to trigger the rollout
Alternatively, use the built-in Rancher rollback:
- Navigate to the deployment → ⋮ → Rollback
- Select the previous revision and confirm
Key Rollback Principles
- Version tags are immutable: A published :X.Y.Z tag always points to the same image digest. You can safely redeploy any previously released version.
- Components are independent: Rolling back the frontend does not require rolling back the backend, and vice versa.
- :dev is unaffected: Release rollbacks have no impact on the dev environment.
- Use commit-based tags for precision: If you need to deploy a specific build, use the :sha-<short> tag from the GitHub Actions build log.
Manual Builds
For testing or emergency builds, you can manually build and push images using Docker Buildx. This is not recommended for regular use, as it bypasses CI checks and versioning conventions.
First, log in to the NERSC registry:
docker login registry.nersc.gov
Backend
cd backend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg ENV=production \
  -t registry.nersc.gov/e3sm/simboard/backend:dev-manual \
  --push \
  .
Frontend (with API URL override)
# Development (run from the repository root)
cd frontend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg VITE_API_BASE_URL=https://simboard-dev-api.e3sm.org \
  -t registry.nersc.gov/e3sm/simboard/frontend:dev-manual \
  --push \
  .
# Production (run from the repository root)
cd frontend
docker buildx build \
  --platform=linux/amd64,linux/arm64 \
  --build-arg VITE_API_BASE_URL=https://simboard-api.e3sm.org \
  -t registry.nersc.gov/e3sm/simboard/frontend:prod-manual \
  --push \
  .
Troubleshooting
Authentication Failures
Issue: denied: requested access to the resource is denied
Solutions:
- Verify GitHub Secrets are configured
- Test credentials: docker login registry.nersc.gov
- Check that the NERSC account has push permissions to the e3sm/simboard/ namespace
Build Failures
Issue: Workflow fails during build
Solutions:
- Check workflow logs in Actions tab
- Test Dockerfile locally:
# Each command runs in a subshell so both can be run from the repository root
(cd backend && docker build .)
(cd frontend && docker build --build-arg VITE_API_BASE_URL=https://example.com .)
- Verify all dependencies are pinned
Dev Image Not Updating
Issue: NERSC Spin not pulling latest :dev
Solutions:
- Verify image was built (check GitHub Actions)
- In Rancher, redeploy the workload: Workloads → Deployments → ⋮ → Redeploy
- Check that Pull Policy is set to Always for :dev tags
Wrong API URL in Frontend
Issue: Frontend connecting to wrong backend
Solutions:
- Check VITE_API_BASE_URL in the workflow file
- Rebuild with manual dispatch and the correct URL
- Verify environment-specific URLs:
  - Dev: https://simboard-dev-api.e3sm.org
  - Prod: https://simboard-api.e3sm.org
Workflow Not Triggering
Issue: Push to main doesn't trigger build
Solutions:
- Verify changes are in watched paths:
  - Backend: backend/**
  - Frontend: frontend/**
- Check workflow files exist and are on the main branch
- Verify Actions are enabled in repository settings
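To check which dev workflows a push should have triggered, the changed files can be matched against the watched paths. This is a sketch that assumes the path filters are exactly backend/** and frontend/** as listed above; the function name `components_for_paths` is hypothetical.

```shell
# Sketch: report which components a set of changed files belongs to,
# assuming the workflow path filters are exactly backend/** and frontend/**.
components_for_paths() {
  # Reads one path per line on stdin; prints each matching component once.
  while IFS= read -r path; do
    case "$path" in
      backend/*)  echo backend ;;
      frontend/*) echo frontend ;;
    esac
  done | sort -u
}
```

For the most recent commit on main, run `git diff --name-only HEAD~1 HEAD | components_for_paths`. Empty output means no watched path changed, so no dev build is expected.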
Issue: Release tag doesn't trigger prod build
Solutions:
- Verify the tag follows the component convention:
  - Backend: backend-vX.Y.Z
  - Frontend: frontend-vX.Y.Z
- Ensure the tag was created via a published GitHub Release (draft releases do not create tags)
- Check the Actions tab for the corresponding workflow
Additional Resources
- NERSC Container Registry Docs
- NERSC Spin Docs
- GitHub Actions Docs
- Docker Buildx Docs
- Semantic Versioning
Support
- GitHub Issues: Open an issue
- Workflow Logs: Actions tab