Runbook: Helm Upgrade
When to use this
Use this runbook when upgrading the Alexandria EE Helm release to a new chart version or image tag. It covers the pre-upgrade snapshot, the upgrade command, rollout verification, and the rollback path.
Pre-checks
- Confirm the target image tag exists in Artifact Registry before upgrading.
- Record the current release revision:
  helm history alexandria-ee -n <namespace> --max 5
- Verify pods are currently healthy:
  kubectl get pods -n <namespace> -l app.kubernetes.io/instance=alexandria-ee
- Check that the Postgres DSN secret is present and reachable from the cluster.
- If upgrading across a minor version that touches the DB schema, read the release notes for migration notes: api-go/internal/db/migrations.go::ApplySchema runs automatically on startup, but destructive migrations require manual intervention.
- Take a Postgres snapshot before proceeding (see backup-restore.md).
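The pod health pre-check above can be scripted so that an upgrade is not started against a release that is already degraded. The following is a minimal sketch, not part of the chart: the function name pods_all_ready is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Sketch: fail fast if any pod of the release is not fully ready.
# RELEASE matches the runbook; the function name is hypothetical.

RELEASE="alexandria-ee"

# pods_all_ready NAMESPACE
# Returns 0 only when at least one pod is listed and every listed pod
# is Running with all of its containers ready.
pods_all_ready() {
  local ns="$1"
  kubectl get pods -n "$ns" \
    -l "app.kubernetes.io/instance=${RELEASE}" --no-headers |
    awk '
      {
        split($2, ready, "/")   # READY column, e.g. "2/2"
        if ($3 != "Running" || ready[1] != ready[2]) {
          print "not ready: " $1 " (" $2 ", " $3 ")"
          bad = 1
        }
        seen = 1
      }
      # Treat "no pods found" as a failure as well.
      END { exit ((bad || !seen) ? 1 : 0) }
    '
}
```

A typical call would be pods_all_ready "<namespace>" followed by a check of the exit status. Pods that belong to completed Jobs would be reported as not ready by this sketch, so adjust the label selector if the release runs any.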
Procedure
- Pull the new chart version or update your local chart directory.
- Identify the tier values file for your deployment (values-starter.yaml, values-professional.yaml, or values-enterprise.yaml).
- Pin the image tags in your site-specific values file. Never rely on "" (latest) in production:
  image:
    api:
      tag: "v1.2.3"
    orchestrator:
      tag: "v1.2.3"
    dashboard:
      tag: "v1.2.3"
- Run the upgrade with --atomic so Helm rolls back automatically if the upgrade fails or times out:
  helm upgrade alexandria-ee k8s/helm/alexandria-ee/ \
    -n <namespace> \
    -f k8s/helm/alexandria-ee/values.yaml \
    -f k8s/helm/alexandria-ee/values-<tier>.yaml \
    -f /path/to/site-values.yaml \
    --atomic \
    --timeout 10m
- Watch pod rollout in a second terminal:
  kubectl rollout status deployment/alexandria-ee -n <namespace> --timeout=10m
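The rule about pinned image tags is easy to violate by accident, so it may help to check the site values file before running the upgrade. The sketch below is an illustration only: tags_pinned is a hypothetical helper, it does a line-based scan rather than real YAML parsing, and treating the literal value latest as unpinned is an assumption added here on top of the empty-string case the runbook names.

```shell
#!/usr/bin/env bash
# Sketch: refuse to proceed if a values file leaves any image tag unpinned.

# tags_pinned FILE
# Returns 0 when every "tag:" key in FILE has a non-empty value other
# than "latest"; prints the offending lines and returns 1 otherwise.
tags_pinned() {
  local file="$1"
  awk '
    /^[[:space:]]*tag:/ {
      value = $0
      sub(/^[[:space:]]*tag:[[:space:]]*/, "", value)  # drop the key
      sub(/[[:space:]]*#.*$/, "", value)               # drop a trailing comment
      gsub(/["\047]/, "", value)                       # drop double/single quotes
      sub(/[[:space:]]+$/, "", value)                  # drop trailing spaces
      if (value == "" || value == "latest") {
        printf "%s:%d: unpinned tag\n", FILENAME, NR
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  ' "$file"
}
```

For example, tags_pinned /path/to/site-values.yaml could gate the helm upgrade command in a wrapper script, aborting when it returns a non-zero status.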
Verification
- Health endpoint: curl -sf https://<host>/ready should return 200 OK.
- Setup screen (first install only): curl -sf https://<host>/auth/setup returns 200 if no admin exists yet.
- License endpoint: curl -sf https://<host>/license. Confirm that the tier, current_seats, and expires_at fields look correct.
- Check pod logs for startup errors:
  kubectl logs -n <namespace> -l app.kubernetes.io/instance=alexandria-ee -c api --tail=50
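Right after a rollout the health endpoint may need a few moments before it answers, so a single curl can fail spuriously. A small polling loop is one way to handle that. This is a sketch under assumptions: wait_ready is a hypothetical helper, and the default of 30 attempts with a 10 second delay is an arbitrary choice, not a documented value.

```shell
#!/usr/bin/env bash
# Sketch: poll an endpoint until curl reports success or attempts run out.

# wait_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as `curl -sf` succeeds (with -f, curl exits non-zero
# on HTTP errors); returns 1 after ATTEMPTS failed tries.
wait_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i
  for ((i = 1; i <= attempts; i++)); do
    if curl -sf -o /dev/null "$url"; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "not ready after ${attempts} attempt(s): ${url}" >&2
  return 1
}
```

A call such as wait_ready "https://<host>/ready" 30 10 would wait up to roughly five minutes before giving up. The same helper can be pointed at /license after a rollback.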
Rollback
If the upgrade fails and --atomic did not trigger (e.g., you omitted it):
# See available revisions
helm history alexandria-ee -n <namespace>
# Roll back to the previous revision
helm rollback alexandria-ee <revision> -n <namespace> --timeout 5m
After rollback, re-verify the /ready and /license endpoints. If the schema migration ran and is not reversible, restore from the pre-upgrade Postgres snapshot (see backup-restore.md).
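Picking the value for <revision> by eye from helm history is error-prone during an incident. The sketch below shows one way to derive it. It assumes that helm history prints a header row followed by revisions in ascending order, and that the second-to-last row is the revision you want; previous_revision is a hypothetical helper, and you should still confirm the status of that revision before rolling back to it.

```shell
#!/usr/bin/env bash
# Sketch: print the revision just before the latest one in `helm history`.

RELEASE="alexandria-ee"

# previous_revision NAMESPACE
# Prints the revision number from the second-to-last history row.
# Returns 1 when the release has fewer than two revisions.
previous_revision() {
  local ns="$1"
  helm history "$RELEASE" -n "$ns" |
    awk '
      NR > 1 { prev = cur; cur = $1 }   # skip the header, track last two rows
      END {
        if (prev == "") exit 1
        print prev
      }
    '
}
```

The result can then be passed to the rollback command, for example helm rollback alexandria-ee "$(previous_revision <namespace>)" -n <namespace> --timeout 5m.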