
Kubernetes Deployment: Helm Chart

#kubernetes #k8s #helm #deployment

Overview

k8s/ contains the Kubernetes manifests and the Helm chart for deploying Alexandria EE in production.

Purpose: Container orchestration, multi-replica scaling, federation support, observability integration, storage management.

Approach:

  • Helm chart (primary) — helm/alexandria-ee/ with values-driven templating
  • Kustomize manifests (alternative) — manifests/ for GitOps workflows
  • Pod structure — Three co-located containers in one pod (tight coupling for IPC)

Pod Architecture

Containers in One Pod

The deployment runs three containers in a single pod:

  1. api — Go REST API daemon (port 8080)
  2. orchestrator — CE Rust orchestrator (Unix socket IPC)
  3. dashboard — React SPA via nginx (port 80, proxies API)

Why co-locate?

  • Unix domain socket IPC between API and orchestrator (no network overhead)
  • Shared PVC for SQLite, JWT secret, models, knowledge store
  • Simplified networking (single ClusterIP service)
  • Logs for all three containers available from a single pod

IPC and Shared Storage

Pod: alexandria-0
├── Container: api (port 8080)
│   ├── Listen 0.0.0.0:8080
│   ├── Read /var/lib/alexandria/data/alexandria.db (SQLite)
│   └── Connect to /var/run/alexandria/orchestrator.sock (Unix socket)
│
├── Container: orchestrator
│   ├── Listen /var/run/alexandria/orchestrator.sock (gRPC)
│   └── Read/write /var/lib/alexandria/ (stores, agents, models, etc.)
│
├── Container: dashboard (nginx)
│   ├── Listen 0.0.0.0:80
│   └── Proxy /v1/* /admin/* /auth/* to http://localhost:8080
│
└── Volumes
    ├── data (PVC) → /var/lib/alexandria (persistent)
    ├── run (emptyDir) → /var/run/alexandria (sockets, tempfiles)
    └── etc-alexandria (emptyDir) → /etc/alexandria (config)

Design choice: Tight coupling for performance. If scaling is needed, split into separate pods with network-based gRPC (requires Postgres for shared state).


Helm Chart Structure

helm/alexandria-ee/
├── Chart.yaml               Chart metadata (name, version, appVersion)
├── values.yaml              Default values (replicas, images, config, auth, federation, etc.)
├── values.schema.json       JSON schema for values validation
└── templates/
    ├── _helpers.tpl         Shared template functions
    ├── deployment.yaml      Pod/Deployment spec (3 containers)
    ├── service.yaml         ClusterIP service
    ├── ingress.yaml         Ingress (optional)
    ├── configmap.yaml       Alexandria TOML config
    ├── secret.yaml          JWT secret + admin password
    ├── serviceaccount.yaml  RBAC ServiceAccount
    ├── pvc.yaml             PersistentVolumeClaim
    ├── hpa.yaml             HorizontalPodAutoscaler
    ├── pdb.yaml             PodDisruptionBudget (optional)
    ├── networkpolicy.yaml   NetworkPolicy (optional)
    ├── saml-cert-job.yaml   Job to generate SAML SP keypair
    ├── servicemonitor.yaml  Prometheus ServiceMonitor
    ├── rbac.yaml            ClusterRole, ClusterRoleBinding
    ├── nginx-configmap.yaml nginx config for dashboard
    └── NOTES.txt            Post-install instructions

Key Configuration Sections

Replica Count & Scaling

replicaCount: 1

autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80

Default: 1 replica (single instance, suitable for dev/small deployments).

Scaling up: Enable HPA and switch to the Postgres backend (SQLite allows only a single writer, so it cannot back multiple replicas).

Container Images

image:
  api:
    repository: ghcr.io/alexandriaproject/ee-api
    tag: "" # Pin to v1.0.0 in production
    pullPolicy: IfNotPresent
  orchestrator:
    repository: ghcr.io/alexandriaproject/ce
    tag: ""
    pullPolicy: IfNotPresent
  dashboard:
    repository: ghcr.io/alexandriaproject/ee-dashboard
    tag: ""
    pullPolicy: IfNotPresent

Tag strategy:

  • Dev: "" (uses latest)
  • Staging: main or specific commit hash
  • Production: Pin to release tag (e.g., v1.0.0)
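
Pinning means setting three separate tag values. The sketch below derives the matching --set flags from one release tag. It assumes all three images are published under the same tag, which the chart may not guarantee; image_tag_flags is a hypothetical helper and not part of the chart.

```shell
# Hypothetical helper: print --set flags that pin api, orchestrator, and
# dashboard to one tag. Assumes the three images share a release tag.
image_tag_flags() {
  local tag="$1"
  if [ -z "$tag" ]; then
    echo "usage: image_tag_flags <tag>" >&2
    return 1
  fi
  local name
  for name in api orchestrator dashboard; do
    # "--" stops printf from parsing the format string as an option.
    printf -- '--set image.%s.tag=%s ' "$name" "$tag"
  done
  printf '\n'
}
```

Usage, relying on word splitting of the unquoted substitution: helm upgrade --install alexandria k8s/helm/alexandria-ee/ --namespace alexandria $(image_tag_flags v1.0.0)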

Persistence

persistence:
  enabled: true
  storageClass: "" # "" = cluster default
  size: 10Gi
  accessMode: ReadWriteOnce
  existingClaim: ""

Mounts:

  • data PVC → /var/lib/alexandria (SQLite DB, JWT secret, models, knowledge store)
  • run emptyDir → /var/run/alexandria (Unix sockets, temp files, cleaned up on pod restart)
  • etc-alexandria emptyDir → /etc/alexandria (config, writable copy from ConfigMap)

Design: ReadWriteOnce (the volume is mounted by a single pod). For multi-replica deployments, switch to Postgres and a ReadWriteMany volume.

Alexandria Configuration

config:
  general:
    logLevel: info
    devMode: false

  api:
    bind: "0.0.0.0"
    port: 8080
    impl: go # REQUIRED for EE

  webauthn:
    rpId: "" # MUST override for production
    rpOrigin: "" # MUST override for production
    rpName: "Alexandria"

  database:
    url: "" # Postgres DSN (optional, SQLite is default)

  security:
    enabled: true
    csp: ""
    hstsMaxAge: 0 # Set to 31536000 (1 year) for TLS
    hstsIncludeSubdomains: false

  lockout:
    enabled: true
    maxAttempts: 10
    windowSeconds: 300
    lockoutSeconds: 900

All values are templated into a ConfigMap and mounted as /etc/alexandria/alexandria.conf.

Authentication Secrets

auth:
  jwtSecret: "" # Auto-generated if blank (NOT recommended for prod)
  adminPassword: "changeme" # Change in production!
  existingSecret: "" # Use pre-existing Secret instead of creating one

Production pattern:

  1. Generate JWT secret: openssl rand -base64 32
  2. Create Secret: kubectl create secret generic alexandria-auth --from-literal=jwt-secret=<secret> --from-literal=admin-password=<password> -n alexandria
  3. Set auth.existingSecret: alexandria-auth
  4. The Helm chart exposes the Secret's values to the containers as environment variables
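
The first two steps can be scripted so the generated secret never has to be pasted by hand. This is a minimal bash sketch, not the chart's own tooling: gen_jwt_secret and create_auth_secret are hypothetical helper names, the key names jwt-secret and admin-password are taken from the Pre-Deployment commands, and the /dev/urandom fallback is an assumption for hosts without openssl.

```shell
# Generate 32 random bytes, base64-encoded (44 characters).
gen_jwt_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 32
  else
    # Assumed fallback for hosts without openssl.
    head -c 32 /dev/urandom | base64
  fi
}

# Hypothetical wrapper: create the Secret referenced by auth.existingSecret.
# The admin password is read from ADMIN_PASSWORD so it is not hard-coded here.
create_auth_secret() {
  if [ -z "${ADMIN_PASSWORD:-}" ]; then
    echo "set ADMIN_PASSWORD before calling create_auth_secret" >&2
    return 1
  fi
  kubectl create secret generic alexandria-auth \
    --namespace alexandria \
    --from-literal=jwt-secret="$(gen_jwt_secret)" \
    --from-literal=admin-password="$ADMIN_PASSWORD"
}
```

Call create_auth_secret once per cluster, then set auth.existingSecret to alexandria-auth in your values file.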

Federation (OIDC/SAML)

federation:
  enabled: false

  providers: []
  # - name: okta
  #   kind: oidc
  #   issuer: https://your-org.okta.com
  #   client_id: 0oaXXXX
  #   redirect_uri: https://alexandria.example.com/auth/oidc/okta/callback
  #   auto_create_users: true
  #   default_role: user

OIDC client secrets: Supplied via Kubernetes Secret:

kubectl create secret generic alexandria-oidc \
--from-literal=oidc-okta-client-secret=<secret> \
--from-literal=oidc-azure-client-secret=<secret>

SAML SP key pair: Auto-generated on first boot (persisted on PVC) or supplied via Secret:

kubectl create secret tls alexandria-saml \
--cert=saml.crt \
--key=saml.key

LLM Backend Registration

llmBackends: []
# - name: claude
#   url: https://api.anthropic.com
#   kind: claude
#   model: claude-sonnet-4-6
#   apiKeyEnvVar: ANTHROPIC_API_KEY

An init container runs alexandria llm add for each backend before the orchestrator starts.

Resource Requests/Limits

resources:
  api:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
  orchestrator:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 2000m
      memory: 2Gi
  dashboard:
    requests:
      cpu: 10m
      memory: 16Mi
    limits:
      cpu: 100m
      memory: 64Mi

Tuning:

  • API: Light (handles HTTP routing)
  • Orchestrator: Heavy (runs LLM queries, manages agents)
  • Dashboard: Minimal (static SPA served by nginx)

Deployment Lifecycle

Pre-Deployment

  1. Create namespace: kubectl create ns alexandria
  2. Create secrets (if not using auto-generated):
    kubectl create secret generic alexandria-auth \
    --from-literal=jwt-secret=<secret> \
    --from-literal=admin-password=<password> \
    -n alexandria
  3. Create OIDC secrets (if federation enabled):
    kubectl create secret generic alexandria-oidc \
    --from-literal=oidc-okta-client-secret=<secret> \
    -n alexandria

Helm Install

The chart is not published to a public Helm repository. Install from a local clone of this repository or from your internal Helm chart registry if you have pushed the chart there.

From a local clone:

git clone <your-internal-mirror-of-alexandria-ee> alexandria-ee-src
cd alexandria-ee-src

# Dry-run first to verify rendering
helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
--namespace alexandria \
--create-namespace \
--values values-prod.yaml \
--dry-run --debug

# Apply
helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
--namespace alexandria \
--create-namespace \
--values values-prod.yaml

From an internal OCI / chart registry (if your organisation pushes the chart):

# Push the chart to your registry (do this once per release):
helm package k8s/helm/alexandria-ee/
helm push alexandria-ee-0.1.0.tgz oci://registry.example.com/charts/

# Install from the registry:
helm upgrade --install alexandria \
oci://registry.example.com/charts/alexandria-ee \
--version 0.1.0 \
--namespace alexandria \
--create-namespace \
--values values-prod.yaml

Or with inline overrides:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
--namespace alexandria \
--create-namespace \
--set image.api.tag=v0.2.2 \
--set image.orchestrator.tag=v0.2.2 \
--set image.dashboard.tag=v0.2.2 \
--set config.webauthn.rpId=example.com \
--set config.webauthn.rpOrigin=https://example.com \
--set auth.existingSecret=alexandria-auth \
--set persistence.size=20Gi

Init Containers

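The deployment includes init containers that run to completion before the application containers start (saml-cert-job, listed last, is a separate pre-install Job rather than an init container):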

  1. config-init — Copy read-only ConfigMap to writable emptyDir
  2. llm-init (optional) — Register LLM backends via alexandria llm add
  3. saml-cert-job (optional, pre-install) — Generate SAML SP keypair

Startup Sequence

  1. Init containers run
  2. API container starts (listens on 8080)
  3. Orchestrator container starts (creates Unix socket at /var/run/alexandria/orchestrator.sock)
  4. Dashboard (nginx) container starts (proxies to API on localhost:8080)
  5. Readiness probe: GET /ready on API succeeds
  6. Pod is ready to serve traffic
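
After an install or upgrade you can block until this sequence has finished instead of polling by hand. The bash sketch below assumes the Deployment is named alexandria, as in the log commands in this document. wait_for_alexandria and the KUBECTL override are hypothetical conveniences.

```shell
# Set KUBECTL=echo to preview the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: wait for the rollout, then list the pods.
wait_for_alexandria() {
  local ns="${1:-alexandria}" timeout="${2:-120s}"
  # Returns non-zero if the pods do not become ready within the timeout.
  $KUBECTL rollout status deployment/alexandria \
    --namespace "$ns" --timeout="$timeout" || return 1
  # READY should show 3/3 once api, orchestrator, and dashboard are up.
  $KUBECTL get pods --namespace "$ns"
}
```

For example, wait_for_alexandria alexandria 180s waits up to three minutes in the alexandria namespace.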

Shutdown Sequence

On pod deletion:

  1. SIGTERM sent to all containers
  2. Grace period begins (30 seconds by default, configurable via terminationGracePeriodSeconds)
  3. In-flight requests drain
  4. Containers exit (any container still running when the grace period ends receives SIGKILL)
  5. Volumes unmounted

Service & Networking

Service

service:
  type: ClusterIP # Internal only (use Ingress for external)
  port: 80
  targetPort: 80 # Points to dashboard (nginx on port 80)

Ports exposed:

  • Port 80 (service) → nginx container port 80 → proxies to API port 8080

Access within cluster: http://alexandria:80 or http://alexandria.alexandria.svc.cluster.local

Ingress

ingress:
  enabled: false
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: alexandria.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: alexandria-tls
      hosts:
        - alexandria.example.com

To enable:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
--namespace alexandria \
--reuse-values \
--set ingress.enabled=true \
--set 'ingress.hosts[0].host=alexandria.example.com'

Network Policy

networkPolicy:
  enabled: false
  additionalIngress: []
  additionalEgress: []

When enabled, only allows:

  • Ingress from ingress controller (or specified pods)
  • Egress to DNS (53/UDP) + any specified destinations

Health Checks

Liveness Probe

livenessProbe:
  httpGet:
    path: /live # API is bound to listener
    port: api
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3

Purpose: Restart an unresponsive API container. If the probe fails 3 consecutive times (3 × 15 s = 45 s), the kubelet restarts the container.

Readiness Probe

readinessProbe:
  httpGet:
    path: /ready # API is ready to serve
    port: api
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3

Purpose: Remove pod from service endpoints until ready. Pod must pass readiness before traffic is sent.
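
When tuning periodSeconds and failureThreshold, it helps to see how long a container can keep failing before Kubernetes reacts. The arithmetic below is an approximation that ignores timeoutSeconds and probe scheduling jitter; probe_failure_window is a hypothetical helper.

```shell
# Approximate seconds of continuous failure before the kubelet acts:
# failureThreshold consecutive failed probes, one every periodSeconds.
probe_failure_window() {
  local period_seconds="$1" failure_threshold="$2"
  echo $(( period_seconds * failure_threshold ))
}

probe_failure_window 15 3   # liveness defaults: prints 45
probe_failure_window 10 3   # readiness defaults: prints 30
```

With the defaults shown here, an unready pod leaves the Service endpoints after about 30 seconds, and a hung API container is restarted after about 45 seconds.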

Dashboard Probe

nginx container has inline probes on port 80.


Observability

OpenTelemetry

otel:
  enabled: false
  endpoint: "http://otel-collector:4318"

When enabled, the API and orchestrator send traces and metrics over OTLP/HTTP to the collector at otel.endpoint.

To enable:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
--namespace alexandria \
--reuse-values \
--set otel.enabled=true \
--set otel.endpoint=http://otel-collector:4318

Prometheus Metrics

metrics:
  enabled: false
  path: /metrics
  serviceMonitor:
    enabled: false
    interval: "30s"

When enabled, exposes Prometheus metrics at /metrics. ServiceMonitor allows Prometheus Operator to scrape automatically.

Logs

Logs are written to stdout (JSON format) and captured by Kubernetes.

View logs:

kubectl logs -f deployment/alexandria -n alexandria -c api
kubectl logs -f deployment/alexandria -n alexandria -c orchestrator

Advanced Configuration

Horizontal Pod Autoscaler

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

Requirements:

  • Metrics server installed (kubectl apply -f https://...)
  • Postgres backend (SQLite won't work with multiple replicas)
  • Persistent volume with ReadWriteMany (shared storage such as NFS or CephFS; block storage such as AWS EBS is typically ReadWriteOnce and will not work)

Behavior: Scales up/down based on CPU usage. Orchestrator is the heavy container; API is I/O-bound.
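
The three requirements can be collected into one override file layered on top of your production values. This is a sketch using only keys documented on this page; the file name values-scale.yaml and the Postgres DSN are placeholders, and in practice the DSN would come from a secret rather than a plain-text file.

```shell
# Write a hypothetical override file for multi-replica operation.
# The heredoc delimiter is quoted so nothing inside it is shell-expanded.
cat > values-scale.yaml <<'EOF'
replicaCount: 2

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

config:
  database:
    # Placeholder DSN: replace user, password, and host with your own.
    url: "postgres://USER:PASSWORD@postgres.example.com:5432/alexandria"

persistence:
  accessMode: ReadWriteMany
EOF
```

Pass it after the main values file (--values values-prod.yaml --values values-scale.yaml). Helm gives precedence to the rightmost file, so these settings win.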

Pod Disruption Budget

pdb:
  enabled: true
  minAvailable: 1

Ensures at least 1 pod stays running during voluntary disruptions (node drains, upgrades). With a single replica, minAvailable: 1 blocks node drains, so pair the PDB with two or more replicas.

External Secrets Operator

externalSecrets:
  enabled: false
  secretStoreName: "vault"
  refreshInterval: "1h"
  adminPasswordKey: "alexandria/admin-password"
  jwtSecretKey: "alexandria/jwt-secret"

Syncs JWT secret and admin password from Vault (or AWS Secrets Manager, Azure Key Vault, etc.).


Troubleshooting

Pod won't start

  1. Check events: kubectl describe pod -n alexandria <pod-name>
  2. Check init container logs: kubectl logs -n alexandria <pod-name> -c config-init
  3. Check data volume: kubectl get pvc -n alexandria

API not responding

  1. Check readiness: kubectl get pods -n alexandria
  2. Check logs: kubectl logs -f -n alexandria <pod> -c api
  3. Test endpoint: kubectl port-forward -n alexandria <pod> 8080:8080 then curl http://localhost:8080/ready (forwarding to the pod reaches the API directly, bypassing nginx)

Orchestrator communication error

  1. Check both containers are running: kubectl get pods -n alexandria
  2. Check socket: kubectl exec -it -n alexandria <pod> -c api -- ls -la /var/run/alexandria/
  3. Check logs: kubectl logs -f -n alexandria <pod> -c orchestrator

Permission denied on PVC

  1. Check pod security context: grep -A5 podSecurityContext values.yaml
  2. Ensure PVC is mounted with correct uid/gid (should be 1000/1000)
  3. Check volume ownership: kubectl exec -it -n alexandria <pod> -c api -- ls -la /var/lib/alexandria/
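
The checks in this section can be run in one pass when you are not yet sure which category a failure falls into. The bash sketch below only chains commands already listed above. alexandria_diag and the KUBECTL override are hypothetical, and the container names are those used throughout this page.

```shell
# Set KUBECTL=echo to preview the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: first-pass diagnostics for one pod.
alexandria_diag() {
  local pod="$1" ns="${2:-alexandria}"
  if [ -z "$pod" ]; then
    echo "usage: alexandria_diag <pod-name> [namespace]" >&2
    return 1
  fi
  $KUBECTL describe pod "$pod" --namespace "$ns"
  $KUBECTL get pvc --namespace "$ns"
  $KUBECTL logs "$pod" --namespace "$ns" -c config-init
  local c
  for c in api orchestrator dashboard; do
    $KUBECTL logs "$pod" --namespace "$ns" -c "$c" --tail=50
  done
  # orchestrator.sock should be listed here once the orchestrator is up.
  $KUBECTL exec "$pod" --namespace "$ns" -c api -- ls -la /var/run/alexandria/
}
```

Run it as alexandria_diag <pod-name>, and redirect the output to a file if you need to attach it to a ticket.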

Production Deployment Checklist

  • Set image.api.tag, image.orchestrator.tag, image.dashboard.tag to specific release versions
  • Set config.webauthn.rpId and rpOrigin to your domain
  • Create and use auth.existingSecret (don't rely on auto-generated values)
  • Set config.security.hstsMaxAge: 31536000 (with TLS ingress)
  • Configure Postgres backend: config.database.url: postgres://...
  • Enable Ingress: ingress.enabled: true, set hostname
  • Configure federation providers (OIDC/SAML) if using SSO
  • Enable HPA for multi-replica scaling
  • Enable PDB for resilience
  • Configure RBAC (ServiceAccount, ClusterRole)
  • Set resource requests/limits appropriately
  • Enable OpenTelemetry or Prometheus metrics
  • Test health checks: kubectl get pods, verify STATUS=Running, READY=3/3
  • Test functionality: login, create agent, run query
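
A values-prod.yaml covering the single-replica items on this checklist might look like the following sketch. It uses only keys documented on this page. The hostname, release tag, and Secret name are example values to replace with your own, and the Postgres, HPA, and PDB items are left out because they apply to multi-replica setups.

```shell
# Write an example production values file. The heredoc delimiter is quoted
# so nothing inside it is shell-expanded.
cat > values-prod.yaml <<'EOF'
image:
  api:
    tag: "v1.0.0"
  orchestrator:
    tag: "v1.0.0"
  dashboard:
    tag: "v1.0.0"

config:
  webauthn:
    rpId: "alexandria.example.com"
    rpOrigin: "https://alexandria.example.com"
  security:
    hstsMaxAge: 31536000

auth:
  existingSecret: alexandria-auth

ingress:
  enabled: true
  hosts:
    - host: alexandria.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: alexandria-tls
      hosts:
        - alexandria.example.com

metrics:
  enabled: true
EOF
```

Helm replaces list values wholesale rather than merging them, which is why the ingress hosts and tls entries are written out in full. Render the file with the --dry-run --debug command from the Helm Install section before applying it.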

Kustomize Manifests

Alternative to Helm: raw Kubernetes manifests in manifests/.

Use Kustomize when:

  • GitOps workflow (ArgoCD, Flux)
  • Need patch-based customization
  • Prefer declarative resource lists

Apply:

kubectl apply -k manifests/

Customize:

manifests/
├── deployment.yaml
├── service.yaml
├── configmap.yaml
└── kustomization.yaml (adds patches, overlays)
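
Patch-based customization usually lives in an overlay that points at the base manifests. The sketch below pins image tags with the standard Kustomize images transformer. The overlays/prod directory is hypothetical (it is assumed to sit beside manifests/), and the example assumes the base manifests reference the same image repositories as the Helm chart.

```shell
# Create a hypothetical production overlay next to manifests/.
mkdir -p overlays/prod
cat > overlays/prod/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: alexandria

resources:
  - ../../manifests

# Pin each image to a release tag without editing the base manifests.
images:
  - name: ghcr.io/alexandriaproject/ee-api
    newTag: v1.0.0
  - name: ghcr.io/alexandriaproject/ce
    newTag: v1.0.0
  - name: ghcr.io/alexandriaproject/ee-dashboard
    newTag: v1.0.0
EOF
```

Apply the overlay with kubectl apply -k overlays/prod/, or point ArgoCD or Flux at that directory.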

References

  • Helm chart: helm/alexandria-ee/
  • Values: values.yaml (documented inline)
  • Deployment: templates/deployment.yaml
  • Config: templates/configmap.yaml
  • NOTES: templates/NOTES.txt (post-install instructions)