Kubernetes Deployment: Helm Chart

#kubernetes #k8s #helm #deployment

Overview

k8s/ contains the Kubernetes manifests and the Helm chart for deploying Alexandria EE in production.

Purpose: Container orchestration, multi-replica scaling, federation support, observability integration, storage management.

Approach:

Helm chart (primary) — helm/alexandria-ee/ with values-driven templating
Kustomize manifests (alternative) — manifests/ for GitOps workflows
Pod structure — Three co-located containers in one pod (tight coupling for IPC)

Pod Architecture

Containers in One Pod

The deployment runs three containers in a single pod:

api — Go REST API daemon (port 8080)
orchestrator — CE Rust orchestrator (Unix socket IPC)
dashboard — React SPA via nginx (port 80, proxies API)

Why co-locate?

Unix domain socket IPC between API and orchestrator (no network overhead)
Shared PVC for SQLite, JWT secret, models, knowledge store
Simplified networking (single ClusterIP service)
Shared logs (all three containers in one pod, selectable with kubectl logs -c)

IPC and Shared Storage

Pod: alexandria-0
├── Container: api (port 8080)
│   ├── Listen 0.0.0.0:8080
│   ├── Read /var/lib/alexandria/data/alexandria.db (SQLite)
│   └── Connect to /var/run/alexandria/orchestrator.sock (Unix socket)
│
├── Container: orchestrator
│   ├── Listen /var/run/alexandria/orchestrator.sock (gRPC)
│   └── Read/write /var/lib/alexandria/ (stores, agents, models, etc.)
│
├── Container: dashboard (nginx)
│   ├── Listen 0.0.0.0:80
│   └── Proxy /v1/* /admin/* /auth/* to http://localhost:8080
│
└── Volumes
    ├── data (PVC) → /var/lib/alexandria (persistent)
    ├── run (emptyDir) → /var/run/alexandria (sockets, tempfiles)
    └── etc-alexandria (emptyDir) → /etc/alexandria (config)

Design choice: Tight coupling for performance. If scaling is needed, split into separate pods with network-based gRPC (requires Postgres for shared state).
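
To make the layout above concrete, the following is a rough sketch of how a pod with this shape could be declared. It is not the chart's actual template: the file name pod-sketch.yaml, the pod name, the claim name alexandria-data, the image tags, and the choice of which containers mount /etc/alexandria are all assumptions for illustration. Container names, ports, and mount paths are taken from the diagram.

```shell
# Write an illustrative pod manifest (NOT the chart's real template).
cat > pod-sketch.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: alexandria-sketch            # hypothetical name
spec:
  containers:
    - name: api
      image: ghcr.io/alexandriaproject/ee-api:v1.0.0        # tag is an assumption
      ports:
        - name: api
          containerPort: 8080
      volumeMounts:
        - { name: data, mountPath: /var/lib/alexandria }
        - { name: run, mountPath: /var/run/alexandria }     # Unix socket lives here
        - { name: etc-alexandria, mountPath: /etc/alexandria }
    - name: orchestrator
      image: ghcr.io/alexandriaproject/ce:v1.0.0            # tag is an assumption
      volumeMounts:
        - { name: data, mountPath: /var/lib/alexandria }
        - { name: run, mountPath: /var/run/alexandria }
        - { name: etc-alexandria, mountPath: /etc/alexandria }
    - name: dashboard
      image: ghcr.io/alexandriaproject/ee-dashboard:v1.0.0  # tag is an assumption
      ports:
        - name: http
          containerPort: 80
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: alexandria-data   # hypothetical claim name
    - name: run
      emptyDir: {}
    - name: etc-alexandria
      emptyDir: {}
EOF
```

Because api and orchestrator both mount the run volume, the socket created by one is visible to the other without any network hop. If kubectl is available, kubectl apply --dry-run=client -f pod-sketch.yaml checks the manifest without creating anything.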

Helm Chart Structure

helm/alexandria-ee/
├── Chart.yaml              Chart metadata (name, version, appVersion)
├── values.yaml              Default values (replicas, images, config, auth, federation, etc.)
├── values.schema.json       JSON schema for values validation
├── templates/
│   ├── _helpers.tpl         Shared template functions
│   ├── deployment.yaml      Pod/Deployment spec (3 containers)
│   ├── service.yaml         ClusterIP service
│   ├── ingress.yaml         Ingress (optional)
│   ├── configmap.yaml       Alexandria TOML config
│   ├── secret.yaml          JWT secret + admin password
│   ├── serviceaccount.yaml  RBAC ServiceAccount
│   ├── pvc.yaml             PersistentVolumeClaim
│   ├── hpa.yaml             HorizontalPodAutoscaler
│   ├── pdb.yaml             PodDisruptionBudget (optional)
│   ├── networkpolicy.yaml   NetworkPolicy (optional)
│   ├── saml-cert-job.yaml   Job to generate SAML SP keypair
│   ├── servicemonitor.yaml  Prometheus ServiceMonitor
│   ├── rbac.yaml            ClusterRole, ClusterRoleBinding
│   ├── nginx-configmap.yaml nginx config for dashboard
│   └── NOTES.txt            Post-install instructions

Key Configuration Sections

Replica Count & Scaling

replicaCount: 1

autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80

Default: 1 replica (single instance, suitable for dev/small deployments).

Scaling up: Enable HPA and switch to the Postgres backend. SQLite is single-writer, so it cannot back multiple replicas.

Container Images

image:
  api:
    repository: ghcr.io/alexandriaproject/ee-api
    tag: ""                      # Pin to v1.0.0 in production
    pullPolicy: IfNotPresent
  orchestrator:
    repository: ghcr.io/alexandriaproject/ce
    tag: ""
    pullPolicy: IfNotPresent
  dashboard:
    repository: ghcr.io/alexandriaproject/ee-dashboard
    tag: ""
    pullPolicy: IfNotPresent

Tag strategy:

Dev: "" (uses latest)
Staging: main or specific commit hash
Production: Pin to release tag (e.g., v1.0.0)
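
One way to enforce the production rule is a small pre-deploy guard that refuses a values file which still has an empty tag. This is only a sketch: the function name check_tags_pinned is hypothetical, and it does a plain text match rather than parsing YAML, so a file that omits the tag key entirely would pass.

```shell
# check_tags_pinned FILE
# Returns non-zero if FILE still contains an empty image tag (tag: "").
check_tags_pinned() {
  if grep -Eq '^[[:space:]]*tag:[[:space:]]*""' "$1"; then
    echo "unpinned image tag found in $1" >&2
    return 1
  fi
  return 0
}

# Example use in a deploy script:
#   check_tags_pinned values-prod.yaml && helm upgrade --install ...
```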

Persistence

persistence:
  enabled: true
  storageClass: ""              # "" = cluster default
  size: 10Gi
  accessMode: ReadWriteOnce
  existingClaim: ""

Mounts:

data PVC → /var/lib/alexandria (SQLite DB, JWT secret, models, knowledge store)
run emptyDir → /var/run/alexandria (Unix sockets, temp files, cleaned up on pod restart)
etc-alexandria emptyDir → /etc/alexandria (config, writable copy from ConfigMap)

Design: ReadWriteOnce (single pod). For multi-replica, switch to Postgres and ReadWriteMany.

Alexandria Configuration

config:
  general:
    logLevel: info
    devMode: false

  api:
    bind: "0.0.0.0"
    port: 8080
    impl: go                    # REQUIRED for EE

  webauthn:
    rpId: ""                    # MUST override for production
    rpOrigin: ""                # MUST override for production
    rpName: "Alexandria"

  database:
    url: ""                     # Postgres DSN (optional, SQLite is default)

  security:
    enabled: true
    csp: ""
    hstsMaxAge: 0               # Set to 31536000 (1 year) for TLS
    hstsIncludeSubdomains: false

  lockout:
    enabled: true
    maxAttempts: 10
    windowSeconds: 300
    lockoutSeconds: 900

All values are templated into a ConfigMap and mounted as /etc/alexandria/alexandria.conf.

Authentication Secrets

auth:
  jwtSecret: ""                 # Auto-generated if blank (NOT recommended for prod)
  adminPassword: "changeme"     # Change in production!
  existingSecret: ""            # Use pre-existing Secret instead of creating one

Production pattern:

Generate JWT secret: openssl rand -base64 32
Create Secret: kubectl create secret generic alexandria-auth --from-literal=jwt-secret=<secret>
Set auth.existingSecret: alexandria-auth
Helm chart mounts it as env vars for the containers
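
The steps above can be wrapped in two small helpers. This is a sketch, not part of the chart: the function names are hypothetical, the /dev/urandom fallback is an addition for hosts without openssl, and the Secret keys (jwt-secret, admin-password) and namespace follow the kubectl examples elsewhere on this page.

```shell
# gen_jwt_secret: print 32 random bytes, base64-encoded (44 characters).
gen_jwt_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 32
  else
    # Fallback (an addition, not from the chart docs) when openssl is absent.
    head -c 32 /dev/urandom | base64
  fi
}

# create_auth_secret JWT_SECRET ADMIN_PASSWORD
# Creates the Secret that auth.existingSecret will point at.
create_auth_secret() {
  kubectl create secret generic alexandria-auth \
    --namespace alexandria \
    --from-literal=jwt-secret="$1" \
    --from-literal=admin-password="$2"
}

# Example (requires cluster access):
#   create_auth_secret "$(gen_jwt_secret)" "$ADMIN_PASSWORD"
```

Passing secrets with --from-literal leaves them in shell history and the process list; reading them from a file or an external secret manager avoids that.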

Federation (OIDC/SAML)

federation:
  enabled: false

  providers: []
  # - name: okta
  #   kind: oidc
  #   issuer: https://your-org.okta.com
  #   client_id: 0oaXXXX
  #   redirect_uri: https://alexandria.example.com/auth/oidc/okta/callback
  #   auto_create_users: true
  #   default_role: user

OIDC client secrets: Supplied via Kubernetes Secret:

kubectl create secret generic alexandria-oidc \
  --from-literal=oidc-okta-client-secret=<secret> \
  --from-literal=oidc-azure-client-secret=<secret>

SAML SP key pair: Auto-generated on first boot (persisted on PVC) or supplied via Secret:

kubectl create secret tls alexandria-saml \
  --cert=saml.crt \
  --key=saml.key

LLM Backend Registration

llmBackends: []
# - name: claude
#   url: https://api.anthropic.com
#   kind: claude
#   model: claude-sonnet-4-6
#   apiKeyEnvVar: ANTHROPIC_API_KEY

An init container runs alexandria llm add for each backend before the orchestrator starts.

Resource Requests/Limits

resources:
  api:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
  orchestrator:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 2000m
      memory: 2Gi
  dashboard:
    requests:
      cpu: 10m
      memory: 16Mi
    limits:
      cpu: 100m
      memory: 64Mi

Tuning:

API: Light (handles HTTP routing)
Orchestrator: Heavy (runs LLM queries, manages agents)
Dashboard: Minimal (static SPA served by nginx)

Deployment Lifecycle

Pre-Deployment

Create namespace: kubectl create ns alexandria

Create secrets (if not using auto-generated):

kubectl create secret generic alexandria-auth \
  --from-literal=jwt-secret=<secret> \
  --from-literal=admin-password=<password> \
  -n alexandria

Create OIDC secrets (if federation enabled):

kubectl create secret generic alexandria-oidc \
  --from-literal=oidc-okta-client-secret=<secret> \
  -n alexandria

Helm Install

The chart is not published to a public Helm repository. Install from a local clone of this repository or from your internal Helm chart registry if you have pushed the chart there.

From a local clone:

git clone <your-internal-mirror-of-alexandria-ee> alexandria-ee-src
cd alexandria-ee-src

# Dry-run first to verify rendering
helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
  --namespace alexandria \
  --create-namespace \
  --values values-prod.yaml \
  --dry-run --debug

# Apply
helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
  --namespace alexandria \
  --create-namespace \
  --values values-prod.yaml

From an internal OCI / chart registry (if your organisation pushes the chart):

# Push the chart to your registry (do this once per release):
helm package k8s/helm/alexandria-ee/
helm push alexandria-ee-0.1.0.tgz oci://registry.example.com/charts/

# Install from the registry:
helm upgrade --install alexandria \
  oci://registry.example.com/charts/alexandria-ee \
  --version 0.1.0 \
  --namespace alexandria \
  --create-namespace \
  --values values-prod.yaml

Or with inline overrides:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
  --namespace alexandria \
  --create-namespace \
  --set image.api.tag=v0.2.2 \
  --set image.orchestrator.tag=v0.2.2 \
  --set image.dashboard.tag=v0.2.2 \
  --set config.webauthn.rpId=example.com \
  --set config.webauthn.rpOrigin=https://example.com \
  --set auth.existingSecret=alexandria-auth \
  --set persistence.size=20Gi
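
The same overrides can live in the values-prod.yaml file that the earlier commands pass with --values. The sketch below mirrors the --set flags above and additionally sets hstsMaxAge, which the configuration comments recommend for TLS deployments. The version numbers and example.com hostnames are placeholders carried over from the flags, not recommendations.

```shell
# Write a production overlay equivalent to the inline --set flags.
cat > values-prod.yaml <<'EOF'
image:
  api:
    tag: "v0.2.2"
  orchestrator:
    tag: "v0.2.2"
  dashboard:
    tag: "v0.2.2"

config:
  webauthn:
    rpId: "example.com"
    rpOrigin: "https://example.com"
  security:
    hstsMaxAge: 31536000   # 1 year; only sensible when serving over TLS

auth:
  existingSecret: alexandria-auth

persistence:
  size: 20Gi
EOF
```

Keeping overrides in a file makes them reviewable in version control, which is harder with a long list of --set flags.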

Init Containers

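The deployment includes init containers that run to completion before the application containers start. saml-cert-job is listed with them for completeness, but it runs as a separate Job (templates/saml-cert-job.yaml) rather than as an init container: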

config-init — Copy read-only ConfigMap to writable emptyDir
llm-init (optional) — Register LLM backends via alexandria llm add
saml-cert-job (optional, pre-install) — Generate SAML SP keypair

Startup Sequence

Init containers run
API container starts (listens on 8080)
Orchestrator container starts (creates Unix socket at /var/run/alexandria/orchestrator.sock)
Dashboard (nginx) container starts (proxies to API on localhost:8080)
Readiness probe: GET /ready on API succeeds
Pod is ready to serve traffic
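
A deploy script can block until this sequence has finished by waiting on the Deployment's rollout. The helper below is a sketch: the function name and the 120s default are assumptions, while the Deployment name alexandria and the namespace follow the kubectl logs examples on this page.

```shell
# wait_for_alexandria [TIMEOUT]
# Blocks until the Deployment's pods pass readiness, or TIMEOUT elapses.
wait_for_alexandria() {
  wait_timeout="${1:-120s}"   # default is an assumption, tune per cluster
  kubectl rollout status deployment/alexandria \
    --namespace alexandria \
    --timeout="$wait_timeout"
}

# Example (requires cluster access):
#   wait_for_alexandria 300s || kubectl describe pod -n alexandria
```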

Shutdown Sequence

On pod deletion:

SIGTERM sent to all containers
30-second grace period (configurable via terminationGracePeriodSeconds)
In-flight requests drain
Containers exit
Volumes unmounted

Service & Networking

Service

service:
  type: ClusterIP              # Internal only (use Ingress for external)
  port: 80
  targetPort: 80               # Points to dashboard (nginx on port 80)

Ports exposed:

Port 80 (service) → nginx container port 80 → proxies to API port 8080

Access within cluster: http://alexandria:80 or http://alexandria.alexandria.svc.cluster.local

Ingress

ingress:
  enabled: false
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: alexandria.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: alexandria-tls
      hosts:
        - alexandria.example.com

To enable:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
  --namespace alexandria \
  --reuse-values \
  --set ingress.enabled=true \
  --set 'ingress.hosts[0].host=alexandria.example.com'

Network Policy

networkPolicy:
  enabled: false
  additionalIngress: []
  additionalEgress: []

When enabled, only allows:

Ingress from ingress controller (or specified pods)
Egress to DNS (53/UDP) + any specified destinations

Health Checks

Liveness Probe

livenessProbe:
  httpGet:
    path: /live                # API is bound to listener
    port: api
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3

Purpose: Restart unresponsive containers. After 3 consecutive failed probes at 15-second intervals (about 45 seconds), the kubelet restarts the API container.

Readiness Probe

readinessProbe:
  httpGet:
    path: /ready               # API is ready to serve
    port: api
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3

Purpose: Remove pod from service endpoints until ready. Pod must pass readiness before traffic is sent.
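
As a rough worked example of the probe timing, the detection window is approximately failureThreshold × periodSeconds. The helper name below is hypothetical, and the figure is an upper-bound estimate that ignores initialDelaySeconds and per-probe timeouts.

```shell
# detection_window FAILURE_THRESHOLD PERIOD_SECONDS
# Prints the approximate seconds of consecutive failures before Kubernetes acts.
detection_window() {
  echo $(( $1 * $2 ))
}

detection_window 3 15   # liveness:  prints 45
detection_window 3 10   # readiness: prints 30
```

With the values shown, a hung API is removed from the service endpoints after roughly 30 seconds and its container restarted after roughly 45.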

Dashboard Probe

The nginx container has its own probes, defined inline, on port 80.

Observability

OpenTelemetry

otel:
  enabled: false
  endpoint: "http://otel-collector:4318"

When enabled, the API and orchestrator send traces and metrics to an OTLP/HTTP collector.

To enable:

helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
  --namespace alexandria \
  --reuse-values \
  --set otel.enabled=true \
  --set otel.endpoint=http://otel-collector:4318

Prometheus Metrics

metrics:
  enabled: false
  path: /metrics
  serviceMonitor:
    enabled: false
    interval: "30s"

When enabled, exposes Prometheus metrics at /metrics. ServiceMonitor allows Prometheus Operator to scrape automatically.
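
A minimal overlay that turns both switches on might look like the following. The file name values-metrics.yaml is an assumption; the keys are the ones shown above.

```shell
# Write an overlay that enables the metrics endpoint and the ServiceMonitor.
cat > values-metrics.yaml <<'EOF'
metrics:
  enabled: true
  path: /metrics
  serviceMonitor:
    enabled: true      # requires the Prometheus Operator CRDs in the cluster
    interval: "30s"
EOF

# Apply on top of the existing release values (requires cluster access):
#   helm upgrade --install alexandria k8s/helm/alexandria-ee/ \
#     --namespace alexandria --reuse-values --values values-metrics.yaml
```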

Logs

Logs are written to stdout (JSON format) and captured by Kubernetes.

View logs:

kubectl logs -f deployment/alexandria -n alexandria -c api
kubectl logs -f deployment/alexandria -n alexandria -c orchestrator

Advanced Configuration

Horizontal Pod Autoscaler

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

Requirements:

Metrics server installed (kubectl apply -f https://...)
Postgres backend (SQLite won't work with multiple replicas)
Persistent volume with ReadWriteMany (shared storage such as NFS or Amazon EFS; EBS volumes are ReadWriteOnce)

Behavior: Scales up/down based on CPU usage. Orchestrator is the heavy container; API is I/O-bound.
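
Putting the requirements together, a scaling overlay could be sketched as below. The file name values-scale.yaml and every part of the Postgres DSN (user, host, database name) are placeholders; only the key names come from the values shown on this page.

```shell
# Write an overlay that enables autoscaling with a shared Postgres backend.
cat > values-scale.yaml <<'EOF'
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

config:
  database:
    # Placeholder DSN: user, host, and database name are hypothetical.
    url: "postgres://alexandria:CHANGE_ME@postgres.example.com:5432/alexandria"

persistence:
  accessMode: ReadWriteMany   # needs a storage class that supports RWX
EOF
```

A DSN with an embedded password should not be committed as-is; supplying it through a Secret or the External Secrets integration is the safer route.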

Pod Disruption Budget

pdb:
  enabled: true
  minAvailable: 1

Ensures at least 1 pod is always running during voluntary disruptions (node drains, upgrades).
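
The budget interacts with the replica count: with minAvailable, the number of pods that may be evicted at once is the healthy pod count minus minAvailable. The helper below (a hypothetical name, for illustration only) shows why the default single-replica deployment combined with minAvailable: 1 leaves no room for a node drain.

```shell
# allowed_disruptions HEALTHY_PODS MIN_AVAILABLE
# Prints how many pods may be evicted voluntarily at once (never below 0).
allowed_disruptions() {
  n=$(( $1 - $2 ))
  if [ "$n" -lt 0 ]; then
    n=0
  fi
  echo "$n"
}

allowed_disruptions 1 1   # prints 0: a node drain would be blocked
allowed_disruptions 3 1   # prints 2
```

In practice this means the PDB is most useful once the deployment runs at least two replicas.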

External Secrets Operator

externalSecrets:
  enabled: false
  secretStoreName: "vault"
  refreshInterval: "1h"
  adminPasswordKey: "alexandria/admin-password"
  jwtSecretKey: "alexandria/jwt-secret"

Syncs JWT secret and admin password from Vault (or AWS Secrets Manager, Azure Key Vault, etc.).

Troubleshooting

Pod won't start

Check events: kubectl describe pod -n alexandria <pod-name>
Check init container logs: kubectl logs -n alexandria <pod-name> -c config-init
Check data volume: kubectl get pvc -n alexandria

API not responding

Check readiness: kubectl get pods -n alexandria
Check logs: kubectl logs -f -n alexandria <pod> -c api
Test endpoint: kubectl port-forward -n alexandria svc/alexandria 8080:80 then curl http://localhost:8080/health

Orchestrator communication error

Check both containers are running: kubectl get pods -n alexandria
Check socket: kubectl exec -it -n alexandria <pod> -c api -- ls -la /var/run/alexandria/
Check logs: kubectl logs -f -n alexandria <pod> -c orchestrator

Permission denied on PVC

Check pod security context: grep -A5 podSecurityContext values.yaml
Ensure PVC is mounted with correct uid/gid (should be 1000/1000)
Check volume ownership: kubectl exec -it -n alexandria <pod> -c api -- ls -la /var/lib/alexandria/

Production Deployment Checklist

Kustomize Manifests

Alternative to Helm: raw Kubernetes manifests in manifests/.

Use Kustomize when:

GitOps workflow (ArgoCD, Flux)
Need patch-based customization
Prefer declarative resource lists

Apply:

kubectl apply -k manifests/

Customize:

manifests/
├── deployment.yaml
├── service.yaml
├── configmap.yaml
└── kustomization.yaml (adds patches, overlays)

References

Helm chart: helm/alexandria-ee/
Values: values.yaml (documented inline)
Deployment: templates/deployment.yaml
Config: templates/configmap.yaml
NOTES: templates/NOTES.txt (post-install instructions)
