Runbook: Tool-Approval Gating — End-to-End Smoke
When to use this
Use this runbook to validate the per-call human approval gate for MCP tool
calls end to end after a deploy. It exercises the full flow: agent token
mint with approval_required_tools, dispatch through POST /mcp/agents/{name},
the approve and reject branches, idempotency cache, Prometheus gauges, and
the Alertmanager notification path.
The feature ships across these merged commits:
- 9ce81b8: schema + agent store for approval_required_tools
- 487d328: idempotency-keyed permission_requests + execution cache
- 6b55f0e: bake ApprovalRequiredTools into agent JWT
- 1f46157: Alertmanager notifier + composite + reconciler + admin API
- 1288959: MCP dispatcher per-call approval gate + Prometheus gauges + NotifyPending
- 5d82a68: frontend Notifications settings (URLs + severity + test button)
Prereqs
- kubectl context must point at the EE cluster:
    kubectl config current-context
    # expect: gke_<project>_<region>_alexandria-...
- Port-forward the API service. Use the alex-pf skill or run directly. Stale port-forwards silently break after pod redeploys, so kill them first:
    pkill -f "kubectl.*port-forward.*alexandria" || true
    kubectl -n alexandria port-forward svc/alexandria-ee 8080:80 &
  Local base URL is then http://127.0.0.1:8080.
- Admin password. Read from the bootstrap secret (do not check it into logs or paste it into chat):
    kubectl -n alexandria get secret alexandria-ee-auth \
      -o jsonpath='{.data.admin-password}' | base64 -d
  Export it locally for the rest of the runbook:
    export ALEX_PASS='<paste-password>'
    export ALEX_URL='http://127.0.0.1:8080'
- A real MCP server must be registered and an agent must exist whose allowed_tools include the tool we will gate. Examples in this runbook use agent=demo-agent and tool=write_file; substitute the agent / tool names that fit your deployment.
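Because a stale or not-yet-ready port-forward fails silently, it can help to probe the base URL before starting Step 1. The sketch below is one way to do that; wait_for_cmd is a hypothetical helper (not part of the Alexandria tooling), and the usage line assumes /metrics is reachable without auth, as Step 7 states.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# wait_for_cmd is a hypothetical helper introduced for this runbook only.
wait_for_cmd() {
  local attempts=$1 i=1
  shift
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then sleep 1; fi
    i=$((i + 1))
  done
  return 1
}

# Usage (needs the port-forward from the prereqs):
#   wait_for_cmd 10 curl -fsS "$ALEX_URL/metrics" || echo "port-forward not ready" >&2
```

The probe only confirms that something answers on the forwarded port; it does not prove the pod behind it is the freshly deployed one.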
Step 1 — Login (super_admin access token)
ACCESS_TOKEN=$(curl -sS -X POST "$ALEX_URL/auth/login" \
-H 'Content-Type: application/json' \
-d "$(jq -nc --arg p "$ALEX_PASS" '{username:"admin", password:$p}')" \
| jq -r '.access_token')
echo "${ACCESS_TOKEN:0:20}..."
Response shape (full body):
{
"access_token": "eyJ...",
"refresh_token": "...",
"token_type": "Bearer",
"expires_in": 900
}
access_token must be a super_admin token — approve/reject and the
approval-required-tools route both require role super_admin.
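A failed login leaves ACCESS_TOKEN empty or set to the literal string null (which is what jq -r prints for a missing field), and every later step then fails with a less obvious auth error. A small guard, sketched here with a hypothetical require_token helper, makes that failure visible immediately.

```shell
# Fail fast when a token variable was not populated.
# require_token is a hypothetical helper, not part of the product.
require_token() {
  # $1 = label used in the error message, $2 = token value
  case "$2" in
    ""|null)
      echo "error: $1 is empty; check credentials and the response body" >&2
      return 1
      ;;
  esac
  return 0
}

# Usage after Step 1:
#   require_token ACCESS_TOKEN "$ACCESS_TOKEN" || exit 1
```

The guard checks only that a value is present; it does not verify that the token carries the super_admin role.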
Step 2 — Set approval_required_tools on the agent
The dedicated admin route lives on main:
PUT /admin/agents/{name}/approval-required-tools
It accepts { "tools": [...] }, replaces the list, bumps
permissions_version (so any existing agent JWT is invalidated and must be
re-minted), and returns the updated agent DTO. Super_admin only.
curl -sS -X PUT "$ALEX_URL/admin/agents/demo-agent/approval-required-tools" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"tools": ["write_file"]}' | jq
Expect HTTP 200 with the agent DTO; approval_required_tools in the
response body should equal ["write_file"].
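When substituting your own tool names, building the { "tools": [...] } body by hand is easy to get wrong. The helper below is a minimal sketch of one way to generate it; tools_body is a hypothetical name, and it assumes tool names contain no quotes or backslashes.

```shell
# Build the JSON body for the approval-required-tools route from arguments.
# tools_body is a hypothetical helper; it does not escape special characters,
# so it is only suitable for plain identifiers such as write_file.
tools_body() {
  local out="" t
  for t in "$@"; do
    # Append a comma separator only when out already holds an entry.
    out="${out:+$out,}\"$t\""
  done
  printf '{"tools":[%s]}\n' "$out"
}

# Usage:
#   curl -sS -X PUT "$ALEX_URL/admin/agents/demo-agent/approval-required-tools" \
#     -H "Authorization: Bearer $ACCESS_TOKEN" \
#     -H 'Content-Type: application/json' \
#     -d "$(tools_body write_file)" | jq
```

Calling tools_body with no arguments yields the empty list used in Cleanup.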
If you are running against a build that pre-dates the route landing, mint a fresh agent and pass
approval_required_tools to the create route, or use the SQL fallback documented under the feature MR. The dedicated PUT route is the supported path going forward.
Step 3 — Mint an agent token
AGENT_TOKEN=$(curl -sS -X POST "$ALEX_URL/v1/agent-token" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"agent": "demo-agent"}' \
| jq -r '.token')
echo "${AGENT_TOKEN:0:20}..."
Response shape:
{
"token": "eyJ...",
"agent": "demo-agent",
"effective_tools": ["write_file", "..."],
"expires_in": 3600
}
The agent JWT bakes in both effective_tools and approval_required_tools
at mint time (see api-go/internal/routes/auth.go::handleAgentToken). If
the agent token was minted before Step 2, it carries the old approval
list — always mint after changing policy.
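To confirm that a freshly minted token really carries the new policy, you can decode the JWT payload locally. This is a sketch only: jwt_payload is a hypothetical helper, it does not verify the signature, and the claim name approval_required_tools in the usage line is an assumption about how the list is serialized in the token.

```shell
# Print the decoded payload (second segment) of a JWT without verifying it.
# jwt_payload is a hypothetical helper for inspection only.
jwt_payload() {
  local seg pad
  # Take the payload segment and map base64url characters to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip.
  pad=$(( (4 - ${#seg} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    seg="${seg}="
    pad=$((pad - 1))
  done
  printf '%s' "$seg" | base64 -d
}

# Usage (claim name is an assumption; inspect the full payload if it differs):
#   jwt_payload "$AGENT_TOKEN" | jq '.approval_required_tools'
```

If the list is missing or stale, re-run Step 3 after Step 2 rather than reusing an older token.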
Step 4 — Dispatch a gated tools/call
The MCP dispatch endpoint is POST /mcp/agents/{name} (agent-scoped) or
POST /mcp (unscoped). Both extract X-Idempotency-Key from the request
header and stash it on the Caller struct (see
api-go/internal/routes/mcp_routes.go::handleMCPAgentDispatch and
internal/mcp/dispatch.go::Caller.IdempotencyKey). For the gating flow we
use a stable, caller-chosen UUID so a retry hits the same row.
IDEMPOTENCY_KEY=$(uuidgen)
RESP1=$(curl -sS -X POST "$ALEX_URL/mcp/agents/demo-agent" \
-H "Authorization: Bearer $AGENT_TOKEN" \
-H "X-Idempotency-Key: $IDEMPOTENCY_KEY" \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "write_file",
"arguments": {"path": "/tmp/smoke.txt", "content": "hello"}
}
}')
echo "$RESP1" | jq
REQUEST_ID=$(echo "$RESP1" | jq -r '.result.request_id')
echo "REQUEST_ID=$REQUEST_ID"
Expected response body (after the 30s in-band poll cap expires):
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"pending": true,
"request_id": "<uuid>",
"retry_after_ms": 2000
}
}
The dispatcher creates one permission_requests row keyed by the
idempotency key (store.UpsertPendingPermissionRequest) and waits up to
30s for a terminal status before returning the pending envelope (see
internal/mcp/dispatch.go::handleApprovalGate and pollApproval). It
also fires NotifyPending against the configured notifier (webhook and/or
Alertmanager) and increments the created event counter.
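A scripted client would typically branch on the pending envelope and wait before retrying. The helpers below sketch that decision using grep only; is_pending and retry_delay_s are hypothetical names, and the 2000 ms fallback is an assumption taken from the example body above, not a documented default.

```shell
# Return success when a JSON-RPC response body is a pending envelope.
# is_pending and retry_delay_s are hypothetical helpers for this runbook.
is_pending() {
  printf '%s' "$1" | grep -Eq '"pending"[[:space:]]*:[[:space:]]*true'
}

# Print the retry delay in whole seconds, rounded up from retry_after_ms.
retry_delay_s() {
  local ms
  ms=$(printf '%s' "$1" \
    | grep -Eo '"retry_after_ms"[[:space:]]*:[[:space:]]*[0-9]+' \
    | grep -Eo '[0-9]+$')
  # Assumed fallback when the field is absent.
  ms=${ms:-2000}
  echo $(( (ms + 999) / 1000 ))
}

# Usage after Step 4:
#   if is_pending "$RESP1"; then sleep "$(retry_delay_s "$RESP1")"; fi
```

Pattern matching on raw JSON is crude; with jq available, jq -e '.result.pending == true' is a stricter alternative.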
Step 5 — Approve the request
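Approve as super_admin once the pending response from Step 4 has returned and $REQUEST_ID is set. To approve from a second shell while the dispatch is still polling, re-export ALEX_URL and ACCESS_TOKEN there and look up the request id with GET /admin/permissions?status=pending first: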
curl -sS -X POST "$ALEX_URL/admin/permissions/$REQUEST_ID/approve" \
-H "Authorization: Bearer $ACCESS_TOKEN" | jq
Response is the full permission request DTO with "status": "approved",
reviewed_by, and reviewed_at set. The approve handler decrements the
pending gauge and emits an approved event (see
routes/permissions_routes.go::handleApprovePermissionRequest).
Step 6 — Retry with the same idempotency key
Repeat the exact same dispatch from Step 4 (same X-Idempotency-Key):
RESP2=$(curl -sS -X POST "$ALEX_URL/mcp/agents/demo-agent" \
-H "Authorization: Bearer $AGENT_TOKEN" \
-H "X-Idempotency-Key: $IDEMPOTENCY_KEY" \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "write_file",
"arguments": {"path": "/tmp/smoke.txt", "content": "hello"}
}
}')
echo "$RESP2" | jq
Expected: the dispatcher reads the row, sees status=approved, wins or
loses the at-most-once execute race in
MarkPermissionRequestExecuted, then either:
- executes the underlying MCP tools/call, caches the raw result on the row, and returns the actual tool result; or
- if a concurrent call already executed, returns the cached result from SetPermissionRequestResult (see executeApproved in internal/mcp/dispatch.go).
Either way RESP2.result is the real tool result — not a pending
envelope.
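To exercise the at-most-once race rather than a single retry, you can fire the same retry twice in parallel and compare the bodies. The sketch below uses a hypothetical race_two helper; it assumes that the executing call and the cached call return byte-identical bodies, which the description above suggests but does not guarantee.

```shell
# Run the same command twice concurrently and report whether the outputs match.
# race_two is a hypothetical helper; "identical" is the expected verdict if the
# cached result is returned verbatim (an assumption, not a documented contract).
race_two() {
  local out1 out2 p1 p2 verdict
  out1=$(mktemp) || return 2
  out2=$(mktemp) || return 2
  "$@" >"$out1" 2>/dev/null &
  p1=$!
  "$@" >"$out2" 2>/dev/null &
  p2=$!
  wait "$p1"
  wait "$p2"
  if cmp -s "$out1" "$out2"; then verdict=identical; else verdict=different; fi
  rm -f "$out1" "$out2"
  echo "$verdict"
}

# Usage: wrap the Step 6 curl in a function (here called dispatch_retry,
# also hypothetical) and run it in place of the single retry:
#   race_two dispatch_retry
```

A "different" verdict is a prompt to inspect both bodies, not proof of a double execution; the side effect of the tool itself is the real evidence.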
Step 7 — Metrics check
/metrics is public and unauthenticated (router.go line r.Get("/metrics", handleMetrics())).
curl -sS "$ALEX_URL/metrics" | grep -E '^alexandria_permission_request' | sort
Expected after one full create → approve → execute cycle:
alexandria_permission_request_pending{request_type="tool_call",tool="write_file"} 0
alexandria_permission_request_events_total{request_type="tool_call",tool="write_file",event="created"} >= 1
alexandria_permission_request_events_total{request_type="tool_call",tool="write_file",event="approved"} >= 1
Live trajectory of the pending gauge during the run: 0 → 1 on Step 4,
back to 0 on Step 5 (decrement happens in the approve handler) and
again confirmed on Step 6 by the executor path. The two events_total
counters increment by 1 each per cycle. Metric and label names come from
internal/mcp/dispatch.go::PermissionRequestPending and
PermissionRequestEvents.
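Checking these values by eye is error-prone across repeated runs. The sketch below pulls a single sample value out of the Prometheus text output; metric_value is a hypothetical helper, and it matches label fragments independently so it does not depend on the order in which labels are rendered.

```shell
# Read Prometheus text format on stdin and print the value of the sample whose
# name matches $1 and whose label set contains every remaining argument.
# metric_value is a hypothetical helper for this runbook.
metric_value() {
  local name=$1 lines frag
  shift
  lines=$(grep -F "${name}{")
  for frag in "$@"; do
    lines=$(printf '%s\n' "$lines" | grep -F "$frag")
  done
  # The sample value is the last whitespace-separated field on the line.
  printf '%s\n' "$lines" | awk 'NF {print $NF}'
}

# Usage:
#   pending=$(curl -sS "$ALEX_URL/metrics" \
#     | metric_value alexandria_permission_request_pending 'tool="write_file"')
#   [ "$pending" = "0" ] || echo "pending gauge is $pending, expected 0" >&2
```

Counters are cumulative, so compare the events_total values before and after a cycle rather than against a fixed number.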
Step 8 — Reject path
Use a fresh idempotency key for a new request so the row is independent from Steps 4–6:
IDEMPOTENCY_KEY=$(uuidgen)
RESP3=$(curl -sS -X POST "$ALEX_URL/mcp/agents/demo-agent" \
-H "Authorization: Bearer $AGENT_TOKEN" \
-H "X-Idempotency-Key: $IDEMPOTENCY_KEY" \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "write_file",
"arguments": {"path": "/tmp/smoke.txt", "content": "reject me"}
}
}')
REQUEST_ID=$(echo "$RESP3" | jq -r '.result.request_id')
curl -sS -X POST "$ALEX_URL/admin/permissions/$REQUEST_ID/reject" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"reason": "smoke test rejection"}' | jq
# Retry the call.
curl -sS -X POST "$ALEX_URL/mcp/agents/demo-agent" \
-H "Authorization: Bearer $AGENT_TOKEN" \
-H "X-Idempotency-Key: $IDEMPOTENCY_KEY" \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc": "2.0",
"id": 4,
"method": "tools/call",
"params": {
"name": "write_file",
"arguments": {"path": "/tmp/smoke.txt", "content": "reject me"}
}
}' | jq
Expected on the retry:
{
"jsonrpc": "2.0",
"id": 4,
"error": {
"code": -32003,
"message": "denied: smoke test rejection"
}
}
-32003 is ErrAccessDenied from internal/mcp/dispatch.go. The reason
string is read out of notifier_meta via extractReason.
/metrics should now also show
alexandria_permission_request_events_total{...event="rejected"} >= 1
and the pending gauge should still be 0.
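To turn the reject check into a pass/fail assertion, capture the retry response in a variable and extract the JSON-RPC error code from it. The sketch below does that with grep; rpc_error_code is a hypothetical helper, and RESP4 in the usage line is an assumed variable name for the captured retry body.

```shell
# Print the numeric error.code from a JSON-RPC error response body.
# rpc_error_code is a hypothetical helper; it prints nothing when no
# "code" field is present.
rpc_error_code() {
  printf '%s' "$1" \
    | grep -Eo '"code"[[:space:]]*:[[:space:]]*-?[0-9]+' \
    | grep -Eo -- '-?[0-9]+$'
}

# Usage (RESP4 is assumed to hold the retry response from this step):
#   [ "$(rpc_error_code "$RESP4")" = "-32003" ] || echo "expected -32003" >&2
```

With jq available, jq -r '.error.code' on the captured body gives the same value with proper JSON parsing.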
Step 9 — Alertmanager notification
Configure the Alertmanager endpoint from the Settings → Notifications UI or via the admin API directly:
curl -sS -X PATCH "$ALEX_URL/admin/notifications/config" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H 'Content-Type: application/json' \
-d '{
"alertmanager_url": "http://alertmanager.monitoring.svc:9093",
"alertmanager_severity": "warning"
}' | jq
Fire a synthetic firing/resolved alert pair:
curl -sS -X POST "$ALEX_URL/admin/notifications/test" \
-H "Authorization: Bearer $ACCESS_TOKEN" | jq
Expected response {"ok": true}. The handler calls NotifyPending and
then NotifyReviewed against the active composite notifier
(routes/notifications_routes.go::handleTestNotifications), so a properly
configured Alertmanager will see a firing alert immediately followed by a
resolved alert. Both the config save and the test endpoint sit behind the
super_admin role check.
Frontend equivalent: log in as super_admin, open Settings →
Notifications, paste the Alertmanager URL, click Save, then click
Test. The frontend API client wraps the same endpoints
(getNotificationsConfig, patchNotificationsConfig,
testNotifications in frontend/src/api.ts).
Cleanup
- Reset the approval list back to empty so the agent does not require approval for normal operation:
    curl -sS -X PUT "$ALEX_URL/admin/agents/demo-agent/approval-required-tools" \
      -H "Authorization: Bearer $ACCESS_TOKEN" \
      -H 'Content-Type: application/json' \
      -d '{"tools": []}' | jq
- Run pkill -f "kubectl.*port-forward.*alexandria" if you want to drop the port-forward.
Placeholders used in this runbook
- <ACCESS_TOKEN>: admin access token from Step 1 ($ACCESS_TOKEN).
- <AGENT_TOKEN>: agent JWT from Step 3 ($AGENT_TOKEN).
- <REQUEST_ID>: result.request_id from the pending response ($REQUEST_ID).
- <IDEMPOTENCY_KEY>: caller-chosen UUID, stable across retries ($IDEMPOTENCY_KEY).
Related routes (verified against routes/router.go on main)
| Method | Path | Auth | Source |
|---|---|---|---|
| POST | /auth/login | none | routes/auth.go::handleLogin |
| POST | /v1/agent-token | human access | routes/auth.go::handleAgentToken |
| POST | /mcp/agents/{name} | bearer (agent or user) | routes/mcp_routes.go::handleMCPAgentDispatch |
| POST | /mcp | bearer | routes/mcp_routes.go::handleMCPDispatch |
| PUT | /admin/agents/{name}/approval-required-tools | super_admin | routes/agents.go::handleSetApprovalRequiredTools |
| GET | /admin/permissions[?status=pending] | admin | routes/permissions_routes.go::handleListPermissionRequests |
| GET | /admin/permissions/{id} | admin | routes/permissions_routes.go::handleGetPermissionRequest |
| POST | /admin/permissions/{id}/approve | admin | routes/permissions_routes.go::handleApprovePermissionRequest |
| POST | /admin/permissions/{id}/reject | admin | routes/permissions_routes.go::handleRejectPermissionRequest |
| PATCH | /admin/notifications/config | super_admin | routes/notifications_routes.go::handlePatchNotificationsConfig |
| POST | /admin/notifications/test | super_admin | routes/notifications_routes.go::handleTestNotifications |
| GET | /metrics | none | routes/metrics.go::handleMetrics |
Verified flow logically against code in commit 78fd009e36281d90eb43e2162c47a98791e4ddf4; end-to-end live test pending.