Components
Manager
Source: cmd/manager +
internal/controller.
Standard controller-runtime operator. Reconciles Agent CRs by owning
three child resources per agent: a Deployment, a Service, and a
ServiceAccount. The Deployment is injected with the krypton-proxy
sidecar at template-render time.
Also runs the scaling decider as a manager.Runnable.
Key behavior:
- CreateOrUpdate semantics: each child resource uses controllerutil.CreateOrUpdate wrapped in retry.RetryOnConflict so spec drift converges without hot-looping when the apps controller concurrently updates Deployment status.
- Status writes use Patch with MergeFrom, not Update, so they don’t conflict with the scaler/gateway’s concurrent writes to other status fields.
- Finalizer krypton.ai/cleanup blocks deletion until child resources have drained.
Control plane
Source: cmd/control-plane +
internal/controlplane.
A controller-runtime manager with no reconcilers — just a cache that
watches Agent CRs across the cluster. Serves the public REST API from
that cache (always fresh, no DB hop):
| Path | Returns |
|---|---|
| GET /v1/agents[?namespace=...] | List agents |
| GET /v1/agents/{namespace}/{name} | Single agent |
| GET /v1/agents/{namespace}/{name}/status | Just the status subresource |
| GET /healthz, /readyz | Probes |
| GET /ui/* | Embedded React UI |
When --database-url (or $DATABASE_URL) is set, an additional
Syncer Reconciler mirrors each Agent CR into Postgres on every event.
Gateway
Source: cmd/gateway +
internal/gateway.
Public ingress. Any request to /v1/agents/{namespace}/{name}[/...]
is reverse-proxied to the agent’s in-cluster Service via
httputil.ReverseProxy with FlushInterval = -1 — SSE / chunked HTTP
arrive at the client as the upstream produces them, not at EOF.
The gateway strips exactly /v1/agents/{namespace}/{name} and
forwards the rest of the path verbatim. Agents see /,
/.well-known/agent-card.json, /oauth/callback, or whatever else
they implement — no knowledge of the gateway prefix required.
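The prefix stripping and streaming proxy described above could look roughly like the following sketch. The helper names (splitAgentPath, newAgentProxy, gatewayHandler) are hypothetical, and the upstream host assumes the agent's Service shares the agent's name and listens on the default HTTP port; neither is confirmed by this document.

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// splitAgentPath parses /v1/agents/{namespace}/{name}[/...] and returns the
// namespace, the name, and the remaining path (always starting with "/").
func splitAgentPath(p string) (ns, name, rest string, ok bool) {
	const prefix = "/v1/agents/"
	if !strings.HasPrefix(p, prefix) {
		return "", "", "", false
	}
	parts := strings.SplitN(strings.TrimPrefix(p, prefix), "/", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", "", "", false
	}
	rest = "/"
	if len(parts) == 3 {
		rest += parts[2]
	}
	return parts[0], parts[1], rest, true
}

// newAgentProxy builds a reverse proxy that forwards only the stripped path.
func newAgentProxy(ns, name, rest string) *httputil.ReverseProxy {
	// Assumption: Service DNS name is <name>.<namespace>.svc on port 80.
	target := &url.URL{Scheme: "http", Host: name + "." + ns + ".svc"}
	return &httputil.ReverseProxy{
		Rewrite: func(pr *httputil.ProxyRequest) {
			pr.SetURL(target)
			pr.Out.URL.Path = rest // the agent never sees the gateway prefix
			pr.Out.URL.RawPath = ""
		},
		// A negative interval flushes after every write, so SSE and chunked
		// responses reach the client as the upstream produces them.
		FlushInterval: -1,
	}
}

// gatewayHandler wires the two pieces together for a single request.
func gatewayHandler(w http.ResponseWriter, r *http.Request) {
	ns, name, rest, ok := splitAgentPath(r.URL.Path)
	if !ok {
		http.NotFound(w, r)
		return
	}
	newAgentProxy(ns, name, rest).ServeHTTP(w, r)
}
```

Building a proxy per request keeps the sketch short; a real gateway would more likely reuse proxies or transports across requests.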
After each successful invocation, the gateway asynchronously patches
status.lastInvocationAt (decoupled from the request context via
context.WithoutCancel so the patch survives client disconnect).
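To illustrate why context.WithoutCancel (Go 1.21+) matters here, the sketch below runs a background patch on a context derived from an already-cancelled request context. patchFunc, recordInvocation and simulateDisconnect are hypothetical names, and the 5-second timeout is an assumption added so the detached work is still bounded.

```go
package main

import (
	"context"
	"time"
)

// patchFunc is a hypothetical stand-in for the call that patches
// status.lastInvocationAt on the Agent.
type patchFunc func(ctx context.Context, at time.Time) error

// recordInvocation runs patch in the background on a context that keeps the
// request's values but not its cancellation, bounded by its own timeout.
func recordInvocation(reqCtx context.Context, patch patchFunc, at time.Time) <-chan error {
	// Assumption: 5s is an illustrative upper bound, not a documented value.
	ctx, cancel := context.WithTimeout(context.WithoutCancel(reqCtx), 5*time.Second)
	done := make(chan error, 1)
	go func() {
		defer cancel()
		done <- patch(ctx, at)
	}()
	return done
}

// simulateDisconnect cancels the request context before the patch runs and
// returns the error the patch observed on its own context.
func simulateDisconnect() error {
	reqCtx, cancel := context.WithCancel(context.Background())
	cancel() // the client went away
	patch := func(ctx context.Context, _ time.Time) error { return ctx.Err() }
	return <-recordInvocation(reqCtx, patch, time.Now())
}
```

In this sketch the patch sees a live context even though the request context was cancelled first, which is the behavior the paragraph above relies on.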
Scaler
Source: internal/scaler.
Hosted by the manager process. Ticks every --scaler-interval-ms
(default 1s) and for each Agent:
- Queries each ready pod IP from Endpoints for its sidecar’s /_krypton/inflight count, sums them
- Computes desired = clamp(ceil(inflight / concurrency), min, max)
- Always-on floor: max(minReplicas, 1), never scales below this
- Hysteresis: refuses to scale down within --scaler-stable-window-ms (default 60s) of the most recent scale-up. Prevents flapping under bursty load.
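The replica computation and the hysteresis rule above can be sketched as two small pure functions. The function names are hypothetical, and the handling of a non-positive concurrency or a max below the floor is an assumption, since this document does not say how those cases are treated.

```go
package main

import "time"

// desiredReplicas computes clamp(ceil(inflight / concurrency), floor, maxReplicas),
// where floor = max(minReplicas, 1).
func desiredReplicas(inflight, concurrency, minReplicas, maxReplicas int) int {
	if concurrency < 1 {
		concurrency = 1 // assumption: avoid dividing by zero
	}
	floor := minReplicas
	if floor < 1 {
		floor = 1 // always-on floor
	}
	ceiling := maxReplicas
	if ceiling < floor {
		ceiling = floor // assumption: a max below the floor is raised to it
	}
	want := (inflight + concurrency - 1) / concurrency // integer ceil
	if want < floor {
		return floor
	}
	if want > ceiling {
		return ceiling
	}
	return want
}

// applyHysteresis keeps the current replica count when a scale-down is
// requested inside the stable window that follows the last scale-up.
func applyHysteresis(current, desired int, sinceScaleUp, stableWindow time.Duration) int {
	if desired < current && sinceScaleUp < stableWindow {
		return current
	}
	return desired
}
```

For example, 25 in-flight requests at a concurrency of 10 with min 0 and max 10 yields 3 replicas, while zero in-flight requests still yields 1 because of the floor.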
Sidecar (krypton-proxy)
Source: cmd/krypton-proxy +
internal/sidecar.
Injected next to every Agent container. Listens on port 8888,
forwards to the user container on spec.port.
| Endpoint | Purpose |
|---|---|
| /healthz | Always 200 (liveness) |
| /readyz | 200 normally; 503 during graceful shutdown |
| /metrics | Prometheus: krypton_proxy_requests_total, krypton_proxy_inflight, krypton_proxy_rejected_total |
| /_krypton/inflight | JSON: in-flight count, last-activity ns, concurrency cap |
| anything else | Reverse-proxied to user container |
Concurrency is enforced via a non-blocking semaphore — over the cap
returns 503 + Retry-After immediately. Graceful shutdown drains
in-flight requests up to KRYPTON_SHUTDOWN_TIMEOUT (default 25s).
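A non-blocking semaphore of this kind is commonly built from a buffered channel and a select with a default branch. The sketch below shows that pattern under stated assumptions: limiter, newLimiter and demoOverCap are hypothetical names, and the Retry-After value of 1 second is illustrative, since the actual value is not given here.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// limiter caps concurrent requests using a buffered channel as a semaphore.
type limiter struct {
	slots chan struct{}
	next  http.Handler
}

func newLimiter(capacity int, next http.Handler) *limiter {
	return &limiter{slots: make(chan struct{}, capacity), next: next}
}

func (l *limiter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	select {
	case l.slots <- struct{}{}: // acquired a slot without waiting
		defer func() { <-l.slots }()
		l.next.ServeHTTP(w, r)
	default: // over the cap: reject immediately instead of queueing
		w.Header().Set("Retry-After", "1") // assumption: illustrative value
		http.Error(w, "too many in-flight requests", http.StatusServiceUnavailable)
	}
}

// inflight reports how many slots are currently held.
func (l *limiter) inflight() int { return len(l.slots) }

// demoOverCap holds one request open on a limiter with capacity 1 and sends
// a second request while the first is still in flight.
func demoOverCap() (firstStatus, secondStatus int, retryAfter string) {
	entered, release := make(chan struct{}), make(chan struct{})
	l := newLimiter(1, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		close(entered)
		<-release
		w.WriteHeader(http.StatusOK)
	}))

	first := httptest.NewRecorder()
	done := make(chan struct{})
	go func() {
		defer close(done)
		l.ServeHTTP(first, httptest.NewRequest(http.MethodGet, "/", nil))
	}()
	<-entered // the first request now holds the only slot

	second := httptest.NewRecorder()
	l.ServeHTTP(second, httptest.NewRequest(http.MethodGet, "/", nil))

	close(release)
	<-done
	return first.Code, second.Code, second.Header().Get("Retry-After")
}
```

Because rejection never blocks, an overloaded pod answers quickly and the caller can retry, possibly landing on a pod with spare capacity.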
The Service routes external traffic to the sidecar port (TargetPort = proxy), not directly to the user container.