Metrics

Prometheus metrics exposed by every component, plus Prometheus Operator setup.

Every Krypton component emits Prometheus metrics. The chart ships ServiceMonitor and PodMonitor resources gated behind opt-in flags, so wiring Krypton into a cluster that runs Prometheus Operator takes only a few --set flags.

Quick setup with kube-prometheus-stack

If you don’t already have Prometheus Operator on the cluster:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prom prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

Then enable Krypton’s monitors:

helm upgrade --install krypton oci://ghcr.io/kryptonhq/charts/krypton \
  --namespace krypton-system --create-namespace \
  --set serviceMonitor.enabled=true \
  --set podMonitor.enabled=true \
  --set serviceMonitor.labels.release=prom \
  --set podMonitor.labels.release=prom

Setting labels.release=prom makes the monitors match kube-prometheus-stack’s default serviceMonitorSelector and podMonitorSelector, which both select on the release label (replace prom with your stack’s release name). Without it, the operator’s Prometheus instance ignores the new monitors.
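To confirm that the labels line up, it can help to compare the selectors on the Prometheus resource with the labels on Krypton’s monitors. A minimal sketch, assuming the namespaces used in the commands above (monitoring and krypton-system); the function names are illustrative and not part of the chart:

```shell
# Hypothetical helpers; pass a namespace as $1 if your install differs.

# Print the monitor selectors of each operator-managed Prometheus.
show_monitor_selectors() {
  kubectl -n "${1:-monitoring}" get prometheus \
    -o jsonpath='{range .items[*]}{.metadata.name}{": serviceMonitorSelector="}{.spec.serviceMonitorSelector}{" podMonitorSelector="}{.spec.podMonitorSelector}{"\n"}{end}'
}

# List Krypton's ServiceMonitor and PodMonitor objects with their labels.
show_krypton_monitor_labels() {
  kubectl -n "${1:-krypton-system}" get servicemonitors,podmonitors --show-labels
}
```

Run show_monitor_selectors first, then show_krypton_monitor_labels. Every key/value pair under matchLabels should also appear in the LABELS column of the monitors.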

Verify scrape targets are up:

kubectl -n monitoring port-forward svc/prom-kube-prometheus-stack-prometheus 9090:9090
# In a second terminal (open is macOS; use xdg-open on Linux, or paste the URL into a browser):
open http://localhost:9090/targets

You should see one job per Krypton component, plus one sidecar job with a target for every agent pod.

Series

Gateway

| Metric | Type | Labels |
|---|---|---|
| krypton_invocations_total | counter | agent, namespace, status |
| krypton_invocation_duration_seconds | histogram | agent, namespace |

Histogram buckets: 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s.
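Because Prometheus histogram buckets are cumulative, the share of invocations that finished within a given bound is that bucket’s count divided by the +Inf bucket. A rough sketch, assuming the gateway emits standard text exposition with le values in seconds (so 250ms appears as le="0.25"); fraction_within is an illustrative name and the sample numbers are invented:

```shell
# Sketch: share of gateway invocations that finished within a bucket bound.
# Reads Prometheus text exposition on stdin. Assumes lines carry no trailing
# timestamp, so the last field is the cumulative bucket count.
fraction_within() {
  awk -v le="$1" '
    index($0, "krypton_invocation_duration_seconds_bucket{") == 1 {
      # Sum the requested bucket and the +Inf bucket across all label sets.
      if (index($0, "le=\"" le "\""))  within += $NF
      if (index($0, "le=\"+Inf\""))   total  += $NF
    }
    END {
      if (total > 0) printf "%.3f\n", within / total
      else print "no samples"
    }
  '
}

# Invented sample; in practice pipe in: curl -s http://localhost:8081/metrics
fraction_within 0.25 <<'EOF'
krypton_invocation_duration_seconds_bucket{agent="summarizer",namespace="agents",le="0.1"} 40
krypton_invocation_duration_seconds_bucket{agent="summarizer",namespace="agents",le="0.25"} 90
krypton_invocation_duration_seconds_bucket{agent="summarizer",namespace="agents",le="+Inf"} 100
EOF
```

For the sample this prints 0.900, meaning 90% of invocations completed within 250ms. The counts are totals since process start; for a windowed view, query Prometheus instead.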

Model serving (gateway)

| Metric | Type | Labels |
|---|---|---|
| krypton_model_invocations_total | counter | model, status |
| krypton_model_invocation_duration_seconds | histogram | model |

Emitted for /v1/chat/completions, /v1/completions, and /v1/embeddings invocations routed via the Model CRD. Buckets stretch out to 120s for long generations: 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s, 60s, 120s.

Scaler (hosted in manager)

| Metric | Type | Labels |
|---|---|---|
| krypton_scaler_decisions_total | counter | agent, namespace, direction (up/down/noop) |
| krypton_agent_replicas_desired | gauge | agent, namespace |

Control plane

| Metric | Type | Labels |
|---|---|---|
| krypton_api_requests_total | counter | route, method, code |
| krypton_api_request_duration_seconds | histogram | route |

route is a route template (list_agents, get_agent, …), not the raw URL, so cardinality stays bounded.
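One way to check this against a running control plane is to list the distinct route values in a scrape. A small sketch that reads the exposition text on stdin; distinct_routes is an illustrative name and the sample lines are invented:

```shell
# List the distinct values of the route label on krypton_api_requests_total.
distinct_routes() {
  grep '^krypton_api_requests_total{' \
    | sed -n 's/.*route="\([^"]*\)".*/\1/p' \
    | sort -u
}

# Invented sample; in practice: curl -s http://localhost:8091/metrics | distinct_routes
distinct_routes <<'EOF'
krypton_api_requests_total{route="list_agents",method="GET",code="200"} 12
krypton_api_requests_total{route="get_agent",method="GET",code="200"} 7
krypton_api_requests_total{route="get_agent",method="GET",code="404"} 2
EOF
```

If the list contains raw paths with IDs in them rather than a short, fixed set of template names, something is bypassing the route templating.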

Sidecar (krypton-proxy)

| Metric | Type | Labels |
|---|---|---|
| krypton_proxy_requests_total | counter | agent, namespace, code |
| krypton_proxy_rejected_total | counter | agent, namespace, reason |
| krypton_proxy_inflight | gauge | agent, namespace |

reason is either over_capacity (the concurrency cap was hit) or shutting_down.
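To see how rejections split between the two reasons on a single sidecar, the counter can be summed by reason straight from the exposition text. A sketch under the assumption that the sidecar serves standard Prometheus text format without timestamps; rejected_by_reason is an illustrative name and the sample is invented:

```shell
# Sum krypton_proxy_rejected_total by reason across all label sets on stdin.
rejected_by_reason() {
  awk '
    index($0, "krypton_proxy_rejected_total{") == 1 &&
    match($0, /reason="[^"]*"/) {
      # Strip the leading reason=" (8 characters) and the closing quote.
      reason = substr($0, RSTART + 8, RLENGTH - 9)
      sum[reason] += $NF
    }
    END { for (r in sum) print r, sum[r] }
  ' | sort
}

# Invented sample; in practice: curl -s http://localhost:8888/metrics | rejected_by_reason
rejected_by_reason <<'EOF'
krypton_proxy_rejected_total{agent="summarizer",namespace="agents",reason="over_capacity"} 5
krypton_proxy_rejected_total{agent="router",namespace="agents",reason="over_capacity"} 2
krypton_proxy_rejected_total{agent="router",namespace="agents",reason="shutting_down"} 1
EOF
```

Each sidecar only reports its own pod, so a cluster-wide split needs either one scrape per pod or a query against Prometheus.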

Grafana dashboard

A starter dashboard lives at deploy/grafana/krypton-overview.json.

Import it

  1. Grafana → Dashboards → New → Import
  2. Upload the JSON file (or paste it).
  3. Grafana prompts for a value for the DS_PROM template input; this is where you pick your Prometheus datasource. Select the Prometheus that the operator created (default name when installed via kube-prometheus-stack: Prometheus).
  4. Click Import.

If the dashboard renders empty / “Datasource DS_PROM was not found”

This means the dashboard JSON references the datasource by its variable name, but Grafana didn’t resolve that variable at import time. Two fixes:

  • Re-import: delete the dashboard, go to New → Import, and explicitly pick your Prometheus datasource in the DS_PROM field at the bottom of the dialog before clicking Import.
  • Or edit the panels: on a broken panel, click ⋯ → Edit, switch the Data source dropdown at the bottom of the page to your Prometheus datasource, then Save. Repeat per panel, or use the dashboard JSON model editor to swap all panels at once.

If you load the dashboard from a ConfigMap (picked up by the Grafana sidecar) instead of importing it through the UI, set the datasource UID explicitly in the JSON before applying:

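# Look up the datasource UID; fall back to "prometheus" if the lookup fails or returns nothing.
PROM_UID=$(kubectl -n monitoring get datasource prometheus -o jsonpath='{.spec.uid}' 2>/dev/null || true)
PROM_UID=${PROM_UID:-prometheus}
# Write a patched copy instead of editing in place (sed -i behaves differently on GNU and BSD sed).
sed "s/\${DS_PROM}/${PROM_UID}/g" deploy/grafana/krypton-overview.json > /tmp/krypton-overview.json
kubectl create configmap krypton-overview \
  --from-file=/tmp/krypton-overview.json \
  --namespace monitoring \
  --dry-run=client -o yaml | \
  kubectl label --local -f - grafana_dashboard=1 --dry-run=client -o yaml | \
  kubectl apply -f -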

Panels included

  • Invocations per second (per agent)
  • P95 invocation latency
  • Desired replicas
  • Scaling decisions per minute (by direction)
  • Sidecar in-flight (per pod)
  • Sidecar rejected (over_capacity vs shutting_down)
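The same views can be approximated from the command line through the Prometheus HTTP API. The sketch below assumes Prometheus is reachable at http://localhost:9090 (for example through the port-forward from the setup section). The queries are guesses built from the series listed above, not copies of the dashboard JSON, and prom_query and the Q_ variables are illustrative names:

```shell
# Base URL of the Prometheus HTTP API; override PROM_URL if yours differs.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query and print the raw JSON response.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Assumed approximations of the panels; the shipped dashboard may differ.
Q_INVOCATIONS='sum by (agent) (rate(krypton_invocations_total[5m]))'
Q_P95='histogram_quantile(0.95, sum by (le) (rate(krypton_invocation_duration_seconds_bucket[5m])))'
Q_REPLICAS='krypton_agent_replicas_desired'
Q_DECISIONS='sum by (direction) (rate(krypton_scaler_decisions_total[5m])) * 60'
Q_INFLIGHT='krypton_proxy_inflight'
Q_REJECTED='sum by (reason) (rate(krypton_proxy_rejected_total[5m]))'
```

For example, prom_query "$Q_P95" returns the estimated P95 invocation latency in seconds as a JSON result; an empty result list usually means the series is not being scraped yet.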

Direct port-forward (no Prometheus Operator)

If you’re not running the operator but want to eyeball metrics during local testing:

# Each port-forward blocks, so run the matching curl in a second terminal.
# Gateway
kubectl -n krypton-system port-forward deploy/krypton-gateway 8081:8081
curl http://localhost:8081/metrics

# Control plane
kubectl -n krypton-system port-forward deploy/krypton-control-plane 8091:8091
curl http://localhost:8091/metrics

# Manager (controller-runtime + scaler)
kubectl -n krypton-system port-forward deploy/krypton-manager 8080:8080
curl http://localhost:8080/metrics

# Sidecar in an agent pod
kubectl -n agents port-forward deploy/<agent> 8888:8888 -c krypton-proxy
curl http://localhost:8888/metrics

Last modified May 27, 2026: Refine docs structure and README (bbcd2cf)