Deploying your first Agent
This guide deploys a custom agent into a running Krypton cluster, verifies the gateway route, and shows where scaling signals come from. Krypton treats your container as a black box that speaks A2A, MCP, or plain HTTP.
Ports & endpoints — the two-minute mental model
Krypton exposes two HTTP services. Talk to the right one:
| Port | Service | What lives here |
|---|---|---|
| 8080 | Gateway | All agent traffic. Anything under /v1/agents/{ns}/{name}/* is reverse-proxied to your pod: the protocol RPC at /, the A2A card at /.well-known/agent-card.json, OAuth callbacks at /oauth/..., MCP SSE streams, and so on. The gateway strips the /v1/agents/{ns}/{name} prefix, so your container sees only the sub-path that follows it (for example, /.well-known/agent-card.json). |
| 8090 | Control plane | Web UI and introspection APIs (/v1/agents, /v1/agents/{ns}/{name}/mcp/tools, /v1/agents/{ns}/{name}/status). Operator tooling only — never the path your clients use to invoke an agent. |
Rule of thumb: if it’s a normal A2A / MCP / HTTP client, point it at
:8080. If it’s a browser or kubectl-adjacent tool, :8090.
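The prefix-stripping rule can be sketched in a few lines of Python. This illustrates the routing convention described in the table and is not Krypton's gateway code; the helper names `gateway_url` and `container_path`, and the `localhost` default, are assumptions made for the example.

```python
GATEWAY_PREFIX = "/v1/agents"


def gateway_url(ns: str, name: str, sub_path: str = "/",
                host: str = "localhost", port: int = 8080) -> str:
    """Build the client-facing URL for an agent behind the gateway (port 8080)."""
    if not sub_path.startswith("/"):
        sub_path = "/" + sub_path
    return f"http://{host}:{port}{GATEWAY_PREFIX}/{ns}/{name}{sub_path}"


def container_path(request_path: str) -> str:
    """Return the sub-path a container would see once the
    /v1/agents/{ns}/{name} prefix is stripped (illustrative only)."""
    # For a valid route: ['', 'v1', 'agents', ns, name, rest]
    parts = request_path.split("/", 5)
    if (len(parts) < 5 or parts[1:3] != ["v1", "agents"]
            or not parts[3] or not parts[4]):
        raise ValueError(f"not an agent route: {request_path!r}")
    rest = parts[5] if len(parts) > 5 else ""
    return "/" + rest
```

For instance, `container_path("/v1/agents/agents/adk/.well-known/agent-card.json")` yields `/.well-known/agent-card.json`, which is the path the agent itself has to serve.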
1. Pick a starting point
Four ready-to-build A2A agents live under examples/agent.
Start with the no-LLM one to confirm your install works, then graduate
to a framework example when you’re ready to plug in your own logic.
| Path | LLM? | Framework | Source |
|---|---|---|---|
| examples/agent/python/helloworld | None | Bare a2a-sdk | a2a-samples/helloworld |
| examples/agent/go | Gemini | Google ADK-Go | adk.dev quickstart |
| examples/agent/python/adk | Gemini | Google ADK (Python) | a2a-samples/adk_facts |
| examples/agent/python/langgraph | Gemini | LangGraph | a2a-samples/langgraph |
The helloworld agent needs no API key and no Secret — it just
echoes whatever the client sends.
The container contract is the same for all of them:
- Listen on an HTTP port (spec.port, defaults to 8080)
- Serve A2A JSON-RPC at spec.invocationPath (defaults to /)
- Expose the agent card at /.well-known/agent-card.json
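To make the contract concrete, here is a minimal standard-library sketch of a container that satisfies the three points above. It is not one of the shipped examples and does not use a2a-sdk; the `AGENT_CARD` fields are placeholders rather than the full A2A card schema, and the echo-style result is an assumption chosen to keep the example short.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder card: illustrative fields only, not the complete A2A schema.
AGENT_CARD = {"name": "sketch-agent", "description": "Container contract sketch"}


def route(method: str, path: str, body: bytes = b"") -> tuple:
    """Map a request to (status, JSON-serialisable payload)."""
    if method == "GET" and path == "/.well-known/agent-card.json":
        return 200, AGENT_CARD
    if method == "POST" and path == "/":  # default spec.invocationPath
        try:
            req = json.loads(body or b"{}")
        except json.JSONDecodeError:
            # -32700 is the JSON-RPC 2.0 "Parse error" code.
            return 400, {"jsonrpc": "2.0", "id": None,
                         "error": {"code": -32700, "message": "Parse error"}}
        if not isinstance(req, dict):
            # -32600 is the JSON-RPC 2.0 "Invalid Request" code.
            return 400, {"jsonrpc": "2.0", "id": None,
                         "error": {"code": -32600, "message": "Invalid Request"}}
        # Echo the params back; a real agent would dispatch on req["method"].
        return 200, {"jsonrpc": "2.0", "id": req.get("id"),
                     "result": {"echo": req.get("params")}}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def _reply(self, method: str) -> None:
        length = int(self.headers.get("Content-Length") or 0)
        status, payload = route(method, self.path, self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        self._reply("GET")

    def do_POST(self) -> None:
        self._reply("POST")


def serve(port: int = 8080) -> None:
    """Listen on the default spec.port until interrupted."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `serve()` starts the listener. Keeping the routing in a plain `route` function means it can be checked without opening a socket.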
2. Build and load
Each example has its own Dockerfile and agent.yaml. The ADK Python
flavour, for instance:
docker build -f examples/agent/python/adk/Dockerfile \
  -t krypton/adk-agent:dev examples/agent/python/adk
kind load docker-image --name krypton-dev krypton/adk-agent:dev
The two LLM-backed samples (adk, langgraph) need a GOOGLE_API_KEY
Secret in the agents namespace — see each example’s README for the
exact kubectl create secret call.
3. Apply the Agent CR
kubectl apply -f examples/agent/python/adk/agent.yaml
The shipped agent.yaml files look like this (ADK Python shown):
apiVersion: krypton.ai/v1alpha1
kind: Agent
metadata:
  name: adk
  namespace: agents
spec:
  image: krypton/adk-agent:dev
  imagePullPolicy: IfNotPresent
  runtime: python
  framework: google-adk
  protocol: a2a
  mode: always-on
  minReplicas: 1
  maxReplicas: 3
  concurrency: 4
  port: 8080
  invocationPath: /
  env:
    - name: GOOGLE_API_KEY
      valueFrom:
        secretKeyRef: { name: adk-secrets, key: GOOGLE_API_KEY }
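If you generate manifests from a script rather than editing YAML by hand, the same resource can be built as a dictionary and emitted as JSON, which `kubectl apply -f -` also accepts. The sketch below mirrors the fields of the manifest shown above; the function name `agent_manifest`, its `extra` parameter, and the sanity checks are assumptions for illustration, not documented controller rules.

```python
import json


def agent_manifest(name, image, *, namespace="agents", min_replicas=1,
                   max_replicas=3, concurrency=4, port=8080,
                   invocation_path="/", env=None, extra=None):
    """Return an Agent CR as a dict, mirroring the shipped agent.yaml fields."""
    # These checks are assumptions made for the sketch.
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= minReplicas <= maxReplicas")
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if not invocation_path.startswith("/"):
        raise ValueError("invocationPath must start with '/'")

    spec = {
        "image": image,
        "protocol": "a2a",
        "mode": "always-on",
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
        "concurrency": concurrency,
        "port": port,                       # 8080 is the documented default
        "invocationPath": invocation_path,  # "/" is the documented default
    }
    if env:
        spec["env"] = list(env)
    if extra:
        # e.g. runtime, framework, imagePullPolicy from the example manifest
        spec.update(extra)
    return {
        "apiVersion": "krypton.ai/v1alpha1",
        "kind": "Agent",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }
```

Printing `json.dumps(agent_manifest("adk", "krypton/adk-agent:dev"), indent=2)` and piping it to `kubectl apply -f -` would be one way to apply the result.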
4. Invoke
A2A agents expose their card at /.well-known/agent-card.json and accept
JSON-RPC message/send calls at the invocation path. Through the gateway:
# Discover the agent card
curl http://localhost:8080/v1/agents/agents/adk/.well-known/agent-card.json
# Send a message
curl -X POST http://localhost:8080/v1/agents/agents/adk/ \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "id": "1",
    "method": "message/send",
    "params": {
      "message": {
        "messageId": "1",
        "role": "user",
        "parts": [{"kind": "text", "text": "Tell me a fun fact about octopuses."}]
      }
    }
  }'
Always-on agents keep at least minReplicas pods warm, so calls are served by an already-running pod rather than waiting for a cold start.
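The same message/send call can be made from Python with only the standard library. This is a sketch rather than an official client: the helper names are hypothetical, the payload simply mirrors the curl example, and the shape of the response depends on the agent, so it is returned as parsed JSON without interpretation.

```python
import json
import urllib.request


def message_send_payload(text: str, request_id: str = "1",
                         message_id: str = "1") -> dict:
    """Build the JSON-RPC body used in the curl example."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "messageId": message_id,
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }


def send_message(agent_url: str, text: str, timeout: float = 30.0) -> dict:
    """POST a message/send request to the agent's gateway URL."""
    req = urllib.request.Request(
        agent_url,
        data=json.dumps(message_send_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the ADK example deployed, `send_message("http://localhost:8080/v1/agents/agents/adk/", "Tell me a fun fact about octopuses.")` issues the same request as the curl command.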
Routing under the hood
%%{init: {"theme": "base", "flowchart": {"nodeSpacing": 60, "rankSpacing": 70, "diagramPadding": 24}, "themeVariables": {"fontFamily": "Inter, ui-sans-serif, system-ui, sans-serif", "primaryColor": "#eef2ff", "primaryTextColor": "#1f2937", "primaryBorderColor": "#6366f1", "lineColor": "#64748b", "secondaryColor": "#ecfeff", "tertiaryColor": "#f8fafc"}}}%%
flowchart LR
client["Client"] --> gateway["Krypton gateway"]
gateway --> proxy["krypton-proxy"]
proxy --> container["Your container"]
scaler["Scaler"] -. "in-flight" .-> proxy
scaler --> status["Desired replicas"]
classDef external fill:#f8fafc,stroke:#94a3b8,color:#0f172a;
classDef traffic fill:#ecfeff,stroke:#0891b2,color:#164e63;
classDef runtime fill:#f0fdf4,stroke:#16a34a,color:#14532d;
classDef control fill:#eef2ff,stroke:#6366f1,color:#312e81;
class client external;
class gateway,proxy traffic;
class container runtime;
class scaler,status control;
The sidecar enforces spec.concurrency per pod; requests over the cap receive an HTTP 503 with a Retry-After header. The scaler in the manager observes in-flight requests per pod, computes ceil(inflight / concurrency), and writes status.desiredReplicas; the reconciler then scales the Deployment to match.
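The arithmetic is easy to check by hand. The sketch below models both halves, the per-pod cap and the replica calculation, under stated assumptions: in-flight counts are summed across pods before dividing, the result is clamped to minReplicas and maxReplicas, and the Retry-After value of one second is arbitrary. None of these details are confirmed by this guide, and the function names are hypothetical.

```python
import math


def desired_replicas(inflight_per_pod, concurrency,
                     min_replicas=1, max_replicas=3):
    """ceil(total in-flight / concurrency), clamped to [min, max].

    Summing across pods and clamping are assumptions of this sketch.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    total = sum(inflight_per_pod)
    wanted = math.ceil(total / concurrency)
    return max(min_replicas, min(max_replicas, wanted))


def over_cap_response(inflight, concurrency, retry_after_s=1):
    """Return None if the pod can accept another request, otherwise the
    (status, headers) pair a sidecar-style gate would send back."""
    if inflight >= concurrency:
        return 503, {"Retry-After": str(retry_after_s)}
    return None
```

With concurrency set to 4 as in the example manifest, two pods carrying 4 and 3 in-flight requests give ceil(7 / 4) = 2 replicas, and the first pod would reject a fifth request until one of its four completes.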
What’s next
- Agent CRD reference — every spec field
- Metrics — what the runtime exposes
- Components — what’s running where