Deploying your first Agent

Deploy a custom container as a Krypton Agent.

This guide deploys a custom agent into a running Krypton cluster, verifies the gateway route, and shows where scaling signals come from. Krypton treats your container as a black box that speaks A2A, MCP, or plain HTTP.

Ports & endpoints — the two-minute mental model

Krypton exposes two HTTP services. Talk to the right one:

Port	Service	What lives here
8080	Gateway	All agent traffic. Anything under `/v1/agents/{ns}/{name}/*` is reverse-proxied to your pod — the protocol RPC at `/`, the A2A card at `/.well-known/agent-card.json`, OAuth callbacks at `/oauth/...`, MCP SSE streams, anything. The gateway strips the `/v1/agents/{ns}/{name}` prefix; your container sees the original sub-path.
8090	Control plane	Web UI and introspection APIs (`/v1/agents`, `/v1/agents/{ns}/{name}/mcp/tools`, `/v1/agents/{ns}/{name}/status`). Operator tooling only — never the path your clients use to invoke an agent.

Rule of thumb: if it’s a normal A2A / MCP / HTTP client, point it at :8080. If it’s a browser or kubectl-adjacent tool, point it at :8090.
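The port split and the prefix stripping described above can be sketched as a few small helpers. This is an illustration only: the function names are hypothetical and not part of Krypton, and `localhost` assumes both services are port-forwarded to your machine.

```python
GATEWAY_PORT = 8080        # all agent traffic
CONTROL_PLANE_PORT = 8090  # web UI and introspection APIs


def agent_prefix(namespace: str, name: str) -> str:
    """Route prefix shared by the gateway and the control plane."""
    return f"/v1/agents/{namespace}/{name}"


def gateway_url(namespace: str, name: str, sub_path: str = "/",
                host: str = "localhost") -> str:
    """URL a normal A2A / MCP / HTTP client should call."""
    if not sub_path.startswith("/"):
        sub_path = "/" + sub_path
    return f"http://{host}:{GATEWAY_PORT}{agent_prefix(namespace, name)}{sub_path}"


def control_plane_url(namespace: str, name: str, resource: str = "status",
                      host: str = "localhost") -> str:
    """URL for operator tooling, for example the status or mcp/tools resource."""
    return f"http://{host}:{CONTROL_PLANE_PORT}{agent_prefix(namespace, name)}/{resource}"


def path_seen_by_container(request_path: str, namespace: str, name: str) -> str:
    """Mimic the gateway: drop the agent prefix and keep the sub-path."""
    prefix = agent_prefix(namespace, name)
    rest = request_path[len(prefix):]
    if not request_path.startswith(prefix) or (rest and not rest.startswith("/")):
        raise ValueError(f"{request_path!r} is not routed to {namespace}/{name}")
    return rest or "/"
```

For example, a request to `/v1/agents/agents/adk/oauth/callback` on the gateway would reach the container as `/oauth/callback`.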

1. Pick a starting point

Four ready-to-build A2A agents live under examples/agent. Start with the no-LLM one to confirm your install works, then graduate to a framework example when you’re ready to plug in your own logic.

Path	LLM?	Framework	Source
`examples/agent/python/helloworld`	—	Bare `a2a-sdk`	a2a-samples/helloworld
`examples/agent/go`	Gemini	Google ADK-Go	adk.dev quickstart
`examples/agent/python/adk`	Gemini	Google ADK (Python)	a2a-samples/adk_facts
`examples/agent/python/langgraph`	Gemini	LangGraph	a2a-samples/langgraph

The helloworld agent needs no API key and no Secret — it just echoes whatever the client sends.

The container contract is the same for all of them:

Listen on an HTTP port (spec.port, defaults to 8080)
Serve A2A JSON-RPC at spec.invocationPath (defaults to /)
Expose the agent card at /.well-known/agent-card.json
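To make the three contract points concrete, here is a minimal stdlib-only sketch of a container that would satisfy them. It is not one of the shipped examples and not a full A2A implementation: the card fields are placeholders, the reply shape is a simplified assumption, and the names `route_get`, `route_post`, and `make_server` are hypothetical.

```python
import json
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Dict, Tuple

CARD_PATH = "/.well-known/agent-card.json"
INVOCATION_PATH = "/"  # mirrors the spec.invocationPath default

# Placeholder card: a real A2A agent card carries more fields than these.
AGENT_CARD = {
    "name": "contract-demo",
    "description": "Stdlib-only sketch of the container contract",
    "version": "0.0.1",
}


def rpc_error(code: int, message: str) -> Dict:
    return {"jsonrpc": "2.0", "id": None, "error": {"code": code, "message": message}}


def route_get(path: str) -> Tuple[int, Dict]:
    """Serve the agent card; every other GET is a 404."""
    if path == CARD_PATH:
        return 200, AGENT_CARD
    return 404, {"error": "not found"}


def route_post(path: str, raw_body: bytes) -> Tuple[int, Dict]:
    """Answer a JSON-RPC call on the invocation path by echoing its text parts."""
    if path != INVOCATION_PATH:
        return 404, {"error": "not found"}
    try:
        request = json.loads(raw_body)
    except ValueError:
        return 200, rpc_error(-32700, "Parse error")      # JSON-RPC 2.0 code
    if not isinstance(request, dict):
        return 200, rpc_error(-32600, "Invalid Request")  # JSON-RPC 2.0 code
    parts = request.get("params", {}).get("message", {}).get("parts", [])
    text = " ".join(p.get("text", "") for p in parts if p.get("kind") == "text")
    return 200, {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": {
            "kind": "message",
            "messageId": str(uuid.uuid4()),
            "role": "agent",
            "parts": [{"kind": "text", "text": text}],
        },
    }


class ContractHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: Dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        self._reply(*route_get(self.path))

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", "0"))
        self._reply(*route_post(self.path, self.rfile.read(length)))


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Bind on the spec.port default; call serve_forever() on the result to run."""
    return ThreadingHTTPServer(("0.0.0.0", port), ContractHandler)
```

Running `make_server().serve_forever()` as the container entrypoint would give you something to point an Agent CR at while you wire in real logic.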

2. Build and load

Each example has its own Dockerfile and agent.yaml. The ADK Python flavour, for instance:

docker build -f examples/agent/python/adk/Dockerfile \
  -t krypton/adk-agent:dev examples/agent/python/adk
kind load docker-image --name krypton-dev krypton/adk-agent:dev

The two LLM-backed Python samples (adk, langgraph) need a GOOGLE_API_KEY Secret in the agents namespace; see each example’s README for the exact kubectl create secret call. The Go sample also uses Gemini, so check its README for how it expects credentials.

3. Apply the Agent CR

kubectl apply -f examples/agent/python/adk/agent.yaml

The shipped agent.yaml files look like this (ADK Python shown):

apiVersion: krypton.ai/v1alpha1
kind: Agent
metadata:
  name: adk
  namespace: agents
spec:
  image: krypton/adk-agent:dev
  imagePullPolicy: IfNotPresent
  runtime: python
  framework: google-adk
  protocol: a2a
  mode: always-on
  minReplicas: 1
  maxReplicas: 3
  concurrency: 4
  port: 8080
  invocationPath: /
  env:
    - name: GOOGLE_API_KEY
      valueFrom:
        secretKeyRef: { name: adk-secrets, key: GOOGLE_API_KEY }
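Before applying a hand-edited manifest, a quick local sanity check can catch obvious mistakes. The sketch below mirrors the spec above as a Python dict and applies a few plausible rules. These rules are assumptions made for illustration, not the CRD's actual validation; only the `port` and `invocationPath` defaults come from this guide, and `check_agent_spec` is a hypothetical helper.

```python
from typing import Any, Dict, List

# The ADK Python spec from the manifest above, minus the env block.
ADK_SPEC: Dict[str, Any] = {
    "image": "krypton/adk-agent:dev",
    "imagePullPolicy": "IfNotPresent",
    "runtime": "python",
    "framework": "google-adk",
    "protocol": "a2a",
    "mode": "always-on",
    "minReplicas": 1,
    "maxReplicas": 3,
    "concurrency": 4,
    "port": 8080,
    "invocationPath": "/",
}


def check_agent_spec(spec: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means nothing obvious is wrong."""
    problems: List[str] = []
    if not spec.get("image"):
        problems.append("image is missing")  # assumed to be required
    port = spec.get("port", 8080)  # documented default
    if not 1 <= port <= 65535:
        problems.append(f"port {port} is outside 1..65535")
    path = spec.get("invocationPath", "/")  # documented default
    if not path.startswith("/"):
        problems.append(f"invocationPath {path!r} should start with '/'")
    min_r, max_r = spec.get("minReplicas"), spec.get("maxReplicas")
    if min_r is not None and max_r is not None and min_r > max_r:
        problems.append(f"minReplicas {min_r} exceeds maxReplicas {max_r}")
    concurrency = spec.get("concurrency")
    if concurrency is not None and concurrency < 1:
        problems.append(f"concurrency {concurrency} should be at least 1")
    return problems
```

The cluster remains the source of truth: if `kubectl apply` rejects a manifest that passes this check, trust the API server's error message.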

4. Invoke

A2A agents expose their card at /.well-known/agent-card.json and accept JSON-RPC message/send calls at the invocation path. Through the gateway:

# Discover the agent card
curl http://localhost:8080/v1/agents/agents/adk/.well-known/agent-card.json

# Send a message
curl -X POST http://localhost:8080/v1/agents/agents/adk/ \
     -H 'Content-Type: application/json' \
     -d '{
       "jsonrpc": "2.0",
       "id": "1",
       "method": "message/send",
       "params": {
         "message": {
           "messageId": "1",
           "role": "user",
           "parts": [{"kind": "text", "text": "Tell me a fun fact about octopuses."}]
         }
       }
     }'
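The same call can be made from Python with only the standard library. This is a minimal sketch that mirrors the curl payload above; `build_message_send` and `send_message` are hypothetical helper names, and the URL in the usage note assumes the gateway is reachable on localhost:8080.

```python
import json
import urllib.request
import uuid
from typing import Any, Dict


def build_message_send(text: str, request_id: str = "1") -> Dict[str, Any]:
    """Build the JSON-RPC body for a message/send call with one text part."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "messageId": str(uuid.uuid4()),  # fresh ID for each message
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }


def send_message(agent_url: str, text: str, timeout: float = 30.0) -> Dict[str, Any]:
    """POST the payload to the agent's invocation path and decode the JSON reply."""
    body = json.dumps(build_message_send(text)).encode("utf-8")
    request = urllib.request.Request(
        agent_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the ADK agent deployed, `send_message("http://localhost:8080/v1/agents/agents/adk/", "Tell me a fun fact about octopuses.")` would issue the same request as the curl command.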

Always-on agents keep at least minReplicas pods warm, so calls are served by an existing pod with no cold start.

Routing under the hood

%%{init: {"theme": "base", "flowchart": {"nodeSpacing": 60, "rankSpacing": 70, "diagramPadding": 24}, "themeVariables": {"fontFamily": "Inter, ui-sans-serif, system-ui, sans-serif", "primaryColor": "#eef2ff", "primaryTextColor": "#1f2937", "primaryBorderColor": "#6366f1", "lineColor": "#64748b", "secondaryColor": "#ecfeff", "tertiaryColor": "#f8fafc"}}}%%
flowchart LR
    client["Client"] --> gateway["Krypton gateway"]
    gateway --> proxy["krypton-proxy"]
    proxy --> container["Your container"]
    scaler["Scaler"] -. "in-flight" .-> proxy
    scaler --> status["Desired replicas"]

    classDef external fill:#f8fafc,stroke:#94a3b8,color:#0f172a;
    classDef traffic fill:#ecfeff,stroke:#0891b2,color:#164e63;
    classDef runtime fill:#f0fdf4,stroke:#16a34a,color:#14532d;
    classDef control fill:#eef2ff,stroke:#6366f1,color:#312e81;
    class client external;
    class gateway,proxy traffic;
    class container runtime;
    class scaler,status control;

The sidecar (krypton-proxy in the diagram) enforces spec.concurrency per pod; requests over the cap get a 503 response with a Retry-After header. The scaler in the manager observes in-flight requests per pod, computes ceil(inflight / concurrency), and writes status.desiredReplicas; the reconciler then scales the Deployment to match.
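The arithmetic is easy to check by hand. The sketch below is an illustration, not the manager's actual code, and it makes two assumptions the text above does not state: that in-flight counts are summed across pods before dividing, and that the result is clamped to minReplicas and maxReplicas. The function names are hypothetical.

```python
import math
from typing import Iterable


def admit(inflight_on_pod: int, concurrency: int) -> bool:
    """Sidecar-style check: False means the request would get 503 + Retry-After."""
    return inflight_on_pod < concurrency


def desired_replicas(inflight_per_pod: Iterable[int], concurrency: int,
                     min_replicas: int, max_replicas: int) -> int:
    """ceil(total in-flight / concurrency), clamped to the replica bounds."""
    total = sum(inflight_per_pod)            # assumption: summed across pods
    wanted = math.ceil(total / concurrency)
    # Assumption: the scaler never goes below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))
```

With the values from the manifest in this guide (concurrency 4, minReplicas 1, maxReplicas 3), two pods carrying 4 and 3 in-flight requests give ceil(7 / 4) = 2 desired replicas, an idle agent stays at 1, and 16 in-flight requests would be capped at 3.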

What’s next

Agent CRD reference — every spec field
Metrics — what the runtime exposes
Components — what’s running where

Last modified May 27, 2026: Refine docs structure and README (bbcd2cf)