Deploying your first Agent

Deploy a custom container as a Krypton Agent.

This guide deploys a custom agent into a running Krypton cluster, verifies the gateway route, and shows where scaling signals come from. Krypton treats your container as a black box that speaks A2A, MCP, or plain HTTP.

Ports & endpoints — the two-minute mental model

Krypton exposes two HTTP services. Talk to the right one:

| Port | Service | What lives here |
|------|---------|-----------------|
| 8080 | Gateway | All agent traffic. Anything under /v1/agents/{ns}/{name}/* is reverse-proxied to your pod: the protocol RPC at /, the A2A card at /.well-known/agent-card.json, OAuth callbacks at /oauth/..., MCP SSE streams, anything. The gateway strips the /v1/agents/{ns}/{name} prefix; your container sees the original sub-path. |
| 8090 | Control plane | Web UI and introspection APIs (/v1/agents, /v1/agents/{ns}/{name}/mcp/tools, /v1/agents/{ns}/{name}/status). Operator tooling only; never the path your clients use to invoke an agent. |

Rule of thumb: if it’s a normal A2A / MCP / HTTP client, point it at :8080. If it’s a browser or kubectl-adjacent tool, :8090.
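The prefix rule for the gateway can be sketched in a few lines of Python. This only illustrates the path mapping described above and is not Krypton's gateway code; the helper names `gateway_url` and `container_path` and the `localhost` default are assumptions made for this sketch.

```python
GATEWAY_PREFIX = "/v1/agents"


def gateway_url(namespace: str, name: str, sub_path: str = "/",
                host: str = "localhost", port: int = 8080) -> str:
    """Build the client-facing URL for an agent behind the gateway (:8080)."""
    if not sub_path.startswith("/"):
        sub_path = "/" + sub_path
    return f"http://{host}:{port}{GATEWAY_PREFIX}/{namespace}/{name}{sub_path}"


def container_path(request_path: str) -> str:
    """Model the prefix strip: return the sub-path the container would see."""
    # Expected shape: ["", "v1", "agents", ns, name, rest-of-path]
    parts = request_path.split("/", 5)
    if (len(parts) < 5 or parts[1:3] != ["v1", "agents"]
            or not parts[3] or not parts[4]):
        raise ValueError(f"not an agent route: {request_path}")
    rest = parts[5] if len(parts) > 5 else ""
    return "/" + rest
```

With these helpers, `gateway_url("agents", "adk", "/.well-known/agent-card.json")` yields the card URL a client would call, and `container_path` applied to that URL's path gives `/.well-known/agent-card.json`, the path the container itself has to serve.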

1. Pick a starting point

Four ready-to-build A2A agents live under examples/agent. Start with the no-LLM one to confirm your install works, then graduate to a framework example when you’re ready to plug in your own logic.

| Path | LLM? | Framework | Source |
|------|------|-----------|--------|
| examples/agent/python/helloworld | None | Bare a2a-sdk | a2a-samples/helloworld |
| examples/agent/go | Gemini | Google ADK-Go | adk.dev quickstart |
| examples/agent/python/adk | Gemini | Google ADK (Python) | a2a-samples/adk_facts |
| examples/agent/python/langgraph | Gemini | LangGraph | a2a-samples/langgraph |

The helloworld agent needs no API key and no Secret — it just echoes whatever the client sends.

The container contract is the same for all of them:

  • Listen on an HTTP port (spec.port, defaults to 8080)
  • Serve A2A JSON-RPC at spec.invocationPath (defaults to /)
  • Expose the agent card at /.well-known/agent-card.json
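To make the contract concrete, here is a minimal sketch of a container process that satisfies all three points using only the Python standard library. It is not how the shipped examples are built (they use a2a-sdk or a framework); the card fields are placeholders rather than a complete A2A card, and the echoed reply shape is a simplified assumption. The names `AGENT_CARD`, `handle_rpc`, `AgentHandler`, and `make_server` are hypothetical.

```python
import json
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Placeholder card: illustrative values only, not a complete A2A agent card.
AGENT_CARD = {
    "name": "echo-sketch",
    "description": "Echoes the text it receives.",
    "version": "0.0.1",
}


def handle_rpc(request: object) -> dict:
    """Answer a JSON-RPC message/send call by echoing the first text part."""
    if not isinstance(request, dict):
        return {"jsonrpc": "2.0", "id": None,
                "error": {"code": -32600, "message": "Invalid Request"}}
    if request.get("method") != "message/send":
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    parts = request.get("params", {}).get("message", {}).get("parts", [])
    text = next((p.get("text", "") for p in parts if p.get("kind") == "text"), "")
    # Assumed, simplified reply shape: a single agent message with one text part.
    reply = {"kind": "message", "role": "agent", "messageId": str(uuid.uuid4()),
             "parts": [{"kind": "text", "text": text}]}
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": reply}


class AgentHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        # Contract point 3: agent card at the well-known path.
        if self.path == "/.well-known/agent-card.json":
            self._send_json(200, AGENT_CARD)
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        # Contract point 2: JSON-RPC at the invocation path (default "/").
        if self.path != "/":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self._send_json(400, {"jsonrpc": "2.0", "id": None,
                                  "error": {"code": -32700, "message": "Parse error"}})
            return
        self._send_json(200, handle_rpc(request))


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Contract point 1: listen on an HTTP port (8080 matches the spec.port default)."""
    return ThreadingHTTPServer(("0.0.0.0", port), AgentHandler)
```

Calling `make_server().serve_forever()` as the container entrypoint would start the listener. Because the gateway strips its prefix, the handler only ever needs to match the bare sub-paths.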

2. Build and load

Each example has its own Dockerfile and agent.yaml. The ADK Python flavour, for instance:

docker build -f examples/agent/python/adk/Dockerfile \
  -t krypton/adk-agent:dev examples/agent/python/adk
kind load docker-image --name krypton-dev krypton/adk-agent:dev

The two LLM-backed Python samples (adk, langgraph) need a GOOGLE_API_KEY Secret in the agents namespace; see each example’s README for the exact kubectl create secret call.

3. Apply the Agent CR

kubectl apply -f examples/agent/python/adk/agent.yaml

The shipped agent.yaml files look like this (ADK Python shown):

apiVersion: krypton.ai/v1alpha1
kind: Agent
metadata:
  name: adk
  namespace: agents
spec:
  image: krypton/adk-agent:dev
  imagePullPolicy: IfNotPresent
  runtime: python
  framework: google-adk
  protocol: a2a
  mode: always-on
  minReplicas: 1
  maxReplicas: 3
  concurrency: 4
  port: 8080
  invocationPath: /
  env:
    - name: GOOGLE_API_KEY
      valueFrom:
        secretKeyRef: { name: adk-secrets, key: GOOGLE_API_KEY }
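If you would rather generate the manifest than copy it, the same structure can be built as a plain dictionary and dumped as JSON, which kubectl also accepts. The sketch below uses only the fields that appear in the shipped agent.yaml; the function name `agent_manifest` and the replica sanity check are assumptions of this example, not Krypton's own validation.

```python
import json


def agent_manifest(name, image, *, namespace="agents", runtime="python",
                   framework="google-adk", protocol="a2a", mode="always-on",
                   min_replicas=1, max_replicas=3, concurrency=4,
                   port=8080, invocation_path="/", env=None):
    """Build an Agent manifest dict mirroring the shipped agent.yaml fields."""
    # Local sanity check for this sketch only.
    if min_replicas > max_replicas:
        raise ValueError("minReplicas must not exceed maxReplicas")
    spec = {
        "image": image,
        "imagePullPolicy": "IfNotPresent",
        "runtime": runtime,
        "framework": framework,
        "protocol": protocol,
        "mode": mode,
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
        "concurrency": concurrency,
        "port": port,
        "invocationPath": invocation_path,
    }
    if env:
        spec["env"] = env
    return {
        "apiVersion": "krypton.ai/v1alpha1",
        "kind": "Agent",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; the output can be piped to `kubectl apply -f -`."""
    return json.dumps(manifest, indent=2)
```

For the ADK example, `agent_manifest("adk", "krypton/adk-agent:dev", env=[...])` with the secretKeyRef entry shown above reproduces the shipped manifest.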

4. Invoke

A2A agents expose their card at /.well-known/agent-card.json and accept JSON-RPC message/send calls at the invocation path. Through the gateway:

# Discover the agent card
curl http://localhost:8080/v1/agents/agents/adk/.well-known/agent-card.json

# Send a message
curl -X POST http://localhost:8080/v1/agents/agents/adk/ \
     -H 'Content-Type: application/json' \
     -d '{
       "jsonrpc": "2.0",
       "id": "1",
       "method": "message/send",
       "params": {
         "message": {
           "messageId": "1",
           "role": "user",
           "parts": [{"kind": "text", "text": "Tell me a fun fact about octopuses."}]
         }
       }
     }'
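The same call can be made from Python with just the standard library. This is a rough equivalent of the curl command, not an official client; the function names `message_send_payload` and `invoke` are made up for this sketch, and the URL assumes the gateway is reachable on localhost:8080 as in the curl example.

```python
import json
import urllib.request


def message_send_payload(text: str, request_id: str = "1",
                         message_id: str = "1") -> dict:
    """Build the JSON-RPC message/send body used in the curl example."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "messageId": message_id,
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }


def invoke(agent_url: str, text: str, timeout: float = 30.0) -> dict:
    """POST a message/send call to the agent's invocation URL and decode the reply."""
    body = json.dumps(message_send_payload(text)).encode("utf-8")
    req = urllib.request.Request(
        agent_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Usage would look like `invoke("http://localhost:8080/v1/agents/agents/adk/", "Tell me a fun fact about octopuses.")`. Note that `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, so callers that want to handle error statuses need to catch it.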

Always-on agents keep at least minReplicas pods warm, so calls are served by an already-running pod with no cold start.

Routing under the hood

%%{init: {"theme": "base", "flowchart": {"nodeSpacing": 60, "rankSpacing": 70, "diagramPadding": 24}, "themeVariables": {"fontFamily": "Inter, ui-sans-serif, system-ui, sans-serif", "primaryColor": "#eef2ff", "primaryTextColor": "#1f2937", "primaryBorderColor": "#6366f1", "lineColor": "#64748b", "secondaryColor": "#ecfeff", "tertiaryColor": "#f8fafc"}}}%%
flowchart LR
    client["Client"] --> gateway["Krypton gateway"]
    gateway --> proxy["krypton-proxy"]
    proxy --> container["Your container"]
    scaler["Scaler"] -. "in-flight" .-> proxy
    scaler --> status["Desired replicas"]

    classDef external fill:#f8fafc,stroke:#94a3b8,color:#0f172a;
    classDef traffic fill:#ecfeff,stroke:#0891b2,color:#164e63;
    classDef runtime fill:#f0fdf4,stroke:#16a34a,color:#14532d;
    classDef control fill:#eef2ff,stroke:#6366f1,color:#312e81;
    class client external;
    class gateway,proxy traffic;
    class container runtime;
    class scaler,status control;

The krypton-proxy sidecar enforces spec.concurrency per pod; requests over the cap get an HTTP 503 response with a Retry-After header. The scaler in the manager observes the in-flight request count per pod, computes ceil(inflight / concurrency), and writes status.desiredReplicas; the reconciler then scales the Deployment to match.
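The arithmetic is small enough to sketch. The code below is an illustration, not the manager's implementation, and it makes two assumptions the text does not state: that `inflight` in the formula is the sum across pods, and that the result is clamped to the minReplicas/maxReplicas range from the Agent spec. The function names are hypothetical.

```python
import math
from typing import Sequence


def over_cap(inflight_on_pod: int, concurrency: int) -> bool:
    """Sidecar view: would one more request exceed the per-pod cap (503)?"""
    return inflight_on_pod >= concurrency


def desired_replicas(inflight_per_pod: Sequence[int], concurrency: int,
                     min_replicas: int = 1, max_replicas: int = 3) -> int:
    """Scaler view: ceil(inflight / concurrency), clamped to the replica range."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    total = sum(inflight_per_pod)            # assumption: sum across pods
    wanted = math.ceil(total / concurrency)
    # Assumption: clamp to the spec's minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))
```

With the example spec (concurrency 4, 1 to 3 replicas), two pods carrying 4 and 1 in-flight requests give ceil(5 / 4) = 2 desired replicas, while an idle agent stays at the minimum of 1.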

Last modified May 27, 2026: Refine docs structure and README (bbcd2cf)