Concepts

The mental model for Krypton’s resources, routing, and runtime components.

Start here when you want to understand how Krypton fits together before changing production settings.

Read in order

  1. Architecture explains the resource model, the four runtime components, and where state lives.
  2. Components goes deeper on the manager, control plane, gateway, sidecar, scaler, and model controller.
  3. Request lifecycle follows a request from the client through the gateway, sidecar, user container, scaler, and back.

Core ideas

IdeaWhy it matters
Kubernetes is the source of truthAgent and Model resources describe desired state; controllers own Deployments, Services, and status.
Gateway traffic is separate from operator trafficClients call the gateway on :8080; the control plane UI and introspection APIs live on :8090.
Workloads stay ordinary containersKrypton adds routing, lifecycle, scaling signals, and observability around your container without requiring a framework rewrite.
Models use OpenAI-compatible pathsApplications keep the familiar /v1/models and /v1/chat/completions API shape while operators manage in-cluster model pods.

Architecture

Components and state model.

Components

Per-component deep dive.

Request lifecycle

What happens between curl and JSON.

Last modified May 27, 2026: Refine docs structure and README (bbcd2cf)