<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Concepts on Krypton Runtime</title><link>https://www.kryptonhq.com/docs/concepts/</link><description>Recent content in Concepts on Krypton Runtime</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://www.kryptonhq.com/docs/concepts/index.xml" rel="self" type="application/rss+xml"/><item><title>Architecture</title><link>https://www.kryptonhq.com/docs/concepts/architecture/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://www.kryptonhq.com/docs/concepts/architecture/</guid><description>&lt;p&gt;Krypton is composed of four binaries:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Component&lt;/th&gt;
 &lt;th&gt;Role&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Manager&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Kubernetes operator. Reconciles &lt;code&gt;Agent&lt;/code&gt; CRs → Deployments + Services + ServiceAccounts. Runs the scaling decider.&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Control plane&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Read-only HTTP API + the operator UI. Optionally mirrors agents into Postgres for offline querying.&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Gateway&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Public ingress. Reverse-proxies invocations to the agent&amp;rsquo;s in-cluster Service.&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Sidecar&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Per-pod &lt;code&gt;krypton-proxy&lt;/code&gt;. Enforces concurrency, surfaces in-flight count, exposes Prometheus metrics.&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="high-level-diagram"&gt;High-level diagram&lt;/h2&gt;
&lt;pre class="mermaid"&gt;%%{init: {&amp;#34;theme&amp;#34;: &amp;#34;base&amp;#34;, &amp;#34;flowchart&amp;#34;: {&amp;#34;nodeSpacing&amp;#34;: 70, &amp;#34;rankSpacing&amp;#34;: 80, &amp;#34;diagramPadding&amp;#34;: 24}, &amp;#34;themeVariables&amp;#34;: {&amp;#34;fontFamily&amp;#34;: &amp;#34;Inter, ui-sans-serif, system-ui, sans-serif&amp;#34;, &amp;#34;primaryColor&amp;#34;: &amp;#34;#eef2ff&amp;#34;, &amp;#34;primaryTextColor&amp;#34;: &amp;#34;#1f2937&amp;#34;, &amp;#34;primaryBorderColor&amp;#34;: &amp;#34;#6366f1&amp;#34;, &amp;#34;lineColor&amp;#34;: &amp;#34;#64748b&amp;#34;, &amp;#34;secondaryColor&amp;#34;: &amp;#34;#ecfeff&amp;#34;, &amp;#34;tertiaryColor&amp;#34;: &amp;#34;#f8fafc&amp;#34;}}}%%
flowchart TB
 client[&amp;#34;Client&amp;#34;]
 ui[&amp;#34;Krypton UI&amp;lt;br/&amp;gt;Operator console&amp;#34;]
 cp[&amp;#34;Control plane&amp;lt;br/&amp;gt;REST API and cache&amp;#34;]
 gw[&amp;#34;Gateway&amp;lt;br/&amp;gt;Ingress and activator&amp;#34;]
 mgr[&amp;#34;Manager&amp;lt;br/&amp;gt;Controller&amp;#34;]
 scaler[&amp;#34;Scaler&amp;lt;br/&amp;gt;Replica decisions&amp;#34;]

 subgraph pod[&amp;#34;Agent pod&amp;#34;]
 proxy[&amp;#34;krypton-proxy&amp;lt;br/&amp;gt;Concurrency and metrics&amp;#34;]
 app[&amp;#34;User agent container&amp;#34;]
 proxy --&amp;gt;|&amp;#34;proxy&amp;#34;| app
 end

 ui --&amp;gt;|&amp;#34;REST&amp;#34;| cp
 client --&amp;gt;|&amp;#34;invoke&amp;#34;| gw
 gw --&amp;gt;|&amp;#34;proxy request&amp;#34;| proxy
 cp --&amp;gt;|&amp;#34;watch&amp;#34;| mgr
 mgr --&amp;gt;|&amp;#34;owns&amp;#34;| pod
 scaler --&amp;gt;|&amp;#34;scale&amp;#34;| pod
 scaler -. &amp;#34;in-flight&amp;#34; .-&amp;gt; proxy

 classDef external fill:#f8fafc,stroke:#94a3b8,color:#0f172a;
 classDef control fill:#eef2ff,stroke:#6366f1,color:#312e81;
 classDef traffic fill:#ecfeff,stroke:#0891b2,color:#164e63;
 classDef runtime fill:#f0fdf4,stroke:#16a34a,color:#14532d;
 classDef podbox fill:#ffffff,stroke:#cbd5e1,color:#0f172a;
 class client external;
 class ui,cp,mgr,scaler control;
 class gw,proxy traffic;
 class app runtime;
 class pod podbox;&lt;/pre&gt;
&lt;h2 id="where-state-lives"&gt;Where state lives&lt;/h2&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;State&lt;/th&gt;
 &lt;th&gt;Source of truth&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Agent desired spec&lt;/td&gt;
 &lt;td&gt;The &lt;code&gt;Agent&lt;/code&gt; CR (Kubernetes etcd)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;status.phase&lt;/code&gt;, &lt;code&gt;replicas&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Manager writes; readers consume&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;status.desiredReplicas&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Scaler (in manager)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;code&gt;status.lastInvocationAt&lt;/code&gt;&lt;/td&gt;
 &lt;td&gt;Gateway writes after each invocation&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;In-flight count&lt;/td&gt;
 &lt;td&gt;Sidecar&amp;rsquo;s &lt;code&gt;/_krypton/inflight&lt;/code&gt; endpoint&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Invocation history (later)&lt;/td&gt;
 &lt;td&gt;Postgres&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;CRs are the source of truth.&lt;/strong&gt; Postgres is only a write-through mirror:
the API serves reads directly from the informer cache, which is fresher and avoids a database hop.&lt;/p&gt;</description></item><item><title>Components</title><link>https://www.kryptonhq.com/docs/concepts/components/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://www.kryptonhq.com/docs/concepts/components/</guid><description>&lt;h2 id="manager"&gt;Manager&lt;/h2&gt;
&lt;p&gt;Source: &lt;a href="https://github.com/kryptonhq/runtime/tree/main/cmd/manager"&gt;&lt;code&gt;cmd/manager&lt;/code&gt;&lt;/a&gt; +
&lt;a href="https://github.com/kryptonhq/runtime/tree/main/internal/controller"&gt;&lt;code&gt;internal/controller&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Standard controller-runtime operator. Reconciles &lt;code&gt;Agent&lt;/code&gt; CRs by owning
three child resources per agent: a &lt;code&gt;Deployment&lt;/code&gt;, a &lt;code&gt;Service&lt;/code&gt;, and a
&lt;code&gt;ServiceAccount&lt;/code&gt;. The Deployment is injected with the &lt;code&gt;krypton-proxy&lt;/code&gt;
sidecar at template-render time.&lt;/p&gt;
&lt;p&gt;Also runs the &lt;a href="#scaler"&gt;scaling decider&lt;/a&gt; as a &lt;code&gt;manager.Runnable&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Key behavior:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;CreateOrUpdate&lt;/code&gt; semantics&lt;/strong&gt;: each child resource uses
&lt;code&gt;controllerutil.CreateOrUpdate&lt;/code&gt; wrapped in &lt;code&gt;retry.RetryOnConflict&lt;/code&gt; so
spec drift converges without hot-looping when the apps controller
concurrently updates Deployment status.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Status writes use &lt;code&gt;Patch&lt;/code&gt; with &lt;code&gt;MergeFrom&lt;/code&gt;&lt;/strong&gt;, not &lt;code&gt;Update&lt;/code&gt;, so they
don&amp;rsquo;t conflict with the scaler/gateway&amp;rsquo;s concurrent writes to other
status fields.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Finalizer&lt;/strong&gt; &lt;code&gt;krypton.ai/cleanup&lt;/code&gt; blocks deletion until child
resources have drained.&lt;/li&gt;
&lt;/ul&gt;
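&lt;p&gt;Why patching avoids those conflicts can be illustrated with a simplified, standard-library model of a merge patch. This is only a sketch, not the manager code: real status writes go through controller-runtime, and the &lt;code&gt;diff&lt;/code&gt; and &lt;code&gt;apply&lt;/code&gt; helpers and the phase values below are hypothetical. It shows a patch that carries only the field the writer changed, so a concurrent write to another status field survives.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"reflect"
)

// diff returns only the top-level keys whose values differ between base and
// modified. It is a simplified stand-in for a JSON merge patch: it ignores
// deletions and nested objects.
func diff(base, modified map[string]any) map[string]any {
	patch := map[string]any{}
	for k, v := range modified {
		if old, ok := base[k]; !ok || !reflect.DeepEqual(old, v) {
			patch[k] = v
		}
	}
	return patch
}

// apply overlays the patch onto the server copy, leaving other keys untouched.
func apply(server, patch map[string]any) {
	for k, v := range patch {
		server[k] = v
	}
}

func main() {
	// Snapshot the manager read before reconciling (values are hypothetical).
	base := map[string]any{"phase": "Pending", "desiredReplicas": 1}
	// Meanwhile, the scaler has raised desiredReplicas on the server.
	server := map[string]any{"phase": "Pending", "desiredReplicas": 3}
	// The manager changes only phase in its local copy.
	modified := map[string]any{"phase": "Ready", "desiredReplicas": 1}

	patch := diff(base, modified)
	apply(server, patch)
	fmt.Println("patch:", patch)
	fmt.Println("server:", server)
}
```

&lt;p&gt;In this model the patch contains only &lt;code&gt;phase&lt;/code&gt;, so the server copy ends with the new phase while keeping the scaler value of 3 for &lt;code&gt;desiredReplicas&lt;/code&gt;. A full update from the stale snapshot would have reset it to 1 or been rejected as a conflict.&lt;/p&gt;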
&lt;h2 id="control-plane"&gt;Control plane&lt;/h2&gt;
&lt;p&gt;Source: &lt;a href="https://github.com/kryptonhq/runtime/tree/main/cmd/control-plane"&gt;&lt;code&gt;cmd/control-plane&lt;/code&gt;&lt;/a&gt; +
&lt;a href="https://github.com/kryptonhq/runtime/tree/main/internal/controlplane"&gt;&lt;code&gt;internal/controlplane&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Request lifecycle</title><link>https://www.kryptonhq.com/docs/concepts/request-lifecycle/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://www.kryptonhq.com/docs/concepts/request-lifecycle/</guid><description>&lt;p&gt;This page walks through what happens between &lt;code&gt;curl -X POST .../invocations&lt;/code&gt;
and the JSON response coming back.&lt;/p&gt;
&lt;h2 id="hot-path-pod-already-running"&gt;Hot path (pod already running)&lt;/h2&gt;
&lt;pre class="mermaid"&gt;%%{init: {&amp;#34;theme&amp;#34;: &amp;#34;base&amp;#34;, &amp;#34;themeVariables&amp;#34;: {&amp;#34;fontFamily&amp;#34;: &amp;#34;Inter, ui-sans-serif, system-ui, sans-serif&amp;#34;, &amp;#34;primaryColor&amp;#34;: &amp;#34;#eef2ff&amp;#34;, &amp;#34;primaryTextColor&amp;#34;: &amp;#34;#1f2937&amp;#34;, &amp;#34;primaryBorderColor&amp;#34;: &amp;#34;#6366f1&amp;#34;, &amp;#34;lineColor&amp;#34;: &amp;#34;#64748b&amp;#34;, &amp;#34;secondaryColor&amp;#34;: &amp;#34;#ecfeff&amp;#34;, &amp;#34;tertiaryColor&amp;#34;: &amp;#34;#f8fafc&amp;#34;}}}%%
sequenceDiagram
 autonumber
 participant Client
 participant Gateway
 participant Cache as Informer cache
 participant KubeProxy as kube-proxy
 participant Sidecar as krypton-proxy
 participant Agent as User container
 participant Status as Agent status

 Client-&amp;gt;&amp;gt;Gateway: POST /v1/agents/agents/echo/foo
 Gateway-&amp;gt;&amp;gt;Cache: Resolve Agent and ready Endpoints
 Cache--&amp;gt;&amp;gt;Gateway: echo.agents.svc:8080
 Gateway-&amp;gt;&amp;gt;Gateway: Strip prefix, preserve traceparent, enable streaming flush
 Gateway-&amp;gt;&amp;gt;KubeProxy: Proxy /foo
 KubeProxy-&amp;gt;&amp;gt;Sidecar: Route to ready Endpoint
 Sidecar-&amp;gt;&amp;gt;Sidecar: Check shutdown and acquire concurrency slot
 alt capacity available
 Sidecar-&amp;gt;&amp;gt;Agent: Reverse proxy to 127.0.0.1:&amp;lt;spec.port&amp;gt;
 Agent--&amp;gt;&amp;gt;Sidecar: Streaming response
 Sidecar--&amp;gt;&amp;gt;Gateway: Release slot and forward response
 Gateway--&amp;gt;&amp;gt;Client: Response stream
 Gateway-&amp;gt;&amp;gt;Status: Patch lastInvocationAt asynchronously
 else concurrency cap reached
 Sidecar--&amp;gt;&amp;gt;Gateway: 503 with Retry-After
 Gateway--&amp;gt;&amp;gt;Client: 503 with Retry-After
 end&lt;/pre&gt;
&lt;p&gt;Typical latency: P50 ~50ms, P95 ~200ms for a 100ms user-handler.&lt;/p&gt;</description></item></channel></rss>