Whitepaper v0.12 now public — read the design

Redefining how compute runs.

Orchestration infrastructure for a new generation of workloads.

~50 MB node image · ~30 MB single-node control plane · < 10s node join · FSL-1.1-ALv2
Platform Engineering

Get out from under the stack you maintain.

Stop running etcd, cert-manager, an ingress controller, a service mesh, and a CNI plugin as five independent failure domains. Overdrive ships them as one binary, with three-node HA that fits in 80 MB of memory.

See the architecture
SRE & On-Call

Incidents that investigate themselves.

Every eBPF event carries cryptographic workload identity. The native SRE agent correlates alerts via SQL joins, attaches signed BPF probes to verify hypotheses, and proposes typed remediations through a graduated approval gate.

See the SRE agent
AI Engineering

Agents that can't exfiltrate what they don't have.

Prompt injection becomes structurally inert. The credential proxy holds the real keys. Domain allowlists run in-kernel via TC eBPF. BPF LSM blocks raw sockets. Security is enforced by infrastructure, not by the model's judgment.

See agent isolation

Standing on production-grade Rust primitives

aya-rs openraft wasmtime cloud-hypervisor rustls + kTLS regorus corrosion / cr-sqlite redb
Why now

Kubernetes was right for 2014.
It is not right for 2026.

Stable eBPF APIs, kernel TLS offload, production Rust systems libraries, and embeddable WASM only matured in the last two years. Overdrive is the orchestrator that becomes possible when all four exist at once.

01 / Own your primitives

Every dependency is a future incident.

No etcd. No Envoy. No SPIRE. No CNI. Every critical subsystem is built into the platform or is a standard Rust library. External processes you didn't write are operational liabilities — they get cut.

02 / The kernel is the dataplane

Userspace proxies become unnecessary.

Service routing, network policy, load balancing, mTLS, and telemetry happen at line rate in the kernel via aya-rs. No sidecar tax. No proxy reconfigurations. No tail-latency spikes from a userspace hop.

03 / Security is structural

mTLS isn't an option you remember to enable.

Every connection is wrapped in kTLS with a SPIFFE identity. Policy is enforced in-kernel by BPF LSM. A compromised workload, a misconfigured job, and a malicious dependency all hit the same walls.

By the numbers

Architecture decisions, measured at fleet scale.

Not micro-optimizations. These are direct consequences of the design — the kind of differences that turn three racks back into one.

~10×
Less control-plane RAM
~100 MB vs ~1 GB on Kubernetes
~100×
Less mTLS CPU overhead
kTLS in-kernel vs Envoy sidecar
~50×
Faster scheduling
< 100 ms vs 1–10 s on Kubernetes
2.3×
Workload density
~70% utilization vs ~30% baseline
<10s
Node join
vs 2–5 minutes on Kubernetes
~1ms
WASM cold start
Wasmtime warm pool, no Firecracker tax
Architecture

One binary. Any topology.

Control plane and node agent ship in a single Rust binary. Role is declared at bootstrap — single node, three-node HA, dedicated ingress tier, multi-region. No second installer. No second upgrade path.
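A sketch of what that bootstrap-time role declaration could look like as a config file — the file name and every key below are illustrative assumptions, not the shipped schema:

```toml
# node.toml — hypothetical bootstrap config; all key names are assumptions
[node]
name = "node-1"
role = "ha"                              # same binary: "single", "ha", or "worker"

[node.ha]
peers = ["node-2:7400", "node-3:7400"]   # three-node Raft control plane
```

The single-node case would be the same file with role = "single" and no peers list — no second installer, no second schema.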

┌────────────────────────────────────────────────────────────────────┐
                          CLI / API  (gRPC + REST, tonic)         
├────────────────────────────────────────────────────────────────────┤
     CONTROL PLANE  — co-located with node agent or dedicated     
                                                                    
   IntentStore        Reconcilers        Built-in CA  (SPIFFE)      
   single: redb       Rust traits        Scheduler                  
   ha: openraft+redb  WASM extensions    Regorus + WASM policy      
   per region                            DuckLake telemetry         
├────────────────────────────────────────────────────────────────────┤
     NODE AGENT                                                     
                                                                    
   ▸ aya-rs eBPF dataplane                                          
     XDP (routing/LB) · TC (egress) · sockops (mTLS)                
     BPF LSM (MAC) · kprobes (telemetry)                            
                                                                    
   ▸ Drivers:  process · microvm · vm · unikernel · wasm            
   ▸ Gateway:  hyper · rustls · ACMEv2 · in-process route table     
├────────────────────────────────────────────────────────────────────┤
   ObservationStore  (Corrosion — CR-SQLite + SWIM/QUIC gossip)   
   alloc status · service backends · node health · regional peers   
├────────────────────────────────────────────────────────────────────┤
             Object Storage  (Garage, S3-compatible)                
└────────────────────────────────────────────────────────────────────┘
Workload model

Five drivers. One identity. One policy.

VMs, processes, unikernels, containers, and WASM functions share the same SPIFFE identity, the same eBPF dataplane, the same policy engine. Pick the right primitive for the workload — not the only one your platform happens to support.

process

Native binaries

Daemons under cgroups v2, kernel-enforced isolation, zero VM overhead.

tokio::process
microvm

Fast-boot VMs

~200 ms cold start. Hardware isolation. Optional persistent rootfs.

cloud-hypervisor
vm

Full virtualisation

Live CPU and memory hotplug. virtiofs sharing. AArch64 first-class.

cloud-hypervisor
unikernel

Extreme density

Single-purpose images on a hypervisor. Minimal kernel surface.

unikraft + CH
wasm

Serverless functions

~1 ms cold start. Scale-to-zero. A sandbox the model can't talk its way out of.

wasmtime
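Because identity, dataplane, and policy are shared across drivers, switching primitives should in principle be a one-line change. A hedged sketch — the [job] and [job.security] fields mirror the job spec in the next section, while the [job.resources] keys are assumptions:

```toml
[job]
name   = "ai-research-agent"
driver = "microvm"              # was "wasm" — identity and policy carry over

# Resource keys here are illustrative assumptions
[job.resources]
cpus      = 2
memory_mb = 512

# Unchanged from the wasm variant: same walls, different primitive
[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
```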
Job spec

Declare what. The platform handles how.

A single TOML file composes drivers, sidecars, security profiles, and ingress. The same spec deploys to a single node and to a multi-region fleet — the platform absorbs the difference.

job.toml · ai-research-agent
# An AI agent with structural credential isolation
# and prompt-injection scanning on ingress.

[job]
name   = "ai-research-agent"
driver = "wasm"

[[job.sidecars]]
name    = "credential-proxy"
module  = "builtin:credential-proxy"
hooks   = ["egress"]

  [job.sidecars.config]
  allowed_domains = ["api.anthropic.com"]
  credentials     = { ANTHROPIC_KEY = { secret = "prod" } }

[[job.sidecars]]
name   = "content-inspector"
module = "builtin:content-inspector"
hooks  = ["ingress"]

[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
~/$ overdrive
$ overdrive job submit job.toml
→ ai-research-agent · scheduled · alloc a1b2c3

$ overdrive alloc status a1b2c3
  state          : running
  driver         : wasm
  node           : node-04 (eu-west-1)
  identity       : spiffe://overdrive.local/
                 :   job/ai-research-agent
                 :   alloc/a1b2c3
  cert ttl       : 58m 12s
  sidecars       : 2 attached, healthy

$ overdrive cluster upgrade \
    --mode ha \
    --peers node-2,node-3
→ snapshot exported (LocalStore)
→ RaftStore bootstrapped on 3 peers
→ leader: node-1 · zero downtime
Native SRE Agent

An LLM that can reason about your cluster.

Every event in Overdrive carries cryptographic SPIFFE identity. Correlation isn't a label-matching heuristic — it's a SQL join. Investigations are first-class resources with a budget, a transcript, and a typed proposal at the end.

  • Investigations as a resource. Lifecycle, budget, transcript — compressed into incident memory on conclusion.
  • Typed remediation. Tier 0 reads auto-execute. Tier 2 writes wait for human ratification.
  • Hypothesis verification. Attach signed BPF probes for one investigation turn. No instrumentation rollout.
  • Deterministic replay. Investigation transcripts re-run in CI under the simulation harness.
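The graduated gate could plausibly be driven by configuration like the following — the tier semantics match the bullets above; the file shape and key names are assumptions:

```toml
# sre-agent.toml — hypothetical approval-gate config; key names are assumptions
[agent.budget]
max_llm_turns = 8
max_tokens    = 50_000          # per-investigation token budget

[agent.approval]
tier0 = "auto"                  # read-only tools execute immediately
tier1 = "auto"                  # bounded writes (e.g. ScaleJob) auto-apply
tier2 = "ratify"                # destructive writes wait for a human
```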
Read § 12 of the whitepaper
investigation_id   inv-7f3a92
trigger            alert · payments.p99 > 800ms
scope              job/payments · eu-west-1
tools_called       7
llm_turns          4
tokens_spent       12,840 / 50,000
probe_attached     tcp_retransmit_trace
diagnosis          backend pool exhausted
proposed           ScaleJob 3→6 · Tier 1 · auto
status             concluded · 47s
Multi-region by default

See your fleet as one cluster.

Per-region Raft for intent. Global CRDT gossip for observation. Each region keeps accepting writes through a partition. The dataplane never reads remote state.
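From the job's point of view, multi-region would surface as placement intent rather than topology plumbing. A hedged sketch — the [job.placement] stanza and its keys are assumptions, not a documented schema:

```toml
# Hypothetical placement stanza — key names are assumptions
[job.placement]
regions        = ["us-east-1", "eu-west-1", "ap-southeast-1"]
spread         = "even"         # balance allocations across regions
min_per_region = 2              # keep capacity through a regional partition
```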

us-east-1 healthy
nodes 42 / 42
allocations 2,184
raft leader node-04
gossip lag p99 214 ms
egress mTLS 100%
eu-west-1 healthy
nodes 38 / 38
allocations 1,902
raft leader node-12
gossip lag p99 187 ms
egress mTLS 100%
ap-southeast-1 healthy
nodes 21 / 21
allocations 964
raft leader node-03
gossip lag p99 412 ms
egress mTLS 100%
3 regions · 101 nodes · 5,050 allocations · last gossip tick 2.1s ago
Compare

Coherent by construction, not by configuration.

When the dataplane, the identity model, the telemetry pipeline, and the service mesh all emerge from the same kernel primitive with the same workload identity attached, you stop gluing together products that were never designed to know about each other.

Component           Kubernetes                            Overdrive
Service routing     iptables · O(n) per packet            XDP BPF · O(1) in-kernel
mTLS                Envoy sidecar · ~0.5 vCPU each        kTLS · NIC offload · ~0 overhead
Control-plane RAM   ~1 GB                                 ~30–80 MB
Network policy      Per-packet iptables walk              BPF map lookup
Workload types      Containers                            Process · microVM · VM · unikernel · WASM
Observability       Scraped logs & Prometheus             Kernel-native · identity-tagged
Multi-region        Stretched Raft or federation plane    Per-region Raft + global CRDT gossip
Extension model     Go operators with cluster-admin       WASM · sandboxed · hot-reloadable
Node join           2–5 minutes                           < 10 seconds
Cloud platform

Run your own. Or let us run it.

The source-available core is FSL-1.1-ALv2 and ships in one binary. Every release converts to Apache 2.0 two years after publication. The cloud platform sells the operational complexity we already absorbed for ourselves — metered exactly, by kernel telemetry. No estimation. No sampling.

Tier 1

Managed Overdrive

A full Overdrive cluster as a service. Control plane, worker pool, and CLI — the way you'd run it yourself, only without running it.

per vCPU-hour + GB-hour metered at allocation level
Tier 3

Serverless WASM

Sub-10 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — built for the AI agent workloads no one else has a story for.

per invocation + GB-second minimum unit: one invocation
Tier 4

Bare Metal Dedicated

Dedicated physical nodes inside the platform. Full hardware performance. Full Overdrive operational stack. No VM overhead between you and the silicon.

per node-hour reserved capacity discounts

Enterprise self-hosted licensing — FIPS crypto, HSM integration, air-gap tooling, DORA / NIS2 / SOC2 / HIPAA policy packs — available alongside the source-available release.

Source-available · FSL-1.1-ALv2 · Apache 2.0 after 2 years

Reimagining the foundation of modern infrastructure.

One binary. Every workload type. Built for the kernel of 2026, not the cluster of 2014.