Whitepaper v0.12 now public — read the design

Redefining how compute runs.

Orchestration infrastructure for a new generation of workloads.

~50 MB node image · ~30 MB single-node control plane · < 10s node join · FSL-1.1-ALv2
Platform Engineering

Get out from under the stack you maintain.

Stop running etcd, cert-manager, an ingress controller, a service mesh, and a CNI plugin as five independent failure domains. Overdrive ships them as one binary, with three-node HA that fits in 80 MB of memory.

See the architecture
SRE & On-Call

Incidents that investigate themselves.

Every eBPF event carries cryptographic workload identity. The native SRE agent correlates alerts via SQL joins, attaches signed BPF probes to verify hypotheses, and proposes typed remediations through a graduated approval gate.

See the SRE agent
AI Engineering

Agents that can't exfiltrate what they don't have.

Prompt injection becomes structurally inert. The credential proxy holds the real keys. Domain allowlists run in-kernel via TC eBPF. BPF LSM blocks raw sockets. Security is enforced by infrastructure, not by the model's judgment.

See agent isolation

Standing on production-grade Rust primitives

aya-rs openraft wasmtime cloud-hypervisor rustls + kTLS regorus corrosion / cr-sqlite redb
Why now

Kubernetes was right for 2014.
It is not right for 2026.

Stable eBPF APIs, kernel TLS offload, production Rust systems libraries, and embeddable WASM only matured in the last two years. Overdrive is the orchestrator that becomes possible when all four exist at once.

01 / Own your primitives

Every dependency is a future incident.

No etcd. No Envoy. No SPIRE. No CNI. Every critical subsystem is built into the platform or is a standard Rust library. External processes you didn't write are operational liabilities — they get cut.

02 / The kernel is the dataplane

Userspace proxies become unnecessary.

Service routing, network policy, load balancing, mTLS, and telemetry happen at line rate in the kernel via aya-rs. No sidecar tax. No proxy reconfigurations. No tail-latency spikes from a userspace hop.

03 / Security is structural

mTLS isn't an option you remember to enable.

Every connection is wrapped in kTLS with a SPIFFE identity. Policy is enforced in-kernel by BPF LSM. A compromised workload, a misconfigured job, and a malicious dependency all hit the same walls.

By the numbers

Architecture decisions, measured at fleet scale.

Not micro-optimizations. These are direct consequences of the design — the kind of differences that turn three racks back into one.

~10×
Less control-plane RAM
~100 MB vs ~1 GB on Kubernetes
~100×
Less mTLS CPU overhead
kTLS in-kernel vs Envoy sidecar
~50×
Faster scheduling
< 100 ms vs 1–10 s on Kubernetes
2.3×
Workload density
~70% utilization vs ~30% baseline
<10s
Node join
vs 2–5 minutes on Kubernetes
~1ms
WASM cold start
Wasmtime warm pool, no Firecracker tax
Architecture

One binary. Any topology.

Control plane and node agent ship in a single Rust binary. Role is declared at bootstrap — single node, three-node HA, dedicated ingress tier, multi-region. No second installer. No second upgrade path.
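A sketch of what that bootstrap-time role declaration could look like as a config file — the file name and every key below are illustrative assumptions, not the shipped schema:

```toml
# node.toml — hypothetical bootstrap config; all key names are assumptions
[node]
name = "node-1"
role = "ha"                              # same binary: "single", "ha", or "worker"

[node.ha]
peers = ["node-2:7400", "node-3:7400"]   # three-node Raft control plane
```

The single-node case would be the same file with role = "single" and no peers list — no second installer, no second schema.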

┌────────────────────────────────────────────────────────────────────┐
                          CLI / API  (gRPC + REST, tonic)         
├────────────────────────────────────────────────────────────────────┤
     CONTROL PLANE  — co-located with node agent or dedicated     
                                                                    
   IntentStore        Reconcilers        Built-in CA  (SPIFFE)      
   single: redb       Rust traits        Scheduler                  
   ha: openraft+redb  WASM extensions    Regorus + WASM policy      
   per region                            DuckLake telemetry         
├────────────────────────────────────────────────────────────────────┤
     NODE AGENT                                                     
                                                                    
   ▸ aya-rs eBPF dataplane                                          
     XDP (routing/LB) · TC (egress) · sockops (mTLS)                
     BPF LSM (MAC) · kprobes (telemetry)                            
                                                                    
   ▸ Drivers:  process · microvm · vm · unikernel · wasm            
   ▸ Gateway:  hyper · rustls · ACMEv2 · in-process route table     
├────────────────────────────────────────────────────────────────────┤
   ObservationStore  (Corrosion — CR-SQLite + SWIM/QUIC gossip)   
   alloc status · service backends · node health · regional peers   
├────────────────────────────────────────────────────────────────────┤
             Object Storage  (Garage, S3-compatible)                
└────────────────────────────────────────────────────────────────────┘
Workload model

Five drivers. One identity. One policy.

VMs, processes, unikernels, containers, and WASM functions share the same SPIFFE identity, the same eBPF dataplane, the same policy engine. Pick the right primitive for the workload — not the only one your platform happens to support.

process

Native binaries

Daemons under cgroups v2, kernel-enforced isolation, zero VM overhead.

tokio::process
microvm

Fast-boot VMs

~200 ms cold start. Hardware isolation. Optional persistent rootfs.

cloud-hypervisor
vm

Full virtualisation

Live CPU and memory hotplug. virtiofs sharing. AArch64 first-class.

cloud-hypervisor
unikernel

Extreme density

Single-purpose images on a hypervisor. Minimal kernel surface.

unikraft + CH
wasm

Serverless functions

~1 ms cold start. Scale-to-zero. A sandbox the model can't talk its way out of.

wasmtime
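Because identity, dataplane, and policy are shared across drivers, switching primitives should in principle be a one-line change. A hedged sketch — the [job] and [job.security] fields mirror the job spec in the next section, while the [job.resources] keys are assumptions:

```toml
[job]
name   = "ai-research-agent"
driver = "microvm"              # was "wasm" — identity and policy carry over

# Resource keys here are illustrative assumptions
[job.resources]
cpus      = 2
memory_mb = 512

# Unchanged from the wasm variant: same walls, different primitive
[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
```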
Job spec

Declare what. The platform handles how.

A single TOML file composes drivers, sidecars, security profiles, and ingress. The same spec deploys to a single node and to a multi-region fleet — the platform absorbs the difference.

job.toml · ai-research-agent
# An AI agent with structural credential isolation
# and prompt-injection scanning on ingress.

[job]
name   = "ai-research-agent"
driver = "wasm"

[[job.sidecars]]
name    = "credential-proxy"
module  = "builtin:credential-proxy"
hooks   = ["egress"]

  [job.sidecars.config]
  allowed_domains = ["api.anthropic.com"]
  credentials     = { ANTHROPIC_KEY = { secret = "prod" } }

[[job.sidecars]]
name   = "content-inspector"
module = "builtin:content-inspector"
hooks  = ["ingress"]

[job.security]
no_raw_sockets          = true
no_privilege_escalation = true
egress.mode             = "intercepted"
~/$ overdrive
$ overdrive job submit job.toml
→ ai-research-agent · scheduled · alloc a1b2c3

$ overdrive alloc status a1b2c3
  state          : running
  driver         : wasm
  node           : node-04 (eu-west-1)
  identity       : spiffe://overdrive.local/
                 :   job/ai-research-agent
                 :   alloc/a1b2c3
  cert ttl       : 58m 12s
  sidecars       : 2 attached, healthy

$ overdrive cluster upgrade \
    --mode ha \
    --peers node-2,node-3
→ snapshot exported (LocalStore)
→ RaftStore bootstrapped on 3 peers
→ leader: node-1 · zero downtime
Native SRE Agent

An LLM that can reason about your cluster.

Every event in Overdrive carries cryptographic SPIFFE identity. Correlation isn't a label-matching heuristic — it's a SQL join. Investigations are first-class resources with a budget, a transcript, and a typed proposal at the end.

  • Investigations as a resource. Lifecycle, budget, transcript — compressed into incident memory on conclusion.
  • Typed remediation. Tier 0 reads auto-execute. Tier 2 writes wait for human ratification.
  • Hypothesis verification. Attach signed BPF probes for one investigation turn. No instrumentation rollout.
  • Deterministic replay. Investigation transcripts re-run in CI under the simulation harness.
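The graduated gate could plausibly be driven by configuration like the following — the tier semantics match the bullets above; the file shape and key names are assumptions:

```toml
# sre-agent.toml — hypothetical approval-gate config; key names are assumptions
[agent.budget]
max_llm_turns = 8
max_tokens    = 50_000          # per-investigation token budget

[agent.approval]
tier0 = "auto"                  # read-only tools execute immediately
tier1 = "auto"                  # bounded writes (e.g. ScaleJob) auto-apply
tier2 = "ratify"                # destructive writes wait for a human
```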
Read § 12 of the whitepaper
investigation_id   inv-7f3a92
trigger            alert · payments.p99 > 800ms
scope              job/payments · eu-west-1
tools_called       7
llm_turns          4
tokens_spent       12,840 / 50,000
probe_attached     tcp_retransmit_trace
diagnosis          backend pool exhausted
proposed           ScaleJob 3→6 · Tier 1 · auto
status             concluded · 47s
Multi-region by default

See your fleet as one cluster.

Per-region Raft for intent. Global CRDT gossip for observation. Each region keeps accepting writes through a partition. The dataplane never reads remote state.
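From the job's point of view, multi-region would surface as placement intent rather than topology plumbing. A hedged sketch — the [job.placement] stanza and its keys are assumptions, not a documented schema:

```toml
# Hypothetical placement stanza — key names are assumptions
[job.placement]
regions        = ["us-east-1", "eu-west-1", "ap-southeast-1"]
spread         = "even"         # balance allocations across regions
min_per_region = 2              # keep capacity through a regional partition
```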

us-east-1 healthy
nodes 42 / 42
allocations 2,184
raft leader node-04
gossip lag p99 214 ms
egress mTLS 100%
eu-west-1 healthy
nodes 38 / 38
allocations 1,902
raft leader node-12
gossip lag p99 187 ms
egress mTLS 100%
ap-southeast-1 healthy
nodes 21 / 21
allocations 964
raft leader node-03
gossip lag p99 412 ms
egress mTLS 100%
3 regions · 101 nodes · 5,050 allocations · last gossip tick 2.1s ago
Compare

Coherent by construction, not by configuration.

When the dataplane, the identity model, the telemetry pipeline, and the service mesh all emerge from the same kernel primitive with the same workload identity attached, you stop gluing together products that were never designed to know about each other.

Component           Kubernetes                            Overdrive
Service routing     iptables · O(n) per packet            XDP BPF · O(1) in-kernel
mTLS                Envoy sidecar · ~0.5 vCPU each        kTLS · NIC offload · ~0 overhead
Control-plane RAM   ~1 GB                                 ~30–80 MB
Network policy      Per-packet iptables walk              BPF map lookup
Workload types      Containers                            Process · microVM · VM · unikernel · WASM
Observability       Scraped logs & Prometheus             Kernel-native · identity-tagged
Multi-region        Stretched Raft or federation plane    Per-region Raft + global CRDT gossip
Extension model     Go operators with cluster-admin       WASM · sandboxed · hot-reloadable
Node join           2–5 minutes                           < 10 seconds
Cloud platform

Run your own. Or let us run it.

The source-available core is FSL-1.1-ALv2 and ships in one binary. Every release converts to Apache 2.0 two years after publication. The cloud platform sells the operational complexity we already absorbed for ourselves — metered exactly, by kernel telemetry. No estimation. No sampling.

Tier 1

Managed Overdrive

A full Overdrive cluster as a service. Control plane, worker pool, and CLI — the way you'd run it yourself, only without running it.

per vCPU-hour + GB-hour metered at allocation level
Tier 3

Serverless WASM

Sub-10 ms cold start. Scale-to-zero. The credential proxy and content inspector ship by default — built for the AI agent workloads no one else has a story for.

per invocation + GB-second minimum unit: one invocation
Tier 4

Bare Metal Dedicated

Dedicated physical nodes inside the platform. Full hardware performance. Full Overdrive operational stack. No VM overhead between you and the silicon.

per node-hour reserved capacity discounts

Enterprise self-hosted licensing — FIPS crypto, HSM integration, air-gap tooling, DORA / NIS2 / SOC2 / HIPAA policy packs — available alongside the source-available release.

Source-available · FSL-1.1-ALv2 · Apache 2.0 after 2 years

Reimagining the foundation of modern infrastructure.

One binary. Every workload type. Built for the kernel of 2026, not the cluster of 2014.