SDK Driver

The SDK driver is the tier that turns Leopold from "a better loop" into a real harness you brief and walk away from. It is an external Node process built on the Claude Agent SDK.

The core idea: persistent conductor, fresh workers

flowchart TB
    subgraph Driver["leopold-driver (one long-lived process)"]
        Cond["Conductor<br/>holds mission + charter + decisions<br/>for the whole run"]
    end
    Cond -->|"item 1"| W1["Worker 1<br/>fresh Claude Code"]
    Cond -->|"item 2"| W2["Worker 2<br/>fresh Claude Code"]
    Cond -->|"item 3"| W3["Worker 3<br/>fresh Claude Code"]
    W1 -.status.-> Cond
    W2 -.status.-> Cond
    W3 -.status.-> Cond

Each plan item gets a brand-new worker with clean context, so quality does not rot as the run grows. The conductor is persistent: it remembers the mission, charter, and every decision across the whole run. This is the best of both worlds — fresh context per task, plus a conductor holding the thread.
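The split between persistent and fresh state can be sketched as below. This is a minimal illustration, not the driver's actual code: `Conductor`, `WorkerStatus`, `SpawnWorker`, and `runPlan` are hypothetical names, and the worker is stubbed as a function so the sketch stays self-contained. The point it shows is that only the conductor accumulates state, while each worker starts from a newly built brief.

```typescript
// Hypothetical types for illustration; not Leopold's actual API.
interface WorkerStatus {
  item: string;
  outcome: "done" | "blocked";
  note: string;
}

// Called once per item; stands in for starting a fresh worker session.
type SpawnWorker = (item: string, brief: string) => Promise<WorkerStatus>;

class Conductor {
  private readonly mission: string;
  private readonly charter: string;
  // The only state that survives from one item to the next.
  private readonly decisions: string[] = [];

  constructor(mission: string, charter: string) {
    this.mission = mission;
    this.charter = charter;
  }

  // Build the brief a fresh worker starts from.
  briefFor(item: string): string {
    return [
      `Mission: ${this.mission}`,
      `Charter: ${this.charter}`,
      `Decisions so far: ${this.decisions.join("; ") || "none"}`,
      `Your item: ${item}`,
    ].join("\n");
  }

  // Remember the outcome so later workers can be briefed on it.
  record(status: WorkerStatus): void {
    this.decisions.push(`${status.item} -> ${status.outcome} (${status.note})`);
  }

  history(): readonly string[] {
    return this.decisions;
  }
}

// Serial burn-down: one new worker per item, one conductor for the run.
async function runPlan(
  conductor: Conductor,
  items: string[],
  spawn: SpawnWorker,
): Promise<void> {
  for (const item of items) {
    const status = await spawn(item, conductor.briefFor(item));
    conductor.record(status);
  }
}
```

In this sketch a worker never sees another worker's transcript; anything it needs from earlier items reaches it only through the conductor's brief.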

Modules

| Module | Responsibility |
| --- | --- |
| `loop.ts` | the orchestration loop; burns down the plan (serial or `--parallel`), applies stop conditions |
| `worker.ts` | runs one item as a back-and-forth with a fresh worker |
| `conductor.ts` | reads a worker status, decides from the charter (structured verdict) |
| `review.ts` | the diverse-lens review panel (correctness / security / does-it-actually-work) |
| `hypotheses.ts` | the root-cause panel: disjoint-evidence investigators + refuters on a retry |
| `classify.ts` | deterministic per-item risk → effort / critical / sensitive |
| `route.ts` | opt-in smart routing: research the item's real blast radius (keyword fallback) |
| `learn.ts` | learn-on-finish: mine the run into proposed charter amendments |
| `compile.ts` | brief→workflow compiler: `PLAN.md` → dependency waves + classified `args` |
| `runtime.ts` | experimental in-driver workflow runtime (`agent`/`pipeline`/`parallel`/`budget`) |
| `workflow-cmd.ts` | the `leopold workflow` subcommand (emit / `--print` / `--run`) |
| `plan.ts` | `PLAN.md` as a dependency-aware work list |
| `worktree.ts` / `git.ts` | per-item worktree isolation + staged-patch replay |
| `channel.ts` | a driver-controlled async iterable feeding the worker session |
| `protocol.ts` | parses the worker's status block |
| `guard.ts` | the git lock as a `canUseTool` callback |
| `config.ts` | loads the brief and run config (CLI/env > GUARDRAILS > defaults) |
| `budget.ts` / `secrets.ts` / `reaper.ts` | USD hard-stop, encrypted vault, orphan-run reaper |
| `insights.ts` | `events.jsonl` → post-run report |
| `log.ts` | `DECISIONS.md`, `events.jsonl`, plan bookkeeping |
| `notify.ts` | completion / escalation notifications |
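To make the `channel.ts` row concrete, here is one way a driver-controlled async iterable could look. It is a sketch under assumptions: the `Channel` class and its `push`/`close` methods are hypothetical, and the real module may be shaped differently. The idea is that the driver decides when the next message arrives, and the consumer simply awaits it.

```typescript
// Hypothetical sketch; the real channel.ts may differ.
class Channel<T> implements AsyncIterable<T> {
  // Messages pushed before anyone asked for them.
  private readonly queue: T[] = [];
  // Readers that asked before a message was available.
  private readonly waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;

  // Driver side: deliver the next message, or buffer it if nobody is waiting.
  push(value: T): void {
    if (this.closed) throw new Error("channel is closed");
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter({ value, done: false });
    } else {
      this.queue.push(value);
    }
  }

  // Driver side: end the stream. Buffered messages are still delivered first.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.queue.length > 0) {
          return Promise.resolve({ value: this.queue.shift() as T, done: false });
        }
        if (this.closed) {
          return Promise.resolve({ value: undefined, done: true });
        }
        // Nothing buffered yet: park the reader until push() or close().
        return new Promise((resolve) => {
          this.waiters.push(resolve);
        });
      },
    };
  }
}
```

A consumer would read it with `for await (const message of channel)`. The design choice worth noting is that the producer, not the consumer, controls pacing, which is what lets a driver hold a session open and answer a worker only after the conductor has decided.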

Auth: your Claude Code, not an API key

Both the worker and the conductor run through the Agent SDK, which uses your existing Claude Code login. There is no separate API key and no split billing. `ANTHROPIC_API_KEY` is only needed in a headless environment with no Claude Code auth.

flowchart LR
    Driver["leopold-driver"] --> SDK["Claude Agent SDK"]
    SDK --> Auth["your Claude Code login<br/>(subscription)"]
    Auth --> Worker["worker"]
    Auth --> Conductor["conductor"]

Quality machinery around each item

An item doesn't just run — it is classified, conducted, and gated:

- Classify (`classify.ts`, or `route.ts` with `--smart-routing`): sets the worker's reasoning effort and marks critical/sensitive items.
- Conduct: the persistent conductor answers every worker status from the charter.
- Review panel (`review.ts`): independent skeptics with distinct lenses read the diff; blocking findings go back to the worker, and unparseable verdicts fail closed.
- On a retry (`hypotheses.ts`): a root-cause panel forms hypotheses over disjoint evidence and hands the next attempt a concrete lead.
- On a clean finish (`learn.ts`, opt-in): the run is mined into proposed charter amendments; `CHARTER.md` itself is never edited.
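The "fail closed" behaviour of the review panel is the part most easily gotten wrong, so a small sketch may help. Everything here is assumed for illustration: the JSON verdict shape, the `Verdict` type, and the `parseVerdict` and `gate` helpers are hypothetical and do not describe the actual format `review.ts` uses. What the sketch shows is the rule itself: a reply that cannot be parsed counts as a blocking finding, never as approval.

```typescript
// Hypothetical verdict shape; the real review.ts format is not shown in this page.
type Lens = "correctness" | "security" | "does-it-actually-work";

interface Verdict {
  lens: Lens;
  blocking: boolean;
  findings: string[];
}

// Parse one reviewer's raw reply. Anything malformed becomes a blocking verdict.
function parseVerdict(lens: Lens, raw: string): Verdict {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data === "object" && data !== null) {
      const candidate = data as { blocking?: unknown; findings?: unknown };
      if (
        typeof candidate.blocking === "boolean" &&
        Array.isArray(candidate.findings) &&
        candidate.findings.every((f) => typeof f === "string")
      ) {
        return {
          lens,
          blocking: candidate.blocking,
          findings: candidate.findings as string[],
        };
      }
    }
  } catch {
    // Not JSON at all: fall through to the fail-closed result below.
  }
  return { lens, blocking: true, findings: [`unparseable ${lens} verdict`] };
}

// The item passes only if at least one verdict exists and none of them block.
function gate(verdicts: Verdict[]): { pass: boolean; feedback: string[] } {
  const feedback: string[] = [];
  for (const verdict of verdicts) {
    if (verdict.blocking) feedback.push(...verdict.findings);
  }
  const anyBlocking = verdicts.some((v) => v.blocking);
  return { pass: verdicts.length > 0 && !anyBlocking, feedback };
}
```

In this sketch the `feedback` list is what would go back to the worker. Treating an empty panel as a failure is a choice made here for consistency with failing closed, not something the page specifies.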

The workflow path

`leopold workflow` compiles the same brief into a dynamic workflow. `compile.ts` turns `PLAN.md` into dependency waves with per-item classification (deterministic, unit-tested). The result is emitted as `.claude/workflows/leopold-run.js` plus `.leopold/workflow-args.json`. With `--run`, the workflow executes headlessly through `runtime.ts`, an experimental executor for the workflow globals with a real concurrency cap and the same git guard.
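"Dependency waves" can be read as a layered topological sort: each wave holds the items whose dependencies all finished in earlier waves, so items inside one wave are safe to run in parallel. The sketch below assumes a hypothetical `PlanItem` shape and a `toWaves` helper; it is not the code in `compile.ts`, and how `PLAN.md` actually encodes dependencies is not shown here.

```typescript
// Hypothetical plan item; the real PLAN.md schema may differ.
interface PlanItem {
  id: string;
  dependsOn: string[];
}

// Group items into waves. Plan order is preserved inside each wave,
// so the same input always yields the same output.
function toWaves(items: PlanItem[]): string[][] {
  const known = new Set(items.map((item) => item.id));
  for (const item of items) {
    for (const dep of item.dependsOn) {
      if (!known.has(dep)) {
        throw new Error(`item "${item.id}" depends on unknown item "${dep}"`);
      }
    }
  }

  const done = new Set<string>();
  let remaining = items.slice();
  const waves: string[][] = [];

  while (remaining.length > 0) {
    // An item is ready once every dependency sits in an earlier wave.
    const ready = remaining.filter((item) =>
      item.dependsOn.every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      const stuck = remaining.map((item) => item.id).join(", ");
      throw new Error(`dependency cycle among: ${stuck}`);
    }
    waves.push(ready.map((item) => item.id));
    for (const item of ready) done.add(item.id);
    remaining = remaining.filter((item) => !done.has(item.id));
  }
  return waves;
}

const exampleWaves = toWaves([
  { id: "schema", dependsOn: [] },
  { id: "api", dependsOn: ["schema"] },
  { id: "ui", dependsOn: ["schema"] },
  { id: "e2e", dependsOn: ["api", "ui"] },
]);
// exampleWaves is [["schema"], ["api", "ui"], ["e2e"]]
```

A concurrency cap would then apply within a wave, limiting how many of its items run at once, while the wave boundaries enforce ordering.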

Status

Alpha. The driver typechecks against `@anthropic-ai/claude-agent-sdk`. A suite of 113 unit tests (`make driver-test`) covers the status parser, the `canUseTool` guard (the same bypass attempts as the bash red-team suite), classification, review-panel helpers, the hypothesis and learn parsers, the brief→workflow compiler, and the experimental runtime's orchestration. A CLI smoke test (`make driver-smoke`) runs the built binary end to end against a fixture brief on every CI run (Ubuntu and macOS). The `workflow --run` query shim is the one path not exercised end to end; it is experimental by design.

See Driver Config to run it, and the Conductor & Worker Protocol for the exchange.