Claude Code Source Leak: What the Code Reveals

Technical breakdown of Claude Code's leaked source: anti-distillation via fake tool injection, native API attestation in Zig, frustration detection by regex, 250K wasted API calls/day from a bug fixed in three lines, and KAIROS autonomous agent scaffolding.

Context

  • Anthropic shipped a source map (.map) file in the Claude Code npm package, and that file contained the full readable source code. The package was pulled, but the code had already been widely mirrored.
  • Second accidental exposure in a week (model spec leak days prior).
  • Ten days earlier, Anthropic sent legal threats to OpenCode for using Claude Code's internal APIs to access Opus at subscription rates instead of pay-per-token pricing.

Anti-Distillation Mechanisms

Mechanism: Fake tools injection (ANTI_DISTILLATION_CC flag)
  • How it works: Server silently injects decoy tool definitions into the system prompt when anti_distillation: ['fake_tools'] sent in API requests. Gated behind GrowthBook flag tengu_anti_distill_fake_tool_injection, only active for first-party CLI sessions.
  • Bypass: MITM proxy strips anti_distillation field from request bodies; set CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS env var; use third-party API provider or SDK entrypoint instead of CLI.

Mechanism: Connector-text summarization
  • How it works: API buffers assistant text between tool calls, summarizes it, returns summary with cryptographic signature. Original text restored from signature on subsequent turns. Recorded API traffic only captures summaries, not full reasoning chains.
  • Bypass: Anthropic-internal only (USER_TYPE === 'ant'). External users never encounter it.
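
The gating described for fake tool injection can be sketched as a single predicate. This is an illustrative reconstruction from the description above, not the leaked code: the SessionContext shape and the shouldRequestFakeTools and buildRequestBody names are assumptions, as is the way flag values are passed in. Only the flag names, the env var, and the anti_distillation: ['fake_tools'] field come from the article.

```typescript
// Hypothetical shape: the real client's types are not shown in the article.
interface SessionContext {
  compileFlags: ReadonlySet<string>; // e.g. contains "ANTI_DISTILLATION_CC"
  growthBookFlags: Readonly<Record<string, boolean>>;
  env: Readonly<Record<string, string | undefined>>;
  entrypoint: "cli" | "sdk";
  provider: "first-party" | "third-party";
}

// True only when every gate named in the article is open.
// Assumption: any non-empty value of the env var counts as "set".
function shouldRequestFakeTools(ctx: SessionContext): boolean {
  return (
    ctx.compileFlags.has("ANTI_DISTILLATION_CC") &&
    ctx.growthBookFlags["tengu_anti_distill_fake_tool_injection"] === true &&
    !ctx.env["CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS"] &&
    ctx.entrypoint === "cli" &&
    ctx.provider === "first-party"
  );
}

// Adds the opt-in field to a copy of the request body, or leaves it out.
function buildRequestBody(
  ctx: SessionContext,
  base: Record<string, unknown>
): Record<string, unknown> {
  return shouldRequestFakeTools(ctx)
    ? { ...base, anti_distillation: ["fake_tools"] }
    : { ...base };
}
```

Under these assumptions, closing any one gate (setting the env var, using the SDK entrypoint, or going through a third-party provider) is enough for the field to be omitted, which matches the bypass list.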

Undercover Mode

  • undercover.ts (~90 lines) strips all Anthropic internals (codenames like 'Capybara'/'Tengu', Slack channels, repo names, 'Claude Code' itself) when used in non-internal repos.
  • No force-OFF exists: 'There is NO force-OFF. This guards against model codename leaks.' Can force ON with CLAUDE_CODE_UNDERCOVER=1. In external builds, dead-code-eliminated to trivial returns.
  • AI-authored commits and PRs from Anthropic employees in open source projects will have no indication AI wrote them.
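
The activation rules above (force ON possible, no force OFF, trivial returns in external builds) can be summarized in a few lines. This is a sketch under stated assumptions, not the contents of undercover.ts: the isUndercover name, its parameters, and the choice of false as the "trivial return" are hypothetical. Only the CLAUDE_CODE_UNDERCOVER=1 variable comes from the article.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

// internalBuild models a build-time constant. In an external build a bundler
// could inline it as false and dead-code-eliminate everything after line one.
function isUndercover(
  env: Env,
  repoIsInternal: boolean,
  internalBuild: boolean = true
): boolean {
  if (!internalBuild) return false; // assumed "trivial return" for external builds
  if (env["CLAUDE_CODE_UNDERCOVER"] === "1") return true; // force ON
  // No variable is consulted to force OFF: outside internal repos it is always on.
  return !repoIsInternal;
}
```

Note that a value such as CLAUDE_CODE_UNDERCOVER=0 has no effect in this sketch, which is one way to read "There is NO force-OFF."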

Frustration Detection via Regex

/\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|
piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|
fucking? (broken|useless|terrible|awful|horrible)|fuck you|
screw (this|you)|so frustrating|this sucks|damn it)\b/

userPromptKeywords.ts detects user frustration with regex. A regex is faster and cheaper than an LLM inference call just to check if someone is swearing at the tool.
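
A minimal sketch of how such a pattern might be applied. The alternatives are copied from the regex quoted above; the looksFrustrated helper and the lowercasing step are assumptions (the pattern as quoted carries no case-insensitive flag, and how the real file normalizes input is not shown).

```typescript
// Assembled from parts only because a JavaScript regex literal cannot span
// multiple lines; the alternatives match the pattern quoted in the article.
const FRUSTRATION_ALTERNATIVES: readonly string[] = [
  "wtf", "wth", "ffs", "omfg", "shit(ty|tiest)?", "dumbass", "horrible", "awful",
  "piss(ed|ing)? off", "piece of (shit|crap|junk)", "what the (fuck|hell)",
  "fucking? (broken|useless|terrible|awful|horrible)", "fuck you",
  "screw (this|you)", "so frustrating", "this sucks", "damn it",
];

const FRUSTRATION_PATTERN = new RegExp(
  "\\b(" + FRUSTRATION_ALTERNATIVES.join("|") + ")\\b"
);

// Hypothetical helper. No "g" flag is used, so test() keeps no state
// between calls and the same RegExp object can be reused safely.
function looksFrustrated(prompt: string): boolean {
  return FRUSTRATION_PATTERN.test(prompt.toLowerCase());
}
```

The word boundaries matter: a prompt containing "awfully" does not trip the "awful" alternative, because no boundary follows the fifth letter.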

Native Client Attestation

  • API requests include a cch=00000 placeholder in x-anthropic-billing-header. Before the request leaves the process, Bun's native Zig HTTP stack overwrites the zeros with a computed hash. Same-length placeholder avoids changing Content-Length or requiring buffer reallocation.
  • Computation happens below the JS runtime, invisible to JS layer. Server validates the hash to confirm the request came from a real Claude Code binary.
  • This is the technical enforcement behind the OpenCode legal fight: the binary cryptographically proves it's the real client.
  • Not airtight: gated behind NATIVE_CLIENT_ATTESTATION compile-time flag; disabled via CLAUDE_CODE_ATTRIBUTION_HEADER env var or GrowthBook killswitch (tengu_attribution_header). Running on stock Bun or Node sends literal zeros. Server-side validation may be forgiving (code references tolerance for 'unknown extra fields').
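
The same-length placeholder trick can be illustrated in TypeScript, even though the article places the real step in Bun's native Zig HTTP stack. Everything here is an assumption made for illustration: the stampAttestation and standInDigest names, the use of FNV-1a as a stand-in digest, and hashing the raw request bytes. The actual hash and its inputs are not described in the article.

```typescript
const PLACEHOLDER = "cch=00000";

// Stand-in digest (FNV-1a truncated to 5 hex digits). NOT the real algorithm.
function standInDigest(bytes: Uint8Array): string {
  let h = 0x811c9dc5;
  bytes.forEach((b) => {
    h = Math.imul(h ^ b, 0x01000193);
  });
  return (h >>> 0).toString(16).padStart(8, "0").slice(0, 5);
}

// Finds the placeholder and overwrites its five zeros in place. The buffer
// length never changes, which is the property the article credits with
// keeping Content-Length stable and avoiding reallocation.
function stampAttestation(request: Uint8Array): boolean {
  const encoder = new TextEncoder();
  const needle = encoder.encode(PLACEHOLDER);
  outer: for (let i = 0; i + needle.length <= request.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (request[i + j] !== needle[j]) continue outer;
    }
    const digest = encoder.encode(standInDigest(request));
    request.set(digest, i + "cch=".length);
    return true;
  }
  return false; // placeholder absent: bytes are left untouched
}
```

A runtime that never performs this step would send the literal zeros, which is consistent with the article's note about stock Bun or Node.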

KAIROS: Unreleased Autonomous Agent Mode

  • /dream skill: Nightly memory distillation
  • Daily logs: Append-only
  • GitHub webhooks: Webhook subscriptions
  • Background daemon: Worker processes
  • Cron refresh: Scheduled every 5 minutes
  • Status: Heavily feature-gated; scaffolding for always-on background agent exists but unclear how far along

Other Findings

  • April Fools companion (buddy/companion.ts): Tamagotchi-style system. 18 species, rarity tiers (common to legendary), 1% shiny chance, RPG stats (DEBUGGING, SNARK). Deterministic from user ID via Mulberry32 PRNG. Species names encoded with String.fromCharCode() to dodge grep.
  • Terminal rendering: Int32Array-backed ASCII char pool, bitmask-encoded style metadata, patch optimizer merging cursor moves and canceling hide/show pairs, self-evicting line-width cache ('~50x reduction in stringWidth calls during token streaming').
  • Bash security: 23 numbered checks in bashSecurity.ts: 18 blocked Zsh builtins, defense against Zsh equals expansion (=curl bypassing permission checks), unicode zero-width space injection, IFS null-byte injection, malformed token bypass found during HackerOne review.
  • Prompt cache economics: promptCacheBreakDetection.ts tracks 14 cache-break vectors. 'Sticky latches' prevent mode toggles from busting the cache. One function annotated DANGEROUS_uncachedSystemPromptSection().
  • Multi-agent coordinator (coordinatorMode.ts): orchestration algorithm is a prompt, not code. Instructions include 'Do not rubber-stamp weak work' and 'You must understand findings before directing follow-up work.'
  • Code quality (print.ts): 5,594 lines, single 3,167-line function, 12 nesting levels. Uses Axios (recently compromised on npm with RAT-dropping malicious versions).
  • 250K wasted API calls/day: 1,279 sessions had 50+ consecutive autocompact failures (up to 3,272). Fix: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3. Three lines of code.
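
The autocompact fix amounts to a consecutive-failure cap. A minimal sketch of that idea follows. Only the constant name and its value come from the article; the makeAutocompactGuard wrapper, the synchronous compact callback, and the choice to stay disabled once the cap is reached are assumptions made to keep the example short.

```typescript
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

type CompactResult = "ok" | "failed" | "skipped";

// `compact` is a hypothetical stand-in for the call that performs
// autocompaction; it returns true on success and may throw on failure.
function makeAutocompactGuard(compact: () => boolean): () => CompactResult {
  let consecutiveFailures = 0;
  return () => {
    // Once the cap is reached, stop issuing calls instead of retrying forever.
    if (consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
      return "skipped";
    }
    let succeeded = false;
    try {
      succeeded = compact();
    } catch {
      succeeded = false;
    }
    if (succeeded) {
      consecutiveFailures = 0; // any success resets the streak
      return "ok";
    }
    consecutiveFailures += 1;
    return "failed";
  };
}
```

With a cap like this, a session that would otherwise have retried thousands of times makes at most three failing calls in a row.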

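The "deterministic from user ID via Mulberry32" detail can be illustrated as follows. mulberry32 is the widely circulated public-domain generator; the seedFromUserId hash (FNV-1a), the Companion shape, and the order in which values are drawn are assumptions, since the article does not describe how companion.ts derives its seed or rolls its attributes.

```typescript
// Mulberry32: a small 32-bit PRNG. The same seed yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical seed derivation: FNV-1a over the ID's UTF-16 code units.
function seedFromUserId(userId: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    h = Math.imul(h ^ userId.charCodeAt(i), 0x01000193);
  }
  return h >>> 0;
}

interface Companion {
  speciesIndex: number; // 0..17, for the 18 species mentioned in the article
  shiny: boolean;       // 1% chance per the article
}

function rollCompanion(userId: string): Companion {
  const rand = mulberry32(seedFromUserId(userId));
  const speciesIndex = Math.floor(rand() * 18);
  const shiny = rand() < 0.01;
  return { speciesIndex, shiny };
}
```

Because nothing random is stored, the same user ID always reproduces the same companion, with no server-side state needed.
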
Why This Leak Matters

Google's Gemini CLI and OpenAI's Codex are open source, but those are agent SDKs (toolkits), not the full internal wiring of a flagship product.

The real damage is feature flags (KAIROS, anti-distillation): product roadmap details competitors can now see and react to.

Code can be refactored, but strategic surprise can't be un-leaked.

Anthropic acquired Bun late last year. A Bun bug (oven-sh/bun#28001, filed March 11) serves source maps in production even though the docs say they are disabled. If this caused the leak, Anthropic's own toolchain shipped a known bug that exposed its own source code.
