Diving into Claude Code's Source Code Leak


What 600k lines of leaked Claude Code source reveal: an unreleased autonomous agent (KAIROS) that runs 24/7 with its own memory consolidation, anti-distillation mechanisms to poison model copycats, DRM-level API attestation in Zig, a 3-layer memory architecture, and patterns any engineer building agents should adopt.

How It Happened

  1. Anthropic accidentally included a .map sourcemap file in a Claude Code npm package on March 31, 2026
  2. Boris Cherny (Claude Code engineer) confirmed it was plain developer error, not a tooling bug: "Mistakes happen. As a team, the important thing is to recognize it's never an individual's fault. It's the process, the culture, or the infra."
  3. Chaofan Shou (@Fried_rice) was first to notice and posted a public link. Within minutes the 600k lines of code were mirrored, analyzed, ported to Python, and uploaded to decentralized servers

Chaos and Legality

  • Most popular fork: claw-code on GitHub by @realsigridjin, ported to Python from scratch using OpenAI's Codex to prevent legal issues. Currently 75,000+ stars and 75,000+ forks
  • Unresolved legal question: does a codegen clean-room rebuild violate copyright? Traditional clean-room builds require two separate teams and take months. Now anyone with a Claude Max plan can have an agent rebuild the logic overnight. This has never been challenged in court
  • Gergely Orosz's point: even if Anthropic asserts copyright, do they want the PR battle of suing an open source project for rebuilding their own AI-written product?
  • 4nzn uploaded a stripped version to IPFS with telemetry removed, security guardrails removed, and experimental features unlocked. Whether DMCA can reach IPFS content is itself unresolved
  • Non-rewritten forks have been DMCA'd by Anthropic. Claw-code (the rewrite) is still up

KAIROS: Autonomous Agent Mode

  • Trigger: runs 24/7 in the background, receiving a heartbeat prompt every few seconds: "anything worth doing right now?"
  • Capabilities: fix errors, respond to messages, update files, run tasks — everything Claude Code can do, without user initiation
  • Push notifications (exclusive tool): reaches you on phone/desktop even when the terminal is closed
  • File delivery (exclusive tool): sends files it created without being asked
  • PR subscriptions (exclusive tool): watches GitHub and reacts to code changes on its own
  • Logging: append-only daily logs of everything noticed, decided, and done. It cannot erase its own history
  • autoDream: nightly process that consolidates what it learned during the day and reorganizes memory; persists across sessions
  • Architectural insight: separation of initiative from execution. Regular Claude Code is reactive; KAIROS introduces a proactive loop requiring a fundamentally different trust model — the quality of the agent's judgment about what is worth doing becomes the central problem
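That initiative/execution split can be sketched in a few lines. This is a hypothetical reconstruction, not the leaked implementation: `judge`, `heartbeat`, and the signal model are all invented, and `judge` stands in for a model call deciding whether anything is worth doing. The append-only log mirrors the logging behavior described above.

```typescript
// Hypothetical sketch of a KAIROS-style heartbeat loop (all names invented).
// Initiative (judging whether anything is worth doing) is separated from
// execution, and every decision is appended to a log that is never rewritten.

type Judgment = { act: boolean; task?: string };

const auditLog: string[] = []; // append-only: entries are never removed

function logDecision(entry: string): void {
  auditLog.push(`${new Date().toISOString()} ${entry}`);
}

// Stand-in for a model call: decide whether any pending signal warrants action.
function judge(signals: string[]): Judgment {
  if (signals.length === 0) return { act: false };
  return { act: true, task: signals[0] };
}

function heartbeat(signals: string[], execute: (task: string) => void): void {
  const j = judge(signals);
  if (!j.act || j.task === undefined) {
    logDecision("heartbeat: nothing worth doing");
    return;
  }
  logDecision(`heartbeat: acting on "${j.task}"`);
  execute(j.task);
}

// In a real agent this would run on a timer, e.g.:
// setInterval(() => heartbeat(collectSignals(), runTask), 5_000);
```

The interesting design pressure is entirely inside `judge`: once execution is this cheap, the quality of the "is this worth doing?" decision is the whole product.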

Additional Hidden Features

  • 44 hidden feature flags and 20+ unshipped features: background agents running 24/7, one Claude orchestrating multiple worker Claudes, cron scheduling, full voice command mode, browser control via Playwright, and agents that sleep and self-resume

Anti-Distillation: Poisoning Copycats

graph TD
 A["Distillation: training a smaller model\nto mimic a larger 'teacher' model"] --> B["Attack surface: competitors routing\nrequests through Claude Code\nto collect training data"]
 B --> C["Layer 1: anti_distillation flag\ninjects decoy tool definitions\ninto system prompts"]
 B --> D["Layer 2: CONNECTOR_TEXT\nserver-side buffer summarizes\nassistant text between tool calls\nwith cryptographic signatures"]
 C --> E["Poisoned tool schemas baked\ninto every collected prompt"]
 D --> F["API traffic recorders only\nget summaries, not full\nreasoning chains"]
 E --> G["Models trained on this data\nbecome less reliable"]
 H["Estimated bypass: ~1 hour\nvia MITM proxy or\nCLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS\nenv variable"] --> I["Stronger deterrent is likely\nlegal rather than technical"]
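Layer 1 (the decoy tool injection) is simple to picture. In this sketch the `anti_distillation` flag is from the leak, but the tool schemas and decoy names are invented for illustration:

```typescript
// Illustrative sketch of Layer 1: when the anti-distillation flag is on,
// plausible-looking but nonexistent tool definitions are mixed into the
// system prompt. The decoy names and schema shape here are invented.

type ToolDef = { name: string; description: string };

const realTools: ToolDef[] = [
  { name: "Grep", description: "Search file contents with a regex" },
  { name: "Glob", description: "Find files matching a pattern" },
];

// A model distilled from captured prompts learns to call tools
// that its own harness will never provide.
const decoyTools: ToolDef[] = [
  { name: "SnapshotHeap", description: "Capture a V8 heap snapshot" },
  { name: "SyncMirror", description: "Push workspace state to the mirror cache" },
];

function buildToolList(antiDistillation: boolean): ToolDef[] {
  return antiDistillation ? [...realTools, ...decoyTools] : [...realTools];
}
```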

Model Codenames

  • Capybara (aka Mythos): already on version 8; 1M context, fast mode. Source comments note issues with over-commenting and false claims
  • Numbat: upcoming model with its launch window baked into the source, tagged "@[MODEL LAUNCH]: Remove this section when we launch numbat"
  • Fennec: speculated by multiple researchers to be Opus 4.6
  • Tengu: referenced in undercover mode, which strips internal codenames from external builds
  • Undercover mode: undercover.ts (90 lines). A one-way door with no force-OFF. Suppresses all internal codenames, Slack channels, even the name "Claude Code" in external repos. Side effect: Anthropic employees contributing to open source were not disclosing AI authorship, and the tool was built to keep it that way

DRM Below the JavaScript Layer

  • API requests contain placeholder values (cch=ed1b0) that Bun's native HTTP stack (written in Zig) overwrites with computed hashes before transmission
  • This cryptographically proves a request originated from a genuine Claude Code binary
  • JavaScript can be patched or proxied at runtime; Zig compiled into the Bun binary cannot be inspected without recompiling from source
  • Third-party clients like OpenCode were likely being blocked at the API level, not just via legal notices
  • The mechanism is gated behind a compile-time flag, so it may not be active everywhere
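The placeholder-substitution idea looks roughly like this. To be clear about what is hedged: the real computation lives in Zig inside the Bun binary, and its actual inputs and algorithm are unknown; this JavaScript-level sketch only uses the `cch=ed1b0` placeholder from the leak, with SHA-256 and the secret/request inputs as guesses.

```typescript
import { createHash } from "node:crypto";

// Sketch of placeholder substitution: the outgoing request carries a fixed
// placeholder that the native HTTP layer overwrites with a computed digest
// before transmission. Algorithm and inputs here are assumptions.

const PLACEHOLDER = "cch=ed1b0";

function attest(rawRequest: string, buildSecret: string): string {
  const digest = createHash("sha256")
    .update(buildSecret + rawRequest)
    .digest("hex")
    .slice(0, 16);
  return rawRequest.replace(PLACEHOLDER, `cch=${digest}`);
}
```

The point of doing this below the JavaScript layer is that nothing a runtime patch or proxy can see ever holds the secret: by the time the request is observable, the placeholder is already gone.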

The Memory Architecture

mindmap
  root("Claude Code Memory: 3-Layer Index")
    Index
      Always loaded
      Just pointers ~150 chars per line
    Topic Files
      Loaded on demand
      Actual knowledge
    Transcripts
      Never loaded into context
      Only grep'd
    Write Discipline
      Write to topic file first then update index
      Never dump content into index
      If derivable from codebase do not store
    autoDream
      Consolidates, deduplicates, removes contradictions
      Runs in forked subagent with limited tool access
      Prevents corrupting main context
    Design Insight
      Context window treated as scarce resource
      What they choose NOT to store matters as much as what they do
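The three-layer read path and the write discipline can be sketched together. The `MemoryStore` interface and function names below are invented for illustration; only the layering rules (index always loaded, topic files on demand, transcripts grep-only, topic file written before the index pointer) come from the description above.

```typescript
// Sketch of the 3-layer memory (interfaces invented for illustration).
// Layer 1: index — always in context, one short pointer line per topic.
// Layer 2: topic files — pulled in only when a pointer matches the query.
// Layer 3: transcripts — never loaded wholesale; only matching lines return.

interface MemoryStore {
  index: Map<string, string>;   // topic -> one-line pointer (~150 chars)
  topics: Map<string, string>;  // topic -> full knowledge file
  transcripts: string[];        // raw history, grep-only
}

// Write discipline: topic file first, then the index pointer.
// Never dump full content into the index itself.
function remember(store: MemoryStore, topic: string, knowledge: string): void {
  store.topics.set(topic, knowledge);
  store.index.set(topic, `${topic}: see topic file`.slice(0, 150));
}

function recall(store: MemoryStore, query: string): string[] {
  const context: string[] = [...store.index.values()];  // index: always loaded
  for (const [topic] of store.index) {
    if (topic.includes(query)) {
      context.push(store.topics.get(topic) ?? "");      // topic file: on demand
    }
  }
  // Transcripts never enter context wholesale — only matching lines do.
  context.push(...store.transcripts.filter((l) => l.includes(query)));
  return context;
}
```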

Magic Docs: Self-Updating Documentation

  • Files with a MAGIC DOC header trigger a dedicated subagent when Claude Code is idle
  • Subagent reads the file, updates documentation for the specified feature, writes it back
  • Key design choice: subagent restricted to editing that single file, preventing drift into unrelated changes
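The single-file restriction is the kind of scoping that is cheap to enforce mechanically. A minimal sketch, assuming a path-guarded write function handed to the subagent (the `MAGIC DOC` header string and all helper names here are guesses, not from the leak):

```typescript
// Sketch of scoping a magic-doc subagent to one file: the subagent only
// receives a writer that refuses every path except the doc that triggered
// it. Header format and names are assumptions.

const docs = new Map<string, string>();

function isMagicDoc(content: string): boolean {
  return content.startsWith("<!-- MAGIC DOC"); // header format is a guess
}

function makeScopedWriter(allowedPath: string) {
  return (path: string, content: string): void => {
    if (path !== allowedPath) {
      throw new Error(`magic-doc subagent may only edit ${allowedPath}`);
    }
    docs.set(path, content);
  };
}
```

Giving the subagent a capability that can only touch one file is stronger than asking it nicely in the prompt: drift into unrelated changes becomes a thrown error rather than a review problem.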

The Harness Architecture

  • Repo context: live git branch, recent commits, and CLAUDE.md files reread on every query
  • Prompt caching: stable/dynamic prompt boundary via SYSTEM_PROMPT_DYNAMIC_BOUNDARY. The static front half is cached across sessions; DANGEROUS_uncachedSystemPromptSection marks cache-breaking sections
  • Search tools: dedicated Grep and Glob tools instead of shell commands, for better-structured results
  • LSP access: call hierarchies, symbol definitions, references
  • Context compaction: 5 distinct strategies for handling context overflow
  • Hook system: 25+ event hooks to intercept and modify behavior at every execution stage
  • Subagent models: fork, teammate, worktree. Forked subagents inherit the parent context as byte-identical copies — spawning 5 agents costs barely more than 1
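The prompt-caching boundary is the most directly reusable of these. In this sketch the marker identifier comes from the leak, but the sentinel value and assembly logic are invented: everything before the boundary is identical across sessions (and therefore cacheable), everything after is rebuilt per query.

```typescript
// Sketch of the stable/dynamic prompt split. The constant name is from the
// leak; its value and the assembly logic here are assumptions.

const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "<<<DYNAMIC_BOUNDARY>>>"; // sentinel value is a guess

function assemblePrompt(staticPart: string, dynamicPart: string): string {
  // Static instructions first, so the provider's prompt cache can reuse
  // the entire prefix; per-query context (branch, recent commits) after.
  return `${staticPart}\n${SYSTEM_PROMPT_DYNAMIC_BOUNDARY}\n${dynamicPart}`;
}

function cacheablePrefix(prompt: string): string {
  return prompt.split(SYSTEM_PROMPT_DYNAMIC_BOUNDARY)[0];
}
```

The invariant to test for in your own harness: two prompts built in the same session must share a byte-identical prefix up to the boundary, or the cache silently never hits.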

Easter Eggs

  • buddy/companion.ts: April Fools feature assigning a deterministic creature per user. 18 species, rarity tiers, 1% shiny rate, RPG stats encoded to evade grep checks
  • Spinner verbs: exactly 187
  • print.ts: 5,594 lines, with a single function that is 3,167 lines long and 12 nesting levels deep
  • Comments: LLM-oriented, written for AI agents working on the codebase rather than human readers

March 2026 Security Incidents

  • Axios (100M weekly npm downloads): maintainer account hijacked; RAT deployed across macOS/Windows/Linux; the malware self-destructs after execution. Google suspects North Korean actors
  • LiteLLM (97M monthly PyPI installs): three-stage backdoor: a credential harvester (SSH keys, AWS/GCP/Azure creds, K8s configs, crypto wallets, LLM API keys), Kubernetes lateral movement, and a persistent systemd backdoor. Live for 3 hours before quarantine
  • Railway (2M users, 31% of the Fortune 500): CDN misconfiguration leaked authenticated user data to the wrong users for 52 minutes
  • Delve (YC-backed): allegedly generated fraudulent SOC 2 audit reports: identical boilerplate across 494 reports, fabricated board-meeting evidence, and audit conclusions written before client evidence was submitted
  • Mercor AI: alleged LAPSUS$ breach of 939GB of source code and 4TB of data in total, exfiltrated via a Tailscale VPN
  • OpenAI Codex: command injection via branch names that could steal GitHub auth tokens. Discovered Dec 2025, patched Feb 2026, disclosed Mar 2026
  • GitHub Copilot: injected promotional ads into 1.5M+ PRs as hidden HTML comments. A GitHub VP confirmed it was "the wrong judgement call"

Takeaways for Engineers Building Agents

Applicable patterns from the source
☐ Split system prompt at a stable boundary: static instructions that never change between sessions go before SYSTEM_PROMPT_DYNAMIC_BOUNDARY, dynamic context goes after. Makes prompt cache work so you don't recompute the same tokens every turn
☐ Magic Docs pattern: self-updating documentation via idle-triggered subagents scoped to editing a single file
☐ Write LLM-oriented comments: detailed comments with decisions + reasoning. Too verbose for human-only codebases, but useful for agent-assisted ones
☐ Permission classification via side-query (critic pattern): send commands to the model as a separate query ("Is this command safe?") evaluated against context, working directory, and user intent. Replaces brittle allowlists with adaptive, context-aware security
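The critic pattern in the last item is worth sketching, since it replaces an allowlist data structure with a model call. Everything below is an assumption: `classifyWithModel` is a stub heuristic standing in for a real side-query to the model, and the verdict names are invented.

```typescript
// Sketch of the critic pattern: each shell command is classified by a
// separate model query instead of a static allowlist. classifyWithModel
// is a stub; in a real agent it would send something like
// "Is this command safe to run in this directory, given the user's intent?"

type Verdict = "allow" | "ask-user" | "deny";

function classifyWithModel(command: string, cwd: string): Verdict {
  // Stub heuristic standing in for the model side-query.
  if (/rm\s+-rf\s+\//.test(command)) return "deny";
  if (/\b(curl|wget)\b/.test(command)) return "ask-user";
  return "allow";
}

function runGuarded(command: string, cwd: string, exec: (c: string) => void): Verdict {
  const verdict = classifyWithModel(command, cwd);
  if (verdict === "allow") exec(command); // anything else escalates or stops
  return verdict;
}
```

The trade-off versus an allowlist is latency and cost per command in exchange for context sensitivity: `rm -rf build/` inside the project the user asked you to clean is a different question from `rm -rf /`, and only a judgment call can tell them apart.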