
Anthropic accidentally shipped Claude Code's entire source code.
We read all 512,000 lines.
Hidden virtual pets, an "undercover mode" that hides AI identity in open-source commits, internal model codenames, a dream system that organizes memories while you sleep — and 43 tools the public was never meant to see.
"Claude code source code has been leaked via a map file in their npm registry!"
"Claude Code just got open sourced again!"
Found 187 spinner verbs hidden in the codebase
Coverage from one of the largest software engineering newsletters
From discovery to DMCA, everything happened in less than 12 hours.
Security researcher Chaofan Shou (@Fried_rice) discovers Claude Code's entire source via npm source map. Tweet gets 4.5M+ views.
Theo (t3.gg), Wes Bos, Pragmatic Engineer, and others share the news. The developer community explodes.
Multiple repos mirror the leaked source. hangsman, kuberwastaken, mehmoodosman among the first.
The fastest-growing mirrors hit 22,000+ GitHub stars. HN and Reddit threads explode.
Blog posts, dev.to articles, and deep-dives start appearing. Hidden features discovered.
DMCA takedown notices begin hitting mirrors. Some repos go private or get removed.
Claw Decode publishes the complete breakdown: 43 tools, 7 hidden features, full system prompt reconstruction.
18 species, 5 rarity tiers, animated ASCII sprites, hats, and personality stats. All gated behind the BUDDY feature flag.
Why this matters: This isn't a joke feature. Each companion has CompanionBones (species, eye, hat, shiny, stats) derived from hash(userId), and a CompanionSoul (name, personality) generated by the model on first hatch. Species names are encoded via String.fromCharCode() because the literal string "Capybara" would trip an internal scanner that looks for model codenames.
[ASCII sprites for the companion species]
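A minimal sketch of how a companion's fixed traits could be derived from a user ID, and how a species name can be kept out of the source as a literal. The source only says the traits come from hash(userId) and that names go through String.fromCharCode(). The FNV-1a hash, the mulberry32 generator, the rarity tier names, every weight except the 1% legendary tier, the shiny odds, and the placeholder species are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not the leaked implementation.
type Rarity = "common" | "uncommon" | "rare" | "epic" | "legendary";

interface CompanionBones {
  species: string;
  eye: number; // index into the 6 eye variants
  hat: number; // index into the 7 hat types
  shiny: boolean;
  rarity: Rarity;
}

// 32-bit FNV-1a hash of a string (stand-in for the unspecified hash()).
function fnv1a(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Weights in percent. Only the 1% legendary tier comes from the article.
const RARITY_WEIGHTS: Array<[Rarity, number]> = [
  ["common", 60], ["uncommon", 25], ["rare", 10], ["epic", 4], ["legendary", 1],
];

function pickRarity(roll: number): Rarity {
  let cumulative = 0;
  for (const [rarity, weight] of RARITY_WEIGHTS) {
    cumulative += weight;
    if (roll * 100 < cumulative) return rarity;
  }
  return "legendary";
}

// Produces source text that rebuilds `name` without containing it literally.
function encodeName(name: string): string {
  const codes = Array.from(name, (ch) => "0x" + ch.charCodeAt(0).toString(16));
  return `String.fromCharCode(${codes.join(", ")})`;
}

// The first entry spells "capybara" at runtime; the others are placeholders.
const SPECIES: string[] = [
  String.fromCharCode(0x63, 0x61, 0x70, 0x79, 0x62, 0x61, 0x72, 0x61),
  "species-b",
  "species-c",
];

// Same userId in, same companion out: nothing needs to be stored.
function rollBones(userId: string): CompanionBones {
  const rand = mulberry32(fnv1a(userId));
  return {
    species: SPECIES[Math.floor(rand() * SPECIES.length)],
    eye: Math.floor(rand() * 6),
    hat: Math.floor(rand() * 7),
    shiny: rand() < 0.01,
    rarity: pickRarity(rand()),
  };
}
```

Calling rollBones("user-123") twice returns identical traits, which is the point of seeding from the user ID. A plain text search of the encodeName() output for "capybara" finds nothing.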
When Anthropic employees use Claude Code on public repos, it automatically enters stealth mode. The instructions below are verbatim from the source code.
Why this matters: This proves Anthropic routinely uses Claude Code to contribute to open-source projects. There is NO force-OFF switch — if the system isn't sure it's in an internal repo, undercover mode stays on. This is defense-in-depth against accidentally leaking model codenames in public commits.
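A minimal sketch of that fail-closed default, assuming a three-way repo classification. The type, function names, and the contents of the forbidden-terms list are hypothetical; the list here simply reuses the three codenames this article mentions.

```typescript
// Hypothetical sketch of a default-ON stealth check.
type RepoClass = "internal" | "public" | "unknown";

// Only a positive "internal" classification turns undercover mode off.
// "unknown" is treated exactly like "public", so there is no force-OFF path.
function isUndercover(repo: RepoClass): boolean {
  return repo !== "internal";
}

// Assumed contents; the article only says a forbidden-terms list exists.
const FORBIDDEN_TERMS = ["capybara", "tengu", "numbat"];

// Returns the forbidden terms found in a commit message, case-insensitively.
function findLeaks(commitMessage: string): string[] {
  const lower = commitMessage.toLowerCase();
  return FORBIDDEN_TERMS.filter((term) => lower.includes(term));
}
```

The design choice is the direction of the default: a misclassified internal repo costs some stealth overhead, while a misclassified public repo would leak a codename.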
Process management, IPC via Unix sockets, scheduled cron jobs, a markdown file system, and 43 user-space tools.
No RAG. No vector embeddings. No Pinecone. Plain markdown in a directory with an index file under 25KB.
Undercover mode automatically activates on public repos. There is NO force-OFF switch. Default is always ON.
Capybara, Tengu, Numbat. Plus unreleased versions opus-4-7, sonnet-4-8.
18 species with weighted rarity (1% legendary), ASCII animations, 7 hat types, 6 eye variants, 5 personality stats.
4-phase memory consolidation during idle: Orient → Gather → Consolidate → Prune.
An unreleased system where Claude initiates conversations, runs scheduled tasks, and sends messages unprompted.
"Default to NO comments." "≤25 words between tool calls." Mandatory verification for 3+ file edits.
All 18 species names are encoded as hex character codes passed to String.fromCharCode(), because the literal "Capybara" would trip a model codename scanner.
Agents coordinate through shared task lists, not direct calls. Role-based: read-only (Explore, Plan) vs full-capability.
Anthropic employees get a different Claude Code. Features are stripped at build time via Bun's feature() macro — dead code elimination at the AST level.
Why this matters: The ≤25 word limit between tool calls explains why Claude Code feels so fast and terse. The mandatory verification agent for multi-file edits is why internal users report fewer broken builds. And USER_TYPE === 'ant' is a build-time define, so the bundler constant-folds these checks away in public builds.
| Feature | External (Public) | Internal (Anthropic) |
|---|---|---|
| Output Length | "Be concise" | ≤25 words between tools, ≤100 final |
| Code Comments | Not mentioned | "Default to NO comments" |
| Verification | None required | Mandatory for 3+ file edits |
| REPL Mode | Opt-in | Default ON |
| Undercover | N/A (code stripped at build) | Auto-enabled on public repos |
| Feature Flags | Stripped via dead code elimination | All enabled |
| False Claims | Basic guardrails | "Never claim all tests pass when output shows failures" |
| Git Skills | Full inline instructions in prompt | /commit, /simplify, /commit-push-pr |
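A minimal sketch of the build-time split behind the table above. In a real build, a bundler define would replace USER_TYPE with a string literal, so the comparison becomes a constant and the unused branch can be dropped. Here USER_TYPE is an ordinary constant, and the OutputRules shape and field names are hypothetical.

```typescript
// Hypothetical sketch: USER_TYPE stands in for a build-time define.
const USER_TYPE: string = "external";

interface OutputRules {
  maxWordsBetweenTools: number | null; // null means no numeric limit
  maxFinalWords: number | null;
  verifyAtFileEdits: number | null; // verification kicks in at this many edited files
}

// Values taken from the Internal column of the table.
const INTERNAL_RULES: OutputRules = {
  maxWordsBetweenTools: 25,
  maxFinalWords: 100,
  verifyAtFileEdits: 3,
};

// The External column only says "Be concise" and requires no verification.
const EXTERNAL_RULES: OutputRules = {
  maxWordsBetweenTools: null,
  maxFinalWords: null,
  verifyAtFileEdits: null,
};

// Once USER_TYPE is a literal, this comparison is constant and a bundler
// can fold it, leaving only one rule set in the shipped bundle.
const activeRules: OutputRules = USER_TYPE === "ant" ? INTERNAL_RULES : EXTERNAL_RULES;
```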
When idle, Claude Code enters a 4-phase dream cycle — like a human organizing notes in their sleep.
Why this matters: Everyone building AI agents reaches for vector databases and RAG pipelines. Anthropic chose plain markdown files + a periodic consolidation loop. The insight is that LLMs are already great at reading/writing text — the bottleneck isn't storage, it's maintenance. Dream Mode is the maintenance.
Orient: ls the memory directory. Read the ENTRYPOINT.md index. Skim existing topic files to avoid creating duplicates.
Gather: Check daily logs, find drifted memories, and grep JSONL transcripts narrowly for things that matter.
Consolidate: Merge new information into existing files. Convert relative dates ("yesterday") to absolute ones (2026-03-31). Delete contradicted facts.
Prune: Keep the index under 25KB and under its maximum line count. Remove stale pointers. Resolve contradictions between files.
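Two of these steps are mechanical enough to sketch: rewriting relative dates during consolidation and pruning the index. This is a rough illustration under stated assumptions, not the leaked code. The function names, the IndexEntry shape, the index line format, and the idea that entries are ordered most important first are all hypothetical, and string length is used as a stand-in for byte size.

```typescript
// Hypothetical sketch of two dream-cycle steps.

// Consolidate: replace a few relative date words with ISO dates (UTC).
function toAbsoluteDates(text: string, today: Date): string {
  const offsets: Record<string, number> = { yesterday: -1, today: 0, tomorrow: 1 };
  return text.replace(/\b(yesterday|today|tomorrow)\b/gi, (word) => {
    const d = new Date(today.getTime());
    d.setUTCDate(d.getUTCDate() + offsets[word.toLowerCase()]);
    return d.toISOString().slice(0, 10);
  });
}

interface IndexEntry {
  file: string; // topic file the entry points to
  summary: string; // one-line description shown in the index
}

// Prune: drop pointers to files that no longer exist, then keep entries
// in order until the size budget is used up.
function pruneIndex(
  entries: IndexEntry[],
  existingFiles: Set<string>,
  maxSize: number,
): IndexEntry[] {
  const live = entries.filter((e) => existingFiles.has(e.file));
  const kept: IndexEntry[] = [];
  let size = 0;
  for (const e of live) {
    // Character count approximates bytes for mostly-ASCII markdown.
    const lineSize = `- ${e.file}: ${e.summary}\n`.length;
    if (size + lineSize > maxSize) break;
    size += lineSize;
    kept.push(e);
  }
  return kept;
}
```

With today set to 2026-04-01, toAbsoluteDates("Deploy failed yesterday", ...) yields "Deploy failed 2026-03-31", so the memory stays correct no matter when it is read back.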
No RAG. No vector embeddings. No Pinecone. Just markdown files in a directory with an index. Every memory has a type, a frontmatter header, and lives in its own file.
Role, goals, preferences. Tailor future behavior to who the user is.
"Don't do X" / "Keep doing Y". Lead with the rule, then Why + How to apply.
Ongoing work, deadlines, decisions. Always convert relative dates to absolute.
Pointers to external systems — Linear projects, Grafana dashboards, Slack channels.
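A minimal sketch of what one such memory file could look like in code, assuming a simple key: value frontmatter block. The four type labels and the name field are hypothetical; the article describes four kinds of memory and a frontmatter header but does not give the exact field names.

```typescript
// Hypothetical sketch of a one-file-per-memory format.
type MemoryType = "user" | "feedback" | "project" | "reference"; // assumed labels

interface Memory {
  name: string;
  type: MemoryType;
  body: string;
}

// Write a memory as frontmatter plus a markdown body.
function serializeMemory(m: Memory): string {
  return `---\nname: ${m.name}\ntype: ${m.type}\n---\n${m.body}\n`;
}

// Read it back, rejecting files without a header or with an unknown type.
function parseMemory(text: string): Memory {
  const match = /^---\n([\s\S]*?)\n---\n([\s\S]*)$/.exec(text);
  if (!match) throw new Error("missing frontmatter");
  const fields = new Map<string, string>();
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  const type = fields.get("type");
  if (type !== "user" && type !== "feedback" && type !== "project" && type !== "reference") {
    throw new Error(`unknown memory type: ${type}`);
  }
  return { name: fields.get("name") ?? "", type, body: match[2].trim() };
}
```

Because the file is plain text, the same memory can be opened and corrected by hand, which is the property the article highlights.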
Every capability is a discrete, permission-gated tool.
Practical architecture lessons from Claude Code. If you're building an AI agent, start here.
No vector DB. No RAG pipeline. LLMs read/write text natively. Dream Mode handles maintenance. Human-readable — you can inspect and edit memories manually.
Build the maintenance loop (consolidation), not a better database. An index file under 25KB + periodic pruning beats embeddings.
Each tool independently testable. Permissions are granular (auto/ask/deny). The prompt teaches the model when and how. Tools dynamically enabled/disabled via feature flags.
The prompt field is the magic. Claude Code's tool prompts tell the model when to use it, when NOT to use it, and what to watch out for.
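A minimal sketch of a tool registry built along these lines. The ToolDefinition shape and both function names are hypothetical. The three permission levels and the feature-flag gating come from the article.

```typescript
// Hypothetical sketch of permission-gated, flag-gated tools.
type Permission = "auto" | "ask" | "deny";

interface ToolDefinition {
  name: string;
  prompt: string; // tells the model when to use it, when not to, and pitfalls
  permission: Permission;
  featureFlag?: string; // tool is hidden unless this flag is enabled
  run: (input: string) => string;
}

// Only tools with no flag, or with an enabled flag, are offered to the model.
function enabledTools(tools: ToolDefinition[], flags: Set<string>): ToolDefinition[] {
  return tools.filter((t) => t.featureFlag === undefined || flags.has(t.featureFlag));
}

// "auto" runs immediately, "ask" needs approval, "deny" never runs.
function invoke(tool: ToolDefinition, input: string, userApproves: () => boolean): string {
  if (tool.permission === "deny") throw new Error(`${tool.name} is denied`);
  if (tool.permission === "ask" && !userApproves()) {
    throw new Error(`${tool.name} was not approved`);
  }
  return tool.run(input);
}
```

Each tool stays testable on its own because run is a plain function, while the prompt string carries the guidance the model reads.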
Agents don't call each other directly. They coordinate through shared task CRUD. Any agent can pick up any task. Progress visible to all. No complex message routing needed.
Role-based agents beat general-purpose ones. Read-only agents (Explore, Plan) have no write access, so they cannot break your code.
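A minimal sketch of coordination through a shared task list, assuming an in-memory board. The TaskBoard class, its method names, and the status values are hypothetical; the article only states that agents share task CRUD rather than calling each other.

```typescript
// Hypothetical sketch of a shared task board.
type TaskStatus = "pending" | "in_progress" | "done";

interface Task {
  id: number;
  title: string;
  status: TaskStatus;
  owner?: string; // agent currently working on the task
}

class TaskBoard {
  private tasks: Task[] = [];
  private nextId = 1;

  create(title: string): Task {
    const task: Task = { id: this.nextId++, title, status: "pending" };
    this.tasks.push(task);
    return task;
  }

  // Any agent can pick up the oldest unclaimed task.
  claimNext(agent: string): Task | undefined {
    const task = this.tasks.find((t) => t.status === "pending");
    if (task) {
      task.status = "in_progress";
      task.owner = agent;
    }
    return task;
  }

  complete(id: number): void {
    const task = this.tasks.find((t) => t.id === id);
    if (!task) throw new Error(`unknown task ${id}`);
    task.status = "done";
  }

  // Progress is visible to every agent through the same list.
  list(): readonly Task[] {
    return this.tasks;
  }
}
```

No agent needs to know another agent exists. The board is the only interface, which is what removes the message routing.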
Every action classified by reversibility × blast radius. Reversible + low blast → do freely. Irreversible + high blast → always confirm. Authorization for one instance doesn't authorize all future instances.
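A minimal sketch of that two-axis check, with one-time approvals so that confirming one action does not authorize the next. The article only defines the two corner cases. Treating the mixed cases (reversible but high blast, irreversible but low blast) as "confirm" is an assumption, as are all the names below.

```typescript
// Hypothetical sketch of a reversibility x blast-radius gate.
type Decision = "proceed" | "confirm";

interface Action {
  id: string; // unique per instance, not per kind of action
  reversible: boolean;
  blastRadius: "low" | "high";
}

// Only reversible, low-blast actions run without asking.
// Mixed cases fall back to "confirm" (a conservative assumption).
function decide(action: Action): Decision {
  return action.reversible && action.blastRadius === "low" ? "proceed" : "confirm";
}

// Approvals are keyed by action instance and are used up when consumed.
class Approvals {
  private granted = new Set<string>();

  grant(actionId: string): void {
    this.granted.add(actionId);
  }

  // Returns true once per grant; a second call for the same id returns false.
  consume(actionId: string): boolean {
    return this.granted.delete(actionId);
  }
}

function mayRun(action: Action, approvals: Approvals): boolean {
  return decide(action) === "proceed" || approvals.consume(action.id);
}
```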
"Measure twice, cut once" — Claude Code's exact words. Users trust the agent more → give more autonomy over time.
System prompt split at a boundary marker. Static portion (rules, tools, style) is identical across all users → cacheable globally. Dynamic portion (memory, env, MCP) changes per turn.
With Anthropic's prompt caching, the static portion of the 914-line prompt is billed at the discounted cache-read rate after the first call instead of the full input price. Massive cost savings.
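A minimal sketch of the split, assuming a literal marker string. The marker text and function names are hypothetical. The output uses the system-block shape of Anthropic's Messages API, where cache_control on a block marks the prefix up to that block as cacheable. Whether Claude Code builds its request exactly this way is not confirmed by the article.

```typescript
// Hypothetical sketch of a static/dynamic prompt split.
const BOUNDARY = "<<DYNAMIC_BOUNDARY>>"; // assumed marker text

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

function buildPrompt(staticRules: string, dynamicContext: string): string {
  return staticRules + BOUNDARY + dynamicContext;
}

// Everything before the marker is static; everything after changes per turn.
function splitPrompt(prompt: string): { staticPart: string; dynamicPart: string } {
  const idx = prompt.indexOf(BOUNDARY);
  if (idx === -1) return { staticPart: prompt, dynamicPart: "" };
  return {
    staticPart: prompt.slice(0, idx),
    dynamicPart: prompt.slice(idx + BOUNDARY.length),
  };
}

// The static block carries the cache marker; the dynamic block does not.
function toSystemBlocks(prompt: string): SystemBlock[] {
  const { staticPart, dynamicPart } = splitPrompt(prompt);
  return [
    { type: "text", text: staticPart, cache_control: { type: "ephemeral" } },
    { type: "text", text: dynamicPart },
  ];
}
```

Two users with different memory and environment sections still produce a byte-identical first block, and that identical prefix is what the cache can reuse.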
Reconstructed from src/constants/prompts.ts (914 lines). The prompt is split into 10 sections with a static/dynamic boundary for cache optimization.
You are an interactive agent that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user. IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes.
- Don't add features, refactor, or make "improvements" beyond what was asked.
- Don't add error handling for scenarios that can't happen.
- Don't create abstractions for one-time operations.
- Default to writing NO comments. Only add when the WHY is non-obvious.
- Don't explain WHAT the code does — well-named identifiers do that.
- Before reporting complete, verify it actually works.
- Report outcomes faithfully: if tests fail, say so.
Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail. Write in flowing prose. Avoid fragments, excessive em dashes, symbols. Only use tables for short enumerable facts.
You are an agent for Claude Code. Given the user's message, use the tools available to complete the task. Complete it fully — don't gold-plate, but don't leave it half-done.
Notes:
- Agent threads always have their cwd reset between bash calls — use absolute paths
- Share relevant file paths (always absolute) in your final response
- Avoid emojis
Animal-themed codenames found in feature flags, code comments, and the undercover forbidden-terms list.
Capybara: Appears in buddy species, code comments, and model name obfuscation. Important enough that the string is encoded via String.fromCharCode() to avoid build scanners.
Tengu: Feature flag prefix, as in tengu_kairos_cron and tengu_hive_evidence. Next-gen feature set referenced in GrowthBook.
Numbat: "Remove this section when we launch numbat", found in the output efficiency section of prompts.ts.
Hardcoded by the Safeguards team. The source code comment names the owners: David Forsythe, Kyla Guru. This cannot be overridden by user prompts.
/* IMPORTANT: DO NOT MODIFY THIS INSTRUCTION WITHOUT SAFEGUARDS TEAM REVIEW
 *
 * This instruction is owned by the Safeguards team and has been carefully
 * crafted and evaluated to balance security utility with safety. Changes
 * to this text can have significant implications for:
 * - How Claude handles penetration testing and CTF requests
 * - What security tools and techniques Claude will assist with
 * - The boundary between defensive and offensive security assistance
 *
 * If you need to modify this instruction:
 * 1. Contact the Safeguards team (David Forsythe, Kyla Guru)
 * 2. Ensure proper evaluation of the changes
 * 3. Get explicit approval before merging
 */
IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes.
Controlled via Bun's feature() macro. Dead code elimination strips them from public builds at the AST level.
| Flag | What It Enables | Status |
|---|---|---|
| BUDDY | Virtual pet companion system | Unreleased |
| KAIROS | Full proactive persistent assistant | Unreleased |
| KAIROS_BRIEF | SendUserMessage tool only | Partial |
| PROACTIVE | Earlier proactive iteration | Unreleased |
| CACHED_MICROCOMPACT | Context compression optimization | Internal |
| VERIFICATION_AGENT | Auto-verification subagent | Unreleased |
| EXPERIMENTAL_SKILL_SEARCH | AI-powered skill discovery | Experimental |
| TOKEN_BUDGET | Token budget management | Unreleased |
| UDS_INBOX | Unix Domain Socket messaging | Unreleased |
| AGENT_TRIGGERS | Remote trigger API | Partial (/loop) |
The leak spawned an entire ecosystem of mirrors, analysis, and inspired projects.
This project — complete analysis, tool definitions, hidden features, full system prompt reconstruction, architecture patterns.
Sigrid Jin's Python rewrite of Claude Code internals. Also working on a Rust version. Deep technical analysis.
Community-maintained mirror with active discussion and breakdown threads in issues.
English architecture deep-dive on DEV Community. Covers tool system, agent orchestration, and memory.
Alex Kim's technical blog post with detailed breakdown of the leak timeline and source structure.