Anthropic accidentally leaked Claude Code's full source. The code itself matters less than the unreleased feature flags: autonomous daemons, multi-agent coordination, and a stealth mode that strips AI attribution. Here's what it means for governance.
On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic had accidentally shipped the complete source code of Claude Code in their npm package. A .map file that should have been excluded from the build contained a link to an R2 storage bucket with 1,900 TypeScript files - 512,000 lines of unobfuscated source code.
Within hours, the community had mirrored the entire codebase on GitHub. Anthropic pushed an npm update to remove the source maps and deleted earlier versions, but the code was already public.
This is Anthropic's second major leak in five days. The first, on March 26, was a CMS configuration error that exposed details about an unreleased model codenamed Mythos (Claude Opus 5) and 3,000 unpublished assets.
The source code itself is interesting but not groundbreaking - it's a well-engineered TypeScript client wrapping API calls, as you'd expect. What's far more significant is what the unreleased feature flags reveal about where Anthropic is taking AI agents.
Note
The leaked codebase received 1,100+ GitHub stars and 1,900+ forks within hours of discovery. Anthropic has not publicly commented on the feature flags.
Buried in the codebase are feature flags for capabilities that haven't been announced. Each one represents a step toward more autonomous, less supervised AI agents:
Kairos - Autonomous Daemon Mode. Not a session-based tool you invoke, but a persistent process that runs continuously. The code references "nightly dreaming phases" for memory consolidation and "proactive behavior" where the agent decides to act without being prompted. This is the shift from "tool" to "teammate" - an agent that works while you sleep.
Coordinator Mode - Multi-Agent Orchestration. A system that spawns parallel worker agents and manages them from a central orchestrator. This isn't one agent doing one task - it's a fleet of agents working on different parts of your codebase simultaneously, sharing context through a prompt cache.
Buddy System - Paired Agent Collaboration. Initially built as an April Fools feature (complete with 18 species including a capybara, rarity tiers, and a 1% shiny chance), the code suggests it's evolving into a real paired-agent review system.
Undercover Mode - Stealth Commits. The most concerning discovery. Auto-activated for Anthropic employees on public repos, this mode strips AI attribution from commits. No git trailers, no co-author tags, no indication that AI wrote the code. And according to the source, there's no off switch.
Agent Triggers - Event-Driven Autonomous Actions. Multi-agent teams triggered by events, not human prompts. The agent watches for conditions and acts when they're met - without asking permission first.
Warning
Undercover Mode strips AI attribution from commits with no off switch. Every tool that relies on detecting AI authorship via git metadata is blind when this mode is active.
Of all the leaked features, Undercover Mode has the most immediate governance implications.
Today, most tools that detect AI-generated code rely on metadata: git trailers ("Co-Authored-By: Claude"), commit message patterns, or author tags. This is the foundation of agent detection in most governance tools, including our own detectAgent() function.
Undercover Mode removes all of this metadata. When it's active, a Claude Code commit looks indistinguishable from a human commit - same author, same format, no attribution.
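To make the blind spot concrete, here is a minimal sketch of metadata-based detection. It is not the actual detectAgent() implementation: the function name, the CommitMeta shape, and the pattern list are all assumptions made for illustration.

```typescript
// Illustrative sketch; every name here is hypothetical, not a real tool's API.
interface CommitMeta {
  authorName: string;
  message: string; // full commit message, trailers included
}

// Assumed, non-exhaustive list of agent names to look for.
const AGENT_PATTERNS: RegExp[] = [/claude/i, /copilot/i, /codex/i, /gemini/i];

// Collect the value of every Co-Authored-By trailer in a commit message.
function coAuthors(message: string): string[] {
  const values: string[] = [];
  for (const line of message.split("\n")) {
    const value = /^co-authored-by:\s*(.+)$/i.exec(line.trim())?.[1];
    if (value !== undefined) values.push(value);
  }
  return values;
}

// True when the author name or any co-author trailer names a known agent.
function detectAgentFromMetadata(commit: CommitMeta): boolean {
  const candidates = [commit.authorName, ...coAuthors(commit.message)];
  return candidates.some((c) => AGENT_PATTERNS.some((p) => p.test(c)));
}

const attributed: CommitMeta = {
  authorName: "Jane Dev",
  message: "Fix cache eviction\n\nCo-Authored-By: Claude <agent@example.com>",
};
// The same commit with the trailer stripped.
const stripped: CommitMeta = { authorName: "Jane Dev", message: "Fix cache eviction" };

console.log(detectAgentFromMetadata(attributed)); // true
console.log(detectAgentFromMetadata(stripped)); // false
```

The second call is the whole problem: once the trailer is gone, a check like this has nothing left to match.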
This means governance tools need a second detection layer: behavioral analysis. Instead of reading metadata, you analyze patterns such as commit timing, file velocity, and the shape of the changes themselves.
None of these signals are definitive alone. But combined, they create a behavioral fingerprint that's hard to fake even when metadata is stripped.
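One way to combine such signals is a weighted score. The sketch below uses the three signals this article names (commit timing, file velocity, change patterns), but the thresholds, caps, and weights are placeholder assumptions, not tuned values, and the function names are hypothetical.

```typescript
// Illustrative sketch; thresholds, caps, and weights are assumptions.
interface CommitStats {
  timestampMs: number; // commit time in epoch milliseconds
  filesChanged: number;
  linesChanged: number;
}

const RAPID_GAP_MS = 60000; // assumed: under 60s between commits counts as rapid
const FILES_PER_HOUR_CAP = 120; // assumed saturation point for file velocity
const LINES_PER_COMMIT_CAP = 500; // assumed saturation point for change size

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Commit timing: fraction of consecutive commits that land under RAPID_GAP_MS apart.
function timingSignal(commits: CommitStats[]): number {
  if (commits.length < 2) return 0;
  const times = commits.map((c) => c.timestampMs).sort((a, b) => a - b);
  let rapid = 0;
  let prev: number | undefined;
  for (const t of times) {
    if (prev !== undefined && t - prev < RAPID_GAP_MS) rapid++;
    prev = t;
  }
  return rapid / (times.length - 1);
}

// File velocity: files touched per hour over the observed span, scaled to [0, 1].
function velocitySignal(commits: CommitStats[]): number {
  if (commits.length < 2) return 0;
  const times = commits.map((c) => c.timestampMs);
  const spanHours = (Math.max(...times) - Math.min(...times)) / 3600000;
  if (spanHours <= 0) return 1; // several commits at the same instant
  const files = commits.reduce((sum, c) => sum + c.filesChanged, 0);
  return clamp01(files / spanHours / FILES_PER_HOUR_CAP);
}

// Change pattern: average lines changed per commit, scaled to [0, 1].
function changeSizeSignal(commits: CommitStats[]): number {
  if (commits.length === 0) return 0;
  const lines = commits.reduce((sum, c) => sum + c.linesChanged, 0);
  return clamp01(lines / commits.length / LINES_PER_COMMIT_CAP);
}

// Weighted combination in [0, 1]; higher means more agent-like behavior.
function behavioralScore(commits: CommitStats[]): number {
  return (
    0.4 * timingSignal(commits) +
    0.3 * velocitySignal(commits) +
    0.3 * changeSizeSignal(commits)
  );
}

// Four large commits 30 seconds apart versus three small commits 2 hours apart.
const burst: CommitStats[] = [0, 30, 60, 90].map((s) => ({
  timestampMs: s * 1000, filesChanged: 10, linesChanged: 400,
}));
const steady: CommitStats[] = [0, 2, 4].map((h) => ({
  timestampMs: h * 3600000, filesChanged: 2, linesChanged: 40,
}));

console.log(behavioralScore(burst).toFixed(2)); // "0.94"
console.log(behavioralScore(steady).toFixed(2)); // "0.03"
```

A score like this is evidence, not proof. A human squashing a rebase can look bursty too, which is why it works best alongside metadata checks rather than in place of them.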
The lesson: never rely on self-reported attribution for governance decisions. The model provider that generates the code has every incentive to minimize friction for their users, including making AI attribution optional or invisible.
Tip
Behavioral agent detection (commit timing, file velocity, change patterns) holds up better than metadata-based detection once attribution is removed. Metadata can be stripped or faked. Behavior is harder to hide.
Kairos mode represents a fundamental shift in how AI agents interact with your codebase. Current agents are session-based: you invoke Claude Code, it does a task, you close the terminal. The blast radius is limited to one session's work.
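A daemon-mode agent is different: it runs continuously, carries memory from one day to the next, and decides to act without being prompted.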
This changes the governance model from "review what was asked" to "review what the agent decided to do on its own." The second is a much harder problem because you can't predict what the agent will do next.
Combine Kairos with Coordinator Mode (multi-agent fleet management) and you have a scenario where 10 daemon agents, each with their own memory and context, are opening PRs across your monorepo at 3 AM. Each agent thinks its change is safe. None of them knows what the others are doing.
The only way to govern this is automated: risk scoring on every PR, trust tracking per agent, and auto-merge rules that enforce organizational policies regardless of when the change was made or who (or what) made it.
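Here is a rough sketch of how those three pieces could fit together. The thresholds, the moving-average trust update, and every name in it are assumptions for illustration. It is not a description of any particular product's pipeline, and it takes the risk score as an input from some upstream scorer rather than computing it.

```typescript
// Illustrative sketch; thresholds and names are assumptions.
interface PullRequest {
  agentId: string; // who (or what) opened the PR
  riskScore: number; // 0 (safe) to 100 (dangerous), from an upstream scorer
  touchesSensitivePaths: boolean; // e.g. auth, billing, CI config
}

type Decision = "auto-merge" | "human-review" | "block";

const INITIAL_TRUST = 0.5; // assumed starting trust for an unknown agent
const TRUST_ALPHA = 0.2; // assumed weight given to the newest outcome
const trustByAgent = new Map<string, number>();

function getTrust(agentId: string): number {
  return trustByAgent.get(agentId) ?? INITIAL_TRUST;
}

// Exponential moving average: clean merges pull trust toward 1,
// reverts and incidents pull it toward 0.
function recordOutcome(agentId: string, succeeded: boolean): void {
  const target = succeeded ? 1 : 0;
  const updated = (1 - TRUST_ALPHA) * getTrust(agentId) + TRUST_ALPHA * target;
  trustByAgent.set(agentId, updated);
}

// The policy looks only at risk, sensitivity, and track record.
// Time of day never enters into it.
function decide(pr: PullRequest): Decision {
  if (pr.riskScore >= 80) return "block";
  if (pr.touchesSensitivePaths) return "human-review";
  if (pr.riskScore <= 30 && getTrust(pr.agentId) >= 0.7) return "auto-merge";
  return "human-review";
}

const pr: PullRequest = { agentId: "agent-a", riskScore: 12, touchesSensitivePaths: false };
console.log(decide(pr)); // "human-review": agent-a has no track record yet
for (let i = 0; i < 5; i++) recordOutcome("agent-a", true);
console.log(decide(pr)); // "auto-merge": trust is now about 0.84
```

The design choice that matters is that the same rules run on every PR. An agent earns auto-merge through its track record, and a single bad outcome is enough to send it back to human review.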
The leaked roadmap doesn't exist in isolation. All four major AI labs now ship first-party coding agents, and they're all racing toward more autonomy.
A recent DryRun Security study tested three of these agents (Claude, Gemini, and Codex) building applications from scratch. The results: Claude produced 13 vulnerabilities, Gemini 11, Codex 8. Every agent tested shipped security issues.
Many teams today use two or three different agents, and that number is likely to grow. Each agent has different strengths, different failure modes, and different levels of trustworthiness. Governing a multi-agent environment where each agent has its own behavioral patterns and risk profile is the challenge the industry hasn't solved yet.
Note
All four major labs now ship first-party coding agents. Multi-agent codebases are the norm, not the exception. Governance must work across all of them.
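If your team uses AI coding agents today, the leaked roadmap tells you what's coming. Here's how to prepare: add behavioral detection alongside metadata checks, score risk on every PR, track trust per agent, and enforce auto-merge rules that apply no matter who (or what) opened the PR.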
The governance gap between what agents can do and what teams can safely control is widening fast. The leaked roadmap just showed us exactly how wide it's about to get.
Tip
Start with risk scoring on every PR today. When always-on agents arrive, you'll already have the governance infrastructure in place.