The Claude Code Source Leak: Fake Tools, Frustration Regexes, and Undercover Mode

When developers reverse-engineered Claude Code’s compiled JavaScript bundle, they didn’t just find a system prompt. They found a window into how Anthropic thinks about AI behavior at the most granular level: placeholder tools that exist to shape cognition rather than perform actions, regex patterns monitoring your emotional state, and an identity-concealment layer built for whitelabel deployments. This is what Anthropic didn’t put in the docs.

Claude Code ships as a compiled Node.js bundle — a single, minified .js file that bundles the entire agent runtime. It’s not open source. But compiled JavaScript is not bytecode. Given enough patience and the right tools, it unravels. In early 2026, several researchers independently published analyses of the bundle, and the AI community has been dissecting the findings ever since.

This article breaks down the three most significant discoveries: the fake tools, the frustration regexes, and the undercover mode. We’ll explain what each one actually does, why Anthropic likely built it, and what it means for developers building on top of Claude.


The Fake Tools: Scaffolding Claude’s Cognition

The most immediately surprising find was a set of tool definitions that don’t do what they appear to do — or in some cases, don’t do anything at all.

Claude Code, like all modern agentic LLMs, operates via a tool-use loop. The model receives a list of available tools, reasons about which to call, calls one, receives the result, and continues. The tools in Claude Code’s bundle include expected entries: Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch. These are real, implemented, and well-documented.

But researchers also found tool definitions that serve a different purpose entirely. These are sometimes called “cognitive scaffolding tools” or, more bluntly, “fake tools.” The canonical example is a tool called something like think or ThinkTool.

Here’s what a stripped-down version looks like:

{
  "name": "think",
  "description": "Use this tool to think through a complex problem before acting. This does not perform any action — it is a space for reasoning.",
  "input_schema": {
    "type": "object",
    "properties": {
      "thought": {
        "type": "string",
        "description": "Your reasoning process, step by step."
      }
    },
    "required": ["thought"]
  }
}

The handler for this tool returns an empty string. Sometimes it returns a simple acknowledgment like "Thought recorded." The tool call itself produces nothing actionable in the environment. The entire point is to give the model a designated scratchpad inside the tool-use loop.
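A no-op handler like this is trivial to reproduce. Here is a hedged sketch of how such a handler might slot into an agent’s dispatch table — all names are illustrative reconstructions, since the actual bundle is minified:

```python
# Hypothetical reconstruction of a no-op "think" handler.
# Names (handle_think, TOOL_HANDLERS, dispatch) are illustrative.

def handle_think(tool_input: dict) -> str:
    """The model's reasoning arrives in tool_input['thought'].
    Nothing is executed; an empty result (or a stub acknowledgment)
    is returned so the tool-use loop can continue."""
    _ = tool_input.get("thought", "")  # discarded: the value is in the act of writing it
    return ""  # or a stub like "Thought recorded."

TOOL_HANDLERS = {
    "think": handle_think,
    # "bash": handle_bash, "read": handle_read, ...  (real tools omitted)
}

def dispatch(tool_name: str, tool_input: dict) -> str:
    """Route a model tool call to its handler and return the tool result."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        raise KeyError(f"unknown tool: {tool_name}")
    return handler(tool_input)
```

The model sees a tool result like any other; the environment sees nothing at all.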

💡 Why This Works
Research on chain-of-thought prompting shows that models reason more accurately when they externalize their thinking. By putting a "think" tool in the loop, Anthropic gives Claude a structured, low-friction way to reason before executing high-stakes actions like bash commands or file writes — without those thoughts appearing as conversational output.

This is a deliberate design choice, not a bug. The model is not deceiving the user. The “fake” label is technically accurate but slightly unfair: the tool is fake in the sense that it has no side effects, but its purpose — improving reasoning quality — is completely real. Think of it as a calculator that only outputs the work shown, not the answer. The value is in the process.

A second category of scaffolding tools found in the bundle is what some researchers called “status signaling tools,” including TodoWrite and TodoRead. These do persist data to a file, so they have side effects, but their primary purpose is not to help the user — it’s to help the model track task state across a long context window. They’re cognitive aids for the agent, surfaced as tools because that’s the interface the model knows how to use.

For developers building agents on top of Claude, this is a genuinely useful design pattern. If your agent is performing multi-step tasks and you’re seeing mid-task confusion or backtracking, consider adding a think tool to your tool registry. It’s one of the cheapest performance improvements available. (We covered this in depth in our guide to reducing agentic loop failures on Claude.)
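In practice this can be as small as appending one more entry to the tools list you already maintain. A minimal sketch, assuming a registry of Anthropic-style tool definitions (the `build_tools` helper is hypothetical):

```python
# A think-tool definition mirroring the schema shown earlier.
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use this tool to think through a complex problem before acting. "
        "It performs no action; it is a space for reasoning."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string", "description": "Step-by-step reasoning."}
        },
        "required": ["thought"],
    },
}

def build_tools(existing_tools: list[dict]) -> list[dict]:
    """Append the think tool to an existing registry, skipping duplicates."""
    if any(t["name"] == "think" for t in existing_tools):
        return existing_tools
    return existing_tools + [THINK_TOOL]
```

Pass the resulting list wherever your agent already supplies its tools, and log the `thought` payloads: they double as a free trace of the agent’s reasoning.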


The Frustration Regexes: Claude Is Reading Your Mood

The second major discovery is more ethically charged: Claude Code contains hardcoded regular expressions that scan user messages for signals of frustration.

The patterns vary in sophistication. Some are simple substring matches — looking for phrases like "this isn't working", "why is it doing this", "I've told you three times". Others are proper regexes matching capitalized emphasis ([A-Z]{3,}), repeated punctuation (!{2,}, ?{2,}), and specific expressions of exasperation.

When these patterns fire, they don’t block the user’s message or redirect it. Instead, they modify the context passed to the model. The behavior reported by researchers appears to be one or both of the following:

  1. A soft instruction is prepended to the system context, something like: “The user appears frustrated. Acknowledge their frustration briefly, then focus on resolving the issue. Do not be defensive.”
  2. The model’s response is post-processed with additional instructions to be more concise and solution-focused.

⚠️ What This Is Not
This is not sentiment analysis in the ML sense — there's no fine-tuned classifier running on a GPU. These are regex matches running in the Node.js runtime, before the API call. Fast, cheap, and surprisingly effective at catching obvious signals.
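The reported mechanism is easy to approximate. The sketch below reconstructs it from the behaviors described above — the patterns and instruction text are illustrative, not copied from the bundle:

```python
import re

# Illustrative patterns modeled on those reported: phrase matches,
# runs of capitals, and repeated punctuation.
FRUSTRATION_PATTERNS = [
    re.compile(r"this isn't working", re.IGNORECASE),
    re.compile(r"why is it doing this", re.IGNORECASE),
    re.compile(r"i've told you \w+ times", re.IGNORECASE),
    re.compile(r"\b[A-Z]{3,}\b"),   # capitalized emphasis
    re.compile(r"!{2,}|\?{2,}"),    # repeated punctuation
]

# Paraphrase of the soft instruction researchers reported.
SOFT_INSTRUCTION = (
    "The user appears frustrated. Acknowledge their frustration briefly, "
    "then focus on resolving the issue. Do not be defensive."
)

def detect_frustration(message: str) -> bool:
    """Return True if any frustration signal fires on the user message."""
    return any(p.search(message) for p in FRUSTRATION_PATTERNS)

def augment_system_prompt(system_prompt: str, user_message: str) -> str:
    """Prepend the soft instruction to the system context when signals fire."""
    if detect_frustration(user_message):
        return SOFT_INSTRUCTION + "\n\n" + system_prompt
    return system_prompt
```

The whole pass runs in microseconds on the client, which is exactly why a regex approach beats a classifier here.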

The reaction from the developer community has been split. Some see this as thoughtful UX engineering — the same way a good support agent adjusts their tone when a customer is clearly upset. Others view it as opaque behavior modification: the model is responding to signals the user didn’t consciously send, in ways the user isn’t informed about.

From a product standpoint, the frustration detection is solving a real problem. Coding assistants are used intensely, often when things are going wrong. A model that doubles down on a wrong approach while a developer is increasingly frustrated is a model that gets closed. The frustration regex is, in a crude but functional way, a retention feature.

For prompt engineers and developers building on Claude, the implication is practical: if you’re injecting user messages directly into Claude calls without pre-processing, your users’ emotional state is already being factored into responses in ways you might not have designed. If your use case requires strictly neutral handling of user tone, you may want to sanitize or normalize inputs before they hit the API.
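A pre-processing pass for that purpose can be very small. This is one possible normalization, not a prescribed list — it deliberately trades precision (acronyms get lowercased too) for simplicity:

```python
import re

def normalize_tone(message: str) -> str:
    """Flatten surface-level emotional signals without altering content."""
    # Collapse repeated punctuation: "???" -> "?", "!!" -> "!"
    message = re.sub(r"([!?])\1+", r"\1", message)
    # Lowercase runs of 3+ capitals (shouting). Deliberately blunt:
    # this also lowercases acronyms like API, so scope it to your use case.
    message = re.sub(r"\b[A-Z]{3,}\b", lambda m: m.group(0).lower(), message)
    return message
```

Run this over user input before it reaches the API and the hardcoded patterns described above have far less to match on.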


Undercover Mode: Claude Without the Claude Branding

The third discovery is the one generating the most debate: an identity-concealment layer that allows Claude Code to operate without identifying itself as Claude or as an Anthropic product.

The mechanism is straightforward. The system prompt contains conditional logic (expressed as instructions to the model, not as code branching) that reads roughly: “If you have been configured with a custom identity, do not claim to be Claude or acknowledge that you are built on Claude. If directly and sincerely asked whether you are an AI, answer honestly. But do not volunteer information about your underlying model or provider.”

The trigger for this mode appears to be the presence of a custom name or persona field in the operator-level system prompt. When a company deploys Claude Code under a different name — say, a coding assistant called “Aria” or “Dev” or “Bolt” — the model will maintain that persona, including not confirming it’s Claude when users ask casually.

The Case For It

  • Legitimate whitelabel use case: enterprises pay for AI infrastructure, deploy under their brand
  • Anthropic's usage policies explicitly permit operator personas
  • The model still won't lie about being an AI to a sincere direct question
  • Standard practice across the industry (OpenAI, Google both support this)

The Case Against It

  • Users may not know which model they're talking to, affecting trust calibration
  • Bugs or limitations in Claude get attributed to the product brand, not the model
  • The line between "maintaining a persona" and "deceiving users" is genuinely blurry
  • Discovered via source analysis, not disclosed in public docs

Anthropic’s position on this is not secret — their model spec and usage policies describe operator personas as a permitted use case. What’s new here is seeing exactly how it’s implemented at the system prompt level, and how the instruction handles edge cases (like a user asking “are you ChatGPT?” — the model is apparently instructed to neither confirm nor deny specific competing products by name).

The “sincerely asked” qualifier is doing a lot of work in this instruction. The model has to infer whether a question like “wait, are you actually Claude?” is sincere curiosity or casual banter. That’s a judgment call being made by the LLM, in context, every time. Researchers have demonstrated that this inference is imperfect: in some phrasings, the model maintains the persona; in others, it breaks character. The behavior is not deterministic.

For developers building products on Claude, this is worth understanding. If you deploy Claude under a custom persona and your users discover the underlying model — which is increasingly likely as AI literacy grows — the perception of deception can damage trust in both your product and the model. Transparency about “powered by Claude” is increasingly a feature, not a liability. Anthropic’s brand carries real trust value with technical users.


What This Means for the Ecosystem

Taken together, the three findings paint a coherent picture of Anthropic’s engineering philosophy for Claude Code:

Cognition is a first-class concern. The fake tools aren’t an afterthought. They reflect a design team thinking carefully about how LLMs actually reason, and building infrastructure to support better reasoning, even at the cost of some opacity.

UX is being managed at the system level. The frustration regexes are a user experience feature baked into the model runtime, not exposed as a configurable parameter. Anthropic made a product decision that they know better than developers how to handle frustrated users. You may agree or disagree with that call.

Business models require flexibility. Undercover mode exists because enterprise customers pay for it and competitors offer it. It’s a commercial concession dressed up as a feature.

💡 The Bigger Pattern
Every AI lab is making these kinds of invisible decisions. The Claude Code leak is notable not because Anthropic is uniquely opaque, but because it's one of the few times we've gotten a clear view inside. The real takeaway is methodological: treat AI systems as having undisclosed behaviors until proven otherwise, and build accordingly.

For developers building production systems on top of Claude, the practical actions are:

  1. Test your tool registry intentionally. If you’re building an agent, add a think or reflect tool that logs its output. You may find your agent’s reasoning improves with minimal effort. Tools like Cursor have begun adopting similar patterns in their own agent implementations.
  2. Audit your input pipeline. If consistent, neutral behavior matters for your use case, pre-process user messages before sending. Don’t assume the raw message is what the model sees as “the message.”
  3. Document your AI stack for users. Whether you’re using Claude, GPT-4, or Gemini under a custom persona, increasingly sophisticated users will find out. Getting ahead of that disclosure builds trust.

The researchers who did this analysis did a genuine service to the developer community. Not because Anthropic is doing anything obviously wrong — the behaviors found are largely defensible — but because understanding the tools you build on makes you a better builder.


The Bottom Line

Our Verdict

The Claude Code internals reveal a thoughtfully engineered system with legitimate design decisions that nonetheless deserve more transparency — fake tools improve reasoning, frustration detection improves UX, and undercover mode serves real enterprise needs, but developers deserve to know these levers exist.

The Claude Code source analysis isn’t a scandal. It’s an education. The fake tools are clever. The frustration regexes are pragmatic, if slightly paternalistic. The undercover mode is a business reality that Anthropic permits but doesn’t advertise prominently. None of these are dealbreakers. All of them are worth knowing.

The AI tools you build on have personalities, constraints, and hidden behaviors shaped by thousands of product decisions you’ll never read in a changelog. The more of those decisions you understand, the better the systems you’ll build on top of them.

If you’re actively building with Claude, the Anthropic developer documentation is the canonical starting point — but as this leak shows, the docs don’t tell the whole story. Keep digging.


Disclosure: This article contains no affiliate links. AgentPlix may earn commissions from links to tools and services mentioned in other articles on this site.