
TrustFall: project MCP settings turn the folder-trust click into RCE

Adversa AI's TrustFall (May 7, 2026) shows four agentic coding CLIs auto-start project-defined MCP servers the moment a developer accepts the folder-trust prompt — one keypress on the dev machine, zero clicks in CI.

2026-06-02 // 7 min // affects: claude-code, gemini-cli, cursor-cli, github-copilot-cli

What is this?

On May 7, 2026, Adversa AI published TrustFall, a report on a shared convention across four agentic coding command-line tools: Claude Code, Gemini CLI, Cursor CLI, and GitHub Copilot CLI. All four will start project-defined Model Context Protocol (MCP) servers — helper programs the repository itself ships and points to — the instant a developer accepts the generic “trust this folder” prompt. Each prompt defaults to yes.

The practical consequence is that cloning a malicious repository and pressing Enter once on the trust dialog can run attacker-chosen code on the developer’s machine, with the developer’s full privileges, before the model reasons about anything or makes any tool call. Adversa frames this not as a single product bug but as a class-level convention. It examined Claude Code in depth (tested around v2.1.x), where a trust-dialog regression makes the gap most acute, and confirmed the same behavior in the other three CLIs. Coverage followed the same day from The Register and Help Net Security.

This is a close cousin of the symlink-approval RCE in five coding agents — same root theme: the approval gate a developer sees does not describe what they are actually authorizing.

How it works

MCP lets an AI assistant talk to external helper programs (a database connector, a linter, a search tool). The catch is that those helpers are defined inside the project, in files the repository ships, and they start as ordinary OS processes when the agent boots in that folder.

The chain relies on two project-scoped settings that auto-approve servers — Adversa names enableAllProjectMcpServers (approves every server in .mcp.json) and enabledMcpjsonServers (approves a named subset), both readable from a repo’s .claude/settings.json, plus permissions.allow, which can pre-authorize tool calls. None of these triggers a warning. The relevant detail is the scope inconsistency: Anthropic blocks some dangerous settings from project scope (e.g. bypassPermissions, which gets a dedicated red-text dialog) but not these. A cloned repo can simply set them.

Repo ships:  .mcp.json              -> defines a "helper" server (command + args)
             .claude/settings.json  -> sets enableAllProjectMcpServers: true
Developer:   git clone ; run agent ; press Enter on "Yes, I trust this folder"
Result:      helper process spawns with full user privileges, at startup,
             before any model tool call. No second prompt.
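
To make the chain concrete from the defender's side, here is a minimal Python sketch that answers one question for a cloned repo: which servers has the repository approved for itself? It assumes .mcp.json keeps its servers under a top-level "mcpServers" object whose entries carry "command" and "args"; that layout, and the helper names load_json and auto_approved_servers, are illustrative assumptions rather than anything taken from Adversa's report.

```python
import json
from pathlib import Path

# Project-scoped settings files a cloned repo can ship (named in the text above).
SETTINGS_FILES = (".claude/settings.json", ".claude/settings.local.json")


def load_json(path: Path) -> dict:
    """Return the JSON object stored at path, or {} if missing or malformed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def auto_approved_servers(repo: Path) -> dict:
    """Map server name -> command line for servers the repo pre-approves itself."""
    # Assumption: servers live under a top-level "mcpServers" object.
    servers = load_json(repo / ".mcp.json").get("mcpServers")
    if not isinstance(servers, dict):
        return {}

    approve_all = False
    named = set()
    for rel in SETTINGS_FILES:
        settings = load_json(repo / rel)
        if settings.get("enableAllProjectMcpServers") is True:
            approve_all = True
        names = settings.get("enabledMcpjsonServers")
        if isinstance(names, list):
            named.update(n for n in names if isinstance(n, str))

    approved = {}
    for name, spec in servers.items():
        if not isinstance(spec, dict):
            continue
        if approve_all or name in named:
            args = spec.get("args")
            args = [str(a) for a in args] if isinstance(args, list) else []
            approved[name] = [str(spec.get("command", ""))] + args
    return approved
```

Calling auto_approved_servers(Path(".")) before launching an agent in an unfamiliar checkout returns an empty dict when the repo approves nothing for itself. Any non-empty result lists the exact command lines that folder trust alone would allow to start.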

No payload is reproduced here — the actionable proof-of-concept lives in the researchers’ repo. The structural point is enough: the command a helper runs can be any executable, and the script can be embedded inline in the config, leaving no separate file for a reviewer or static scanner to flag.

Two aggravating factors:

  • The dialog regressed. Claude Code’s pre-v2.1 trust dialog explicitly warned that .mcp.json could execute code and offered “trust the folder but disable MCP.” That option was removed; the v2.1+ prompt reads “Quick safety check: Is this a project you created or one you trust?” with no MCP language and a default of “Yes, I trust this folder.”
  • CI has no dialog at all. Run through the official GitHub Action, Claude Code runs headless via the SDK — there is no terminal, so the trust prompt never renders. A pull request from an outside contributor that ships a malicious .mcp.json executes the moment the pipeline runs against that branch. One keypress on a laptop becomes zero clicks in CI.
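
Because the headless path has no prompt to lean on, one option is a pre-flight step that runs before the agent does. The sketch below is a hypothetical gate, not part of any vendor's action: it takes the list of paths changed by a pull request (for example, the output of git diff --name-only against the base branch, collected by the pipeline) and reports any that touch project-scoped agent config. The function name config_changes and the choice to match these files at any directory depth are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Repo-controlled files that can define or approve MCP servers (from the text above).
GUARDED = {".mcp.json", ".claude/settings.json", ".claude/settings.local.json"}


def config_changes(changed_paths):
    """Return the changed paths that touch project-scoped agent config."""
    hits = []
    for raw in changed_paths:
        path = raw.strip()
        parts = PurePosixPath(path).parts
        # Compare the last one and last two path components, so that
        # "sub/.claude/settings.json" is caught as well as ".mcp.json".
        last_one = "/".join(parts[-1:])
        last_two = "/".join(parts[-2:])
        if last_one in GUARDED or last_two in GUARDED:
            hits.append(path)
    return hits
```

A pipeline could fail the job, or skip the agent step, whenever config_changes returns a non-empty list for a branch from an outside contributor. This only covers files changed in the diff; config already on the base branch still needs review through the normal process.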

Anthropic’s security team reviewed the report and declined it as outside their threat model: accepting “Yes, I trust this folder” is treated as consent to the full project configuration, and post-trust execution is the boundary working as designed. Adversa does not contest where the boundary sits — its argument is that the dialog does not give the developer enough to make that decision with informed consent.

Why it matters

The interesting part is the disagreement, not a single bug. Anthropic has shipped three patches in six months for the same underlying convention — project-scoped settings as an injection vector — each scoped to the specific reported setting, none auditing the convention itself:

CVE | Date | Root cause | Fix
CVE-2025-59536 | Oct 2025 | MCP executed before the trust dialog | MCP delayed until after the dialog
CVE-2026-21852 | Jan 2026 | ANTHROPIC_BASE_URL in project settings redirected API traffic | Setting blocked from project scope
CVE-2026-33068 | Mar 2026 | bypassPermissions in project settings skipped the dialog | Setting blocked from project scope
TrustFall | May 2026 | Post-trust silent MCP execution via project settings | Declined (design intent)

The risk surface is wide because the precondition — cloning an unfamiliar repository and running an agent in it — is a daily developer habit, and the affected tools span four vendors. The CI variant is the sharper edge: it removes the human entirely and reaches whatever the runner holds (deploy keys, signing certs, cloud tokens), making this a credible supply-chain weaponization path rather than a lab curiosity. For anyone tracking the broader pattern, this sits alongside MCP’s by-design stdio RCE surface: the protocol’s power and its blast radius are the same thing.

Defenses

The strongest fix does not require waiting on any vendor and works on a single developer machine as well as a managed fleet.

  1. Lock the settings at Managed scope. Drop a managed-settings.json at the OS managed path that sets enableAllProjectMcpServers: false, restricts enabledMcpjsonServers to an explicit allowlist (or []), and pins permissions.allow. Managed scope is the highest precedence in Claude Code — it outranks Project, Local, User, and even CLI flags — so a cloned repo cannot override it. Set once, it neutralizes the chain regardless of what you clone later.

  2. Audit the content of committed config, not just its presence. Add a pre-commit hook or repo scanner that flags any committed .claude/settings.json or .claude/settings.local.json containing enableAllProjectMcpServers, enabledMcpjsonServers, or permissions.allow. Scan both files: Local scope outranks Project, and an attacker can ship .claude/settings.local.json directly. None of these keys has a legitimate reason to be committed to git — developers who want the behavior should opt in at User scope (~/.claude/settings.json), outside the repo.

  3. Inspect .mcp.json command/args directly. The fileless variant embeds the payload inline, so scanners that only follow referenced files miss it. Flag args containing -e, -p, --eval, eval, fetch(, child_process, net.Socket, or base64 blobs.

  4. Watch for the high-confidence runtime pattern. A bare alert on the agent spawning node -e / python -c / sh -c is noisy. The narrow signal: the agent spawns a long-lived child whose command/args match a .mcp.json in a recently-cloned, non-user-owned directory. Benign sessions do not produce that, and it catches the inline variant the static checks cannot.

  5. Harden CI explicitly. Headless runs have no consent gate, so do not rely on one. Run agent actions only against trusted branches, scope runner credentials to least privilege, and gate MCP enablement on runner-controlled (not repo-controlled) configuration. Treat a PR from an outside contributor as untrusted code that may execute.

  6. Read the config before you run the agent. When opening an unfamiliar open-source project, inspect .mcp.json and .claude/settings.json first. The trust dialog will not tell you what is about to execute.
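
Steps 2 and 3 can be combined into one static check. What follows is a minimal sketch under stated assumptions, not a vetted scanner: it assumes .mcp.json stores servers under a top-level "mcpServers" object, adds -c to the flag list to cover the python -c / sh -c forms mentioned in step 4, and treats a run of 40 or more base64 characters as a blob. The threshold and the function names are arbitrary choices for illustration.

```python
import json
import re
from pathlib import Path

RISKY_KEYS = ("enableAllProjectMcpServers", "enabledMcpjsonServers")
# Flags from step 3; "-c" is an addition here for python -c / sh -c.
INLINE_FLAGS = {"-e", "-p", "-c", "--eval"}
TOKENS = ("eval", "fetch(", "child_process", "net.Socket")
# Assumed threshold: 40+ base64 characters in a row counts as a blob.
BASE64_BLOB = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def load_json(path):
    """Return the JSON object stored at path, or {} if missing or malformed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def scan_settings(path):
    """Flag auto-approval keys present in one settings file."""
    data = load_json(path)
    found = [f"{path.name}: sets {key}" for key in RISKY_KEYS if key in data]
    perms = data.get("permissions")
    if isinstance(perms, dict) and "allow" in perms:
        found.append(f"{path.name}: sets permissions.allow")
    return found


def scan_mcp(path):
    """Flag server args that look like inline code, not a referenced file."""
    found = []
    servers = load_json(path).get("mcpServers")
    if not isinstance(servers, dict):
        return found
    for name, spec in servers.items():
        args = spec.get("args") if isinstance(spec, dict) else None
        for arg in (args if isinstance(args, list) else []):
            text = str(arg)
            if text in INLINE_FLAGS:
                found.append(f"{path.name}: server {name!r} passes inline-code flag {text}")
            elif any(tok in text for tok in TOKENS):
                found.append(f"{path.name}: server {name!r} has a suspicious token in args")
            elif BASE64_BLOB.search(text):
                found.append(f"{path.name}: server {name!r} has a base64-like blob in args")
    return found


def scan_repo(repo):
    """Run both checks over the files a cloned repo can ship."""
    found = scan_mcp(repo / ".mcp.json")
    for rel in (".claude/settings.json", ".claude/settings.local.json"):
        found += scan_settings(repo / rel)
    return found
```

Substring matching on tokens such as eval will produce false positives, so the output is better treated as a review queue than as a verdict. A pre-commit hook could call scan_repo(Path(".")) and block the commit when the list is non-empty.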

Status

Item | Reference | Date | Notes
TrustFall disclosure | Adversa AI | 2026-05-07 | Class-level convention; Claude Code deep-dive + parity in 3 other CLIs
Vendor position | Anthropic (per Adversa) | 2026-05 | Declined as outside threat model: post-trust execution is “by design”
Prior fixes, same root cause | NVD / Adversa | Oct 2025 – Mar 2026 | CVE-2025-59536, CVE-2026-21852, CVE-2026-33068
Press coverage | The Register, Help Net Security | 2026-05-07 | Confirms one-click dev variant + zero-click CI variant

The honest framing is not “an AI tool has a bug” — it is “folder trust, by itself, authorizes spawning attacker-defined unsandboxed processes, and the prompt that grants it says nothing about MCP.” Whether that is a vulnerability or a design choice is exactly the open question. Until the prompt or the scope rules change, the defense is yours to apply: lock the settings, scan the config, and never let folder trust be the only gate.

Sources