ADR: detection and response for MCP agents, proven at Uber scale
A May 2026 paper from Uber describes a production EDR-style system for MCP agents: full causal telemetry, two-tier detection, and offline red-teaming, running on 7,200+ hosts for ten months.
What is this?
On May 17, 2026, a team from Uber posted ADR: An Agentic Detection System for Enterprise Agentic AI Security (arXiv:2605.17380, MLSys 2026 poster). It describes what the authors call the first large-scale, production-proven framework for monitoring AI agents that operate through the Model Context Protocol (MCP) — the now-common setup where a host like Cursor, Cline, or Claude Code talks to remote MCP servers exposing file I/O, API calls, and database access.
The motivation is a detection gap most teams will recognize. Conventional endpoint tools (EDR) see outcomes — a file was written, an API was called — but not the agent’s prompts, reasoning, or the causal chain linking an instruction to an action. That makes it impossible to tell a malicious exfiltration apart from a benign config save. ADR’s claim is that agent security needs telemetry built for agents, plus detection cheap enough to run at scale. The code and benchmark are open-sourced on GitHub.
How it works
ADR has three components, each mapped to a familiar SOC role:
Component Role (SOC analogy) What it does
-------------- ------------------------ --------------------------------------
ADR Sensor Visibility / EDR agent Parses local stores of agentic tools
(Cursor, Cline, Claude Code SQLite /
JSONL caches) to rebuild full sessions:
user prompts, agent reasoning, MCP
tool calls, environmental context
ADR Detector Tiered triage + analyst Tier 1: cheap, high-recall LLM triage
("when in doubt, escalate"); Tier 2:
deep reasoning with enterprise context
+ curated threat intel
ADR Explorer Internal red team Offline engine that generates and
tests attack variants pre-deployment,
feeding discoveries back into Tier 2
The design principle is causal telemetry, not just outcomes: the Sensor records why something happened (prompt → reasoning → tool execution), closing the observability gap. The two-tier Detector exists because running an LLM over every event at production volume (10,000+ sessions/day) is prohibitively expensive — Tier 1 short-circuits clearly benign activity and only escalates suspicious events to the costlier Tier 2. The offline Explorer continuously discovers “hard examples” and curates them into a threat-intelligence repository, so detection improves before attacks reach production rather than after.
No exploit is reproduced here, and none is needed to understand the architecture; the canonical reference is the paper and its repository.
Why it matters
The distinctive part is that this is not a lab prototype. ADR has run at Uber for over ten months, reaching 7,200+ unique hosts and processing 10,000+ agent sessions per day. In production it surfaced hundreds of credential exposures across 26 categories that had been shared outside the enterprise network, and informed a shift-left prevention layer reporting 97.2% precision (206 detected across 212 unique credentials from hundreds of thousands of sessions).
On the released ADR-Bench (302 tasks — 42 malicious, 260 benign — across 133 MCP servers and 17 attack techniques), ADR reports zero false positives while detecting 67% of attacks, beating three baselines (ALRPHFS, GuardAgent, LlamaFirewall) by 2–4× in F1. The authors deliberately optimize for precision: baseline methods produced up to 40 false positives out of 260 benign tasks, which in an enterprise means 40 expensive, pointless incident-response triggers. That trade-off — catching two-thirds of attacks with no false alarms versus catching more but drowning the SOC — is the practical lesson for anyone deploying agent monitoring.
Defenses
ADR is itself the defense, so the takeaways are about how to instrument and evaluate agent monitoring.
- Capture the causal chain, not just outcomes. File-write and API logs can’t distinguish exfiltration from a config save. Reconstruct prompt → reasoning → tool call so behavior is interpretable. The Sensor does this by parsing the agentic tool’s own local caches.
- Tier your detection for cost. Running a reasoning LLM on every event doesn’t scale. Use cheap high-recall triage first and reserve expensive context-aware analysis for flagged events.
- Red-team offline, continuously. Generate hard attack variants before deployment and feed them back into detection logic, instead of waiting for novel attacks to appear in production.
- Treat credential exfiltration as a first-class signal. The deployment’s biggest real-world finding was credentials leaving the network — monitor for it explicitly across many formats.
- Optimize precision for production. A guardrail that floods the SOC with false positives won’t survive contact with operations. Report your operating point (recall and false positives), not just a headline detection rate.
Status
| Item | Reference | Date | Notes |
|---|---|---|---|
| ADR system | arXiv:2605.17380 | 2026-05-17 | Sensor + two-tier Detector + offline Explorer |
| Production deployment | Uber | ~10 months | 7,200+ hosts, 10,000+ sessions/day, 97.2% precision |
| ADR-Bench + code | github.com/uber/ADR | 2026-05 | 302 tasks, 133 MCP servers, 17 techniques |
| Reported result | ADR-Bench | 2026-05 | 0 false positives, 67% detection, 2–4× F1 over baselines |
The framing to keep is that this is a vendor-internal deployment with self-reported numbers, presented as an MLSys poster rather than an independent evaluation. The durable, transferable point is architectural: MCP agents create an observability gap that conventional EDR doesn’t fill, and closing it requires agent-native telemetry, cost-aware tiered detection, and a feedback loop that red-teams the detector before attackers do.