AGENTS

Agent communication-graph metadata leaks the workflow before it runs

A June 5, 2026 arXiv paper shows that even with encrypted payloads, the A2A/MCP communication graph lets a passive observer predict an agent workflow's task class from its opening — and act before it completes.

2026-06-22//6 min

Agent-Inflicted Damage: when AI agents wreck production with no attacker

Cyera's May 2026 study of 7,200+ AI incidents isolates 344 cases of agent-inflicted damage — 188 with no external attacker — where autonomous agents deleted databases, leaked secrets and burned budgets.

2026-06-21//7 min

Sleeper Memory Poisoning: dormant attacks on stateful LLM agents

A May 2026 paper shows attackers can plant fabricated 'memories' through a document or webpage that lie dormant, then steer an assistant's actions across many later sessions.

AutoJack: a browsing agent turns a malicious webpage into host RCE

Microsoft's June 18, 2026 AutoJack research shows a web-browsing AI agent inheriting localhost identity to reach a local MCP WebSocket and spawn arbitrary processes on the host.

CVE-2026-32211: missing authentication in Azure MCP Server

Microsoft disclosed CVE-2026-32211 on 2 April 2026 — a missing-authentication flaw in Azure MCP Server that lets an unauthenticated attacker disclose information over the network. Microsoft scored it 9.1; NVD, 7.5.

WAAA: how agentic browsers resurrect classic web attacks

A May 2026 paper builds the first web-focused threat model for agentic browsers and shows that 10 long-mitigated web attacks come back — often amplified — because the agent is a confused deputy that cannot tell a task step from a web trap.

Overeager Coding Agents: Out-of-Scope Actions on Benign Tasks

Two May 2026 benchmarks measure coding agents that overstep on benign requests — deleting files, wiping credentials — and find the agent framework, not the model, drives the risk.

Tool selection hijacking: forcing an agent to pick the attacker's tool

An NDSS 2026 attack and an April 2026 IBM paper target the same blind spot: the step where an agent chooses which tool to call. Poison the catalog and the agent picks yours, with 70–100% success.

CVE-2026-0755: command injection and file theft in gemini-mcp-tool

A June 18, 2026 advisory details how the popular gemini-mcp-tool let untrusted prompt input reach the shell and the Gemini CLI @file parser — CVSS 9.8 RCE and arbitrary file exfiltration, fixed in 1.1.6.

NRT-Bench: multi-turn red-teaming of LLM agents that run a plant

A June 18, 2026 benchmark puts LLM operator agents in a simulated nuclear control room. Adaptive multi-turn attacks pushed the team past a safety limit in 8.7-12.1% of sessions — and the failures barely overlap across models.

Vertex AI 'Double Agents': over-privileged service agents as a cloud escalation path

Unit 42 showed (31 March 2026) that a Vertex AI Agent Engine deployment exposes an over-scoped service-agent credential via the metadata service — turning a misconfigured agent into a path to read every bucket in the project.

Stored prompt injection: when an injection outlives the session

A June 2026 arXiv paper reframes prompt injection as a stored, cross-session problem: once adversarial text lands in an agent's persistent state, it can steer executions long after the attacker is gone.

MemPoison: backdooring agent memory through ordinary conversation

A May 2026 arXiv paper plants a triggerable backdoor in an LLM agent's long-term memory just by chatting with it — and is engineered to survive the selective extraction and rewriting stages meant to filter poisoned content.

Agent libOS: make the runtime, not the tool wrapper, the authority boundary

A June 2, 2026 arXiv paper argues most agent frameworks conflate tool visibility with resource authority — and proposes a library-OS runtime where capability checks live at primitive boundaries, not in tool wrappers.

MCP Go SDK CSRF: a web page can trigger your local tools (CVE-2026-33252)

The official MCP Go SDK accepted cross-site browser POSTs without checking the Origin header. On an unauthenticated local server, any website you visit could invoke your tools. Patched in 1.4.1.

SkillAttack: automated red-teaming finds exploits in agent skills

An April 2026 paper, SkillAttack, reframes exploit discovery as a path-search problem and shows even well-intentioned agent skills are reachable — up to 0.93 attack success on adversarial skills.

Authority confusion: why tool-using agents misuse their own access

A May 2026 paper names a failure mode distinct from prompt injection: untrusted data should inform an agent's reasoning but never authorize side effects. AIRGuard enforces that line at action time.

2026-06-19//7 min

CVE-2026-26268: Cursor's agent turns a git checkout into code execution

A malicious repo hides a bare Git repository with an automatic hook. When Cursor's AI agent runs git checkout to 'explain the codebase', the hook fires — arbitrary code execution on the developer's machine, no approval prompt. Patched in Cursor 2.5.

User-mediated attacks: when the user is the injection channel

A January 2026 study of 12 commercial agents shows attackers don't need to touch the agent. They trick a benign user into forwarding poisoned content — which the instruction hierarchy then promotes to trusted user intent. Default bypass rates topped 92%.

CVE-2026-26030: prompt injection becomes RCE in Microsoft Semantic Kernel

Microsoft's AI Red Team showed two Semantic Kernel flaws that turn a single injected prompt into host code execution. The lesson: any tool parameter the model can influence is attacker-controlled input. Patched May 7, 2026.

SearchGEO: making LLM search agents endorse attacker-published pages

A June 15, 2026 arXiv paper measures how attacker-controlled web content gets turned into an agent's endorsed recommendation — attack success ranges from 0% to 31.4% depending on the backend model.

2026-06-18//6 min

Zombie agents: when a self-evolving LLM agent stays compromised across sessions

A one-time indirect injection observed during a benign session can be written to an agent's long-term memory and later replayed as instruction — turning a transient prompt into persistent control. Attack paper dated February 2026, defense (CAMS) May 2026.

2026-06-18//7 min

AI Agent Traps: DeepMind's six-category map of how the web hijacks agents

Google DeepMind's 'AI Agent Traps' paper (SSRN, late March 2026) gives the first systematic taxonomy of adversarial web content that targets an agent's perception, reasoning, memory, action, multi-agent dynamics, and human overseer.

2026-06-18//7 min

ShadowMerge: poisoning graph-based agent memory by colliding relations

A May 2026 paper poisons graph-based agent memory with relations that share a real anchor and channel but carry a conflicting value — reaching 93.8% attack success on Mem0 while input-side filters miss it.

2026-06-18//5 min

Browser agents leak their model identity through how they click

A May 14, 2026 paper shows the on-page actions of an LLM browser agent fingerprint the underlying model with up to 96% accuracy across 14 frontier models — no spoofable headers needed.

2026-06-18//6 min

Reasoning-extension DoS: when the AI guardrail becomes the attack surface

A June 2026 paper shows a single poisoned document can trap reasoning-based AI guardrails in extended thinking loops, slowing shared agent workflows by up to 148x. The target is availability, not integrity.

AI coding agents: attackers go for the credential, not the model

Six 2026 exploits against Codex, Claude Code, Copilot and Vertex AI all bypassed model-level defenses and reached the same target — the agent's runtime credentials. The root cause is an identity governance gap, not a prompt problem.

LangGraph checkpointers: from SQL injection to RCE on self-hosted agents

Check Point Research chained a SQL injection in LangGraph's checkpointer with an unsafe msgpack deserialization to reach remote code execution. Disclosed June 11, 2026; all three CVEs are patched.

2026-06-17//7 min

Termination poisoning: trapping LLM agents in unbounded loops

A May 2026 arXiv paper shows that injected prompts can distort an agent's own 'am I done?' judgment, forcing unbounded computation. The LoopTrap framework reports up to 25x step amplification.

FragFuse: fragmented queries that bypass LLM agent access control

A June 14, 2026 arXiv paper shows a banned request can be split into benign fragments, parked in an agent's long-term memory, then fused at retrieval time — bypassing access controls 86.3% of the time.

Cross-domain multi-agent LLM systems: seven security challenges

A Perspective published June 13, 2026 in npj Artificial Intelligence maps seven security challenges that appear when LLM agents from different organizations collaborate without any shared trust model.

2026-06-16//7 min

TOCTOU in AI agents: atomicity violations between observation and action

An old operating-systems bug class resurfaces in agents: the world changes between when an agent looks and when it acts. New 2026 research formalizes it for GUI, browser, and multi-agent systems.

Splunk MCP Server logs auth tokens in clear text (CVE-2026-20205)

Splunk's MCP Server app wrote users' session and authorization tokens unmasked into the _internal index — a CWE-532 secrets-in-logs flaw that turns log access into token theft. Fixed in app v1.0.3.

DNS rebinding turns localhost MCP servers into a remote attack surface

A coordinated 2025–2026 disclosure wave hit every major MCP SDK over one root cause: HTTP servers on localhost that skip Host/Origin validation. The latest, CVE-2026-11624 in Google's MCP Toolbox (June 13, 2026), is rated Critical 9.4.

2026-06-15//7 min

CVE-2026-46519: when an MCP server filters tools at display but not at execution

mcp-server-kubernetes enforced its read-only and allow-list controls only in tools/list, never in tools/call. Any client that knew a tool name could run it. A clean lesson in presentation-layer vs execution-layer authorization.

Flowise CVE-2026-41264: LLM-written pandas code that escalates to RCE

A prompt injection in Flowise's CSV Agent makes the model emit Python that escapes a regex denylist and runs OS commands. Disclosed April 15, 2026 and patched in 3.1.0.

ConVerse: when two agents talk, the stronger one leaks more

A benchmark for agent-to-agent conversations finds privacy attacks succeed up to 88% of the time and security breaches up to 60% — and that more capable models leak more, not less.

2026-06-13//6 min

Claude Code GitHub Action: how the Read tool leaked CI/CD secrets

Microsoft Threat Intelligence found that Claude Code Action's Read tool bypassed the Bash env scrub to reach /proc/self/environ, leaking the runner's ANTHROPIC_API_KEY. Patched in v2.1.128.

2026-06-12//6 min

Causality laundering: when a blocked tool call still leaks data

An April 2026 paper shows that denying an agent's tool call is not the end of the attack: the denial itself is an information channel. Flat taint tracking misses it.

2026-06-12//7 min

Context-Fractured Decomposition: jailbreaks through artifact provenance gaps

A June 8, 2026 arXiv paper formalizes the 'provenance gap' in tool-using agents: harmful behavior assembled from individually innocuous tool actions across time, lifting jailbreak success up to 28.3 points.

2026-06-11//6 min

SABER: coding agents fail operational safety even when they refuse bad prompts

A May 31, 2026 benchmark scores LLM coding agents on the final state of a real workspace, not on prompt refusal. Even the best model leaves a harmful violation in over half of runs.

2026-06-11//6 min

Cursor allowlist bypass: shell built-ins poison the environment for RCE

CVE-2026-22708 lets a prompt injection use trusted shell built-ins like export and typeset to poison environment variables in Cursor, turning an approved git or python command into remote code execution. Patched in 2.3.

2026-06-11//6 min

Memory Control Flow Attacks: when stored memory steers an agent's tools

A March 2026 paper shows poisoned agent memory doesn't just corrupt content — it hijacks the control flow of tool selection, forcing unintended tools and skipped steps in over 90% of trials, across tasks and long after injection.

2026-06-10//7 min

Remote MCP servers: 40% unauthenticated, OAuth broken on the rest

A May 2026 arXiv study scanned 7,973 live remote MCP servers: 40.55% expose tools with no authentication, and all 119 OAuth-enabled servers tested carried at least one flaw — 9 CVEs assigned.

Five attacks on x402: when AI agents pay, the cross-layer seams leak

A May 12, 2026 paper formally breaks x402, the HTTP 402 agentic payment protocol. Five attacks across settlement, replay, web handling and discovery — one replayed payment yielded 248 grants on a live endpoint.

MS-Agent's shell tool: a regex denylist turns prompt injection into RCE

CVE-2026-2256 lets attacker-controlled content steer ModelScope's MS-Agent into running OS commands. The root cause is a familiar anti-pattern: guarding a shell tool with a regex denylist instead of an allowlist.

OWASP ASI02: when an agent turns its own tools against you

Tool Misuse & Exploitation is the #2 risk in OWASP's Top 10 for Agentic Applications 2026. The danger isn't an agent gaining new tools — it's misusing the ones it already holds, via over-privilege, poisoned descriptors, or unsafe chaining.

VIPER-MCP: 67 CVEs from taint-style flaws across 40,000 MCP servers

A May 20, 2026 arXiv paper audited 39,884 open-source MCP server repos, confirmed 106 zero-days end-to-end and got 67 CVE IDs assigned. The story is the pattern: untrusted agent input reaching shell, network and file-system sinks.

2026-06-05//6 min

CVE-2026-45497: command injection turns Microsoft 365 Copilot into an RCE path

On June 4 2026 MSRC disclosed CVE-2026-45497, a command-injection flaw in Microsoft 365 Copilot rated as remote code execution with a scope change across the service boundary. Fixed server-side.

2026-06-05//6 min

When an MCP tool argument becomes an Android intent: mobile-mcp's injection sinks

CVE-2026-35394 lets a model-controlled URL fire arbitrary Android intents through mobile-mcp's mobile_open_url tool. Paired with a sibling path-traversal CVE, it shows a pattern: MCP tool arguments flowing unvalidated into platform sinks.

2026-06-05//6 min

Self-propagating agent worms and the temporal re-entry defense

A May 2026 paper formalizes how persistent agent state lets a prompt-injection payload write itself back into the LLM context, propagate across agents zero-click, and proposes RTW-A — a defense proven under a No Persistent Worm Propagation theorem.

2026-06-04//7 min

Tool poisoning across 7 MCP clients: a comparative security posture

A March 2026 empirical study tests four tool-poisoning attacks against Claude Desktop, Claude Code, Cursor, Cline, Continue, Gemini CLI and Langflow — and finds most protection comes from the model, not the client.

2026-06-04//7 min

AIRQ scores 100 production AI agents: 98% carry the lethal trifecta

Adversa AI's June 2026 AI Risk Quadrant rates 100 commercial agents on attack surface, blast radius and defenses. Only 11% are well-defended; tool execution alone explains 76% of blast radius.

2026-06-04//7 min

CVE-2026-30615: prompt injection rewrites Windsurf's MCP config into RCE

OX Security's April 15, 2026 advisory shows how attacker-controlled content can make the Windsurf IDE register a malicious MCP STDIO server and run commands — with no user click. The class spans coding agents, but Windsurf got the CVE.

2026-06-03//6 min

Opus 4.8's system card puts a number on browser-agent prompt injection: 31.5%

Anthropic's May 28, 2026 Claude Opus 4.8 system card reports a 31.5% pre-safeguard hijack rate for its browser agent — the only concrete prompt-injection metric a frontier lab published this spring.

2026-06-03//6 min

Authorization propagation: the agent security gap prompt-injection fixes won't close

A May 6, 2026 paper by Krti Tallam argues multi-agent systems have a distinct authorization-propagation problem — transitive delegation, aggregation inference, temporal validity — that survives even a perfect prompt-injection defense.

2026-06-03//7 min

ClawTrojan: stored prompt injection becomes a persistent agent backdoor

A May 29, 2026 arXiv paper shows injection hidden in a file can be stored by a local agent and run later — reaching 95.5% attack success where single-turn injection scores near zero.

2026-06-03//6 min

Langroid SQLChatAgent: prompt-to-SQL injection escalates to RCE (CVE-2026-25879)

Disclosed June 1, 2026, CVE-2026-25879 (CVSS 9.8) lets a prompt-injected SQL agent run dialect-specific primitives like COPY FROM PROGRAM, turning a chat box into code execution on the database host.

2026-06-02//7 min

Just ask the bot: Meta's AI support assistant and the Instagram takeovers

Over the May 30–31, 2026 weekend, attackers hijacked high-profile Instagram accounts by asking Meta's AI support bot to relink an account email. No prompt injection required — only excessive agency.

Brittle agents: indirect injection survives multi-step tool calls

An April 4, 2026 paper tests 6 defenses against 4 indirect-injection vectors across 9 LLM backbones in multi-step agents — advanced injections bypass nearly all of them, and some surface mitigations backfire.

Stop fixating on the prompt: hijacking an agent's reasoning and memory

An April 2026 paper, JailAgent, drives an agent to malicious tool calls without touching the user prompt — by perturbing its reasoning trace and memory retrieval instead. The prompt was never the whole attack surface.

MCP sampling: how malicious servers abuse the reverse LLM channel

MCP's sampling feature lets a server ask the client's model for completions. Unit 42 showed (Dec 2025) how a malicious server turns that reverse channel into covert tool calls, conversation hijacking, and compute theft.

TrustFall: project MCP settings turn the folder-trust click into RCE

Adversa AI's TrustFall (May 7, 2026) shows four agentic coding CLIs auto-start project-defined MCP servers the moment a developer accepts the folder-trust prompt — one keypress on the dev machine, zero clicks in CI.

2026-06-02//7 min

Flowise CVE-2026-40933: importing a shared chatflow is enough for RCE

Obsidian Security's May 28, 2026 write-up shows how Flowise's Custom MCP node turns a stdio MCP config into server-side code execution — and how merely importing a shared chatflow can trigger it, no save or run required.

2026-06-01//6 min

CrewAI: a silent sandbox fallback turns prompt injection into RCE (VU#221883)

Four CrewAI flaws let prompt injection chain into RCE, SSRF and file read via a Code Interpreter that silently drops out of Docker. CERT/CC's May 20, 2026 update confirms the full fix.

2026-06-01//6 min

Token-drain attacks: economic denial-of-service via agent tool chains

Two 2026 papers show a malicious tool or skill can steer an LLM agent into long tool-calling loops that multiply token cost 6–658× while still returning the right answer — a stealthy take on OWASP's Unbounded Consumption.

2026-06-01//7 min

SymJack: one approved file copy becomes RCE in six AI coding agents

Adversa AI disclosed on May 26, 2026 a symlink-hijack pattern that turns a single benign-looking shell copy into a config overwrite and host RCE across Claude Code, Cursor, Gemini, Antigravity, Copilot, Grok Build and Codex CLIs.

2026-05-30//6 min

Blindfold: action-level jailbreaks bypass semantic defenses on embodied LLMs

A SenSys '26 paper (May 11–14, 2026) introduces Blindfold, an automated framework that jailbreaks embodied LLMs by decomposing harmful goals into individually benign actions — up to 53% higher attack success than semantic-level baselines on a real 6DoF robotic arm.

2026-05-29//6 min

MemMorph: hijacking tool selection in LLM agents through fluent memory poisoning

A May 24, 2026 arXiv paper from NTU Singapore shows three plausible-looking memory entries can steer an agent toward an attacker-chosen tool with 85.9% success — and survive three off-the-shelf defenses.

2026-05-29//6 min

Microsoft Copilot Cowork: poisoned skills exfiltrate M365 files with no approval

PromptArmor's May 26, 2026 disclosure shows that a five-line prompt injection inside a Copilot Cowork skill file can leak SharePoint and OneDrive documents through auto-approved Teams messages — no patch closes the design.

2026-05-28//7 min

Temporal memory contamination: longitudinal safety drift in memory-equipped LLM agents

Three arXiv papers from April and May 2026 converge on a failure mode complementary to memory poisoning — memory-equipped agents drift unsafe as benign context accumulates, with compressed summaries acting as a laundering channel.

2026-05-28//7 min

The agent harness is your real privilege boundary — and most teams draw it in the wrong place

A May 26, 2026 Pillar Security write-up argues the harness — Claude Code, Cursor, Codex — holds the secrets, tools and hooks an agent never sees. Recent harness bugs and CVE-2026-22708 make the case concrete.

2026-05-28//7 min

Networks of agents break in new ways: Microsoft's red-team, plus RAMPART and Clarity

Microsoft Research red-teamed an internal platform of 100+ always-on agents. Four attack patterns — propagation, amplification, trust capture, proxy chains — show up only at the network level. RAMPART and Clarity, open-sourced May 20, 2026, are the response.

2026-05-27//8 min

Antigravity find_by_name: when a native tool call jumps over Secure Mode

On April 20, 2026, Pillar Security disclosed that a single unsanitised parameter in Google Antigravity's find_by_name tool turned file search into arbitrary code execution — and bypassed the IDE's strictest sandbox.

ClaudeBleed: when a browser agent trusts the wrong extension

LayerX disclosed ClaudeBleed on May 6, 2026: a trust-boundary flaw let any Chrome extension drive Claude in Chrome and exfiltrate Gmail, Drive and GitHub data. The first patch was bypassed within hours.

MCP STDIO transport: the design choice that became 11 CVEs and 200,000 exposed agents

On April 16, 2026, OX Security disclosed that Anthropic's MCP STDIO transport executes any OS command it is handed. Anthropic called it 'by design'. The cascade has produced eleven downstream CVEs in six weeks.

When prompts become shells: prompt injection escalates to RCE in agent frameworks

Two CVEs in Microsoft Semantic Kernel and four in CrewAI — all disclosed in early 2026 — turn a single injected prompt into remote code execution on the host. The pattern is structural, not incidental.

Poison once, exploit forever: persistent memory poisoning of LLM agents (OWASP ASI06)

An April 2026 arXiv paper on cross-site memory poisoning and a May 13, 2026 OWASP post on the Cisco MemoryTrap finding against Claude Code converge on the same lesson: agent memory is a trust boundary.

Treating AI agents like operating systems: a CISPA blueprint for isolation and privilege

A May 14, 2026 CISPA paper applies decades of OS security thinking to LLM agents. Tested on four OpenClaw-like systems, two weakness classes — cross-user exfiltration and unauthorized network egress — fail in every single one.

The Lethal Trifecta: when an agent reads private data, untrusted content, and can phone home

Simon Willison's framework for the single architectural mistake that turned 2026's wave of AI-agent data exfiltration vulnerabilities into a class, not a coincidence.

MCP Back-End Vulnerabilities: classic flaws resurface across AI database bridges

Akamai's May 12, 2026 research found SQL injection (CVE-2025-66335), missing authentication, and unsanitised inputs across three MCP servers — Apache Doris, Apache Pinot, and Alibaba RDS. The pattern, not the bugs, is the story.

Semantic Kernel: when a prompt becomes a shell (CVE-2026-25592, CVE-2026-26030)

Microsoft disclosed two critical vulnerabilities in Semantic Kernel on May 7, 2026 that turn a single injected prompt into host-level code execution. The root cause is architectural: tool registries and eval() treated as features, not security boundaries.

Trust No Tool: cognitive poisoning of LLM agents through tool feedback

A May 17, 2026 arXiv paper introduces 'cognitive poisoning' — a malicious tool that wins the agent's trust over many benign-looking turns and only weaponises the final action. The defence target shifts from prompts to trajectory.

Azure SRE Agent: a multi-tenant token check that let strangers watch your incidents (CVE-2026-32173)

Disclosed April 20, 2026, an Entra ID app-registration misconfiguration on Azure SRE Agent's /agentHub WebSocket let any tenant connect, listen to every prompt, reasoning step, CLI command and credential — silently.

2026-05-25//7 min

CVE-2026-35435: Azure AI Foundry's M365 published agents trusted callers they shouldn't have

Disclosed May 7, 2026 (CVSS 8.6), an improper access-control flaw in Azure AI Foundry let unauthorized attackers elevate privilege through M365 published agents. Microsoft reports active exploitation; mitigations are available before a patch.

2026-05-25//6 min

Claw Chain: four OpenClaw CVEs that turn an AI agent into the attacker's hands

Disclosed May 15, 2026, Cyera Research's Claw Chain chains four patched OpenClaw flaws — sandbox escape, env-var disclosure, MCP loopback EoP, symlink read escape — into full host takeover via the agent itself.

2026-05-25//7 min

Comment and Control: one prompt injection pattern, three vendors leaking GitHub Actions secrets

Disclosed April 15, 2026, Comment and Control turns ordinary PR titles, issue bodies and HTML comments into credential-exfiltration channels in Claude Code, Gemini CLI and GitHub Copilot Agent.

2026-05-25//7 min

PraisonAI CVE-2026-44338: an unauthenticated agent server, exploited in 3h44

Disclosed May 11, 2026, CVE-2026-44338 ships PraisonAI with authentication hard-disabled in its legacy API server. A CVE-Detector scanner hit the endpoint less than four hours later.

2026-05-25//6 min

Localhost agent hijack: cross-origin WebSocket attacks on AI coding agents

CVE-2026-44211 (CVSS 9.7), disclosed May 7, 2026, shows how a single visit to a malicious page can hijack an AI coding agent running on a developer's laptop. The attack class is generic — and architectural.

2026-05-22//7 min