INDIRECT INJECTION

23 hacks.

INDIRECT INJECTION MEDIUM NEW

Message-object injection: the serialization gap in AI assistants

Imperva showed (June 10, 2026) that contacts, vCards and location pins get flattened inline into an AI assistant's prompt with no untrusted-content boundary — a structural injection vector, patched in OpenClaw 2026.4.23.

2026-06-21//6 min
INDIRECT INJECTION MEDIUM NEW

TRAP: persuasion techniques turn web agents against their own task

An Oxford benchmark updated on arXiv in June 2026 shows web agents obey Cialdini-style persuasion hidden in page elements, abandoning their task in 25% of cases on average and up to 43% for the weakest model.

2026-06-20//6 min
INDIRECT INJECTION MEDIUM NEW

ChatGPhish: untrusted Markdown turns ChatGPT summaries into phishing

Permiso disclosed ChatGPhish on May 29, 2026: a web page you ask ChatGPT to summarize can render attacker links, fake alerts, QR codes and tracking pixels inside the trusted assistant UI.

2026-06-20//6 min
INDIRECT INJECTION MEDIUM NEW

On-device isn't safer: indirect injection hits local and cloud LLMs alike

Brave's June 8, 2026 research shows indirect prompt injection works identically against a cloud browsing agent (Mozilla Tabstack) and an on-device autocomplete (Cotypist) — local hosting is not a mitigation.

2026-06-19//6 min
INDIRECT INJECTION MEDIUM NEW

Error-path injection: when tool error messages carry implicit authority

A June 2026 paper (VATS) shows that injecting instructions inside tool error messages triples indirect-injection success on frontier agents — up to 100% compliance — because models treat error output as authoritative.

2026-06-19//6 min
INDIRECT INJECTION MEDIUM NEW

MIRAGE: mobile GUI agents fooled by injected user-generated content

A May 2026 study shows VLM-driven mobile GUI agents can't tell trusted interface from user-generated content. Realistic text injected into comments and bios hijacks all five tested agents (23–30% success).

2026-06-17//6 min
INDIRECT INJECTION CRITICAL NEW

LogJack: cloud logs as a prompt-injection channel against debugging agents

An April 2026 benchmark shows LLM debugging agents that read cloud logs and run fixes obey instructions hidden in log lines — verbatim command execution up to 86.2%, RCE on 6 of 8 models, and provider guardrails that miss almost everything.

2026-06-17//6 min
INDIRECT INJECTION CRITICAL NEW

Agentjacking: fake Sentry errors hijack AI coding agents via MCP

Tenet Security's June 2026 research shows an attacker can plant a fake Sentry error that AI coding agents read over MCP and execute, exfiltrating credentials with an 85% success rate across 2,388 exposed orgs.

2026-06-16//7 min
INDIRECT INJECTION MEDIUM NEW

Cross-App Context Poisoning: a rogue ChatGPT app can steer the others

A June 2026 arXiv study shows a malicious ChatGPT app can write into the chat context shared by every connected app through first-party APIs, turning the model into a confused deputy against benign apps.

2026-06-16//6 min
INDIRECT INJECTION MEDIUM NEW

Injection depth in ReAct agents: position beats wording

A June 2026 study of tool-calling ReAct agents finds injection depth—not rhetoric—drives indirect prompt injection: success falls from 60% at the first tool call to 0% by the fourth.

2026-06-15//6 min
INDIRECT INJECTION MEDIUM NEW

DACSI: when retrieved documents fake the system's control signals

A June 8, 2026 paper names a quiet RAG failure mode: untrusted document text impersonating metadata, provenance and policy signals. No 'ignore previous instructions' required — the lesson is that document-authored labels are data, not policy.

2026-06-12//6 min
INDIRECT INJECTION MEDIUM NEW

The Injection Paradox: when a prompt injection backfires and erases a brand in RAG

A June 8, 2026 arXiv preprint shows prompt injections in retrieved documents can backfire in safety-trained Claude models, dropping a brand's recommendation rate from 54% to 0%, which opens a reverse attack against competitors.

2026-06-11//6 min
INDIRECT INJECTION MEDIUM NEW

Decision Hijacking: prompt-injecting the LLM that ranks your search results

A growing body of 2025-2026 research shows that when an LLM re-ranks search or RAG candidates, a few injected lines inside one document can force it to the top — collapsing ranking quality by 60+ NDCG points, with stronger models more vulnerable, not less.

2026-06-07//7 min
INDIRECT INJECTION MEDIUM NEW

AgentRedBench: indirect injection in SaaS agents is an authorization gap

AgentRedBench (June 2026) red-teams LLM agents reading from SaaS tools like Gmail and Jira. Without guards, attack success ran 32–81% across eight frontier models; adding a tool-response classifier reduced it.

2026-06-05//7 min
INDIRECT INJECTION MEDIUM NEW

Description poisoning: the agent channel your benchmarks don't test

A May 2026 AWS Bedrock AgentCore demo and a June 2026 arXiv paper converge on the same blind spot: tool descriptions, read before every call, are an injection channel that infra controls and single-number benchmarks both miss.

2026-06-04//6 min
INDIRECT INJECTION MEDIUM NEW

ChatInject: forging chat-template role tags to bypass the instruction hierarchy

An ICLR 2026 paper shows that wrapping an indirect-injection payload in a model's own chat-template tokens forges a higher-priority role, lifting attack success from 5% to 32% on AgentDojo and to 52% with multi-turn.

2026-06-03//7 min
INDIRECT INJECTION MEDIUM NEW

IPI Arena: a 272k-attack competition finds no agent model immune

Gray Swan's Indirect Prompt Injection Arena, judged with UK AISI and US CAISI, ran 272,000+ attacks against 13 frontier models. Every model was hijacked — and a single universal template broke nine of them.

2026-06-02//7 min
INDIRECT INJECTION MEDIUM NEW

Silent Egress: implicit prompt injection leaks data through URL previews

An eBay study (arXiv, Feb 25, 2026) shows agents that auto-preview URLs can be made to exfiltrate runtime context through tool calls — P(egress)≈0.89, and 95% of leaks leave the visible answer benign.

2026-06-02//7 min
INDIRECT INJECTION MEDIUM NEW

IterInject: when an LLM optimiser writes its own indirect prompt injections

A May 23, 2026 paper closes the loop between payload, diagnoser and LLM optimiser — lifting indirect-injection ASR from near-zero to 33–90% on InjecAgent and compromising 5 of 9 Claude Code targets.

2026-05-28//6 min
INDIRECT INJECTION MEDIUM NEW

GrafanaGhost: indirect prompt injection chained with a URL-parse bug to exfiltrate dashboard data

Noma Security's April 7, 2026 disclosure shows how three modest defects — a stored injection point, a startsWith('/') URL check, and a one-word guardrail bypass — combine into a silent exfiltration path through Grafana's AI assistant.

2026-05-28//6 min
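
The `startsWith('/')` URL check named in the GrafanaGhost entry above can be illustrated with a small Python sketch. It shows one well-known way a leading-slash test can fail: a protocol-relative URL such as `//attacker.example/...` begins with a slash yet resolves to another host. This is an illustration of the general weakness, not a reproduction of the Grafana code or of Noma Security's exploit; `BASE`, `naive_is_relative` and `stricter_is_relative` are hypothetical names.

```python
from urllib.parse import urljoin, urlsplit

BASE = "https://grafana.example/"  # hypothetical dashboard origin (assumption)


def naive_is_relative(url: str) -> bool:
    # Same shape as a startsWith('/') check: any leading slash passes.
    return url.startswith("/")


def stricter_is_relative(url: str) -> bool:
    # Resolve against the base and require scheme and host to stay unchanged.
    # Backslashes are normalised first, since browsers may treat them as slashes.
    resolved = urlsplit(urljoin(BASE, url.replace("\\", "/")))
    base = urlsplit(BASE)
    return (resolved.scheme, resolved.netloc) == (base.scheme, base.netloc)


payload = "//attacker.example/collect?d=secret"  # protocol-relative URL
print(naive_is_relative(payload))       # the prefix check accepts it
print(stricter_is_relative(payload))    # the origin comparison rejects it
print(stricter_is_relative("/d/abc/dashboard"))  # a genuine same-origin path
```

Comparing the resolved origin rather than the string prefix is one possible hardening; an allow-list of known paths would be another.
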
INDIRECT INJECTION MEDIUM

Discourse AI XSS (CVE-2026-27740): when LLM output is trusted as HTML

A flagged post, an AI moderator, an htmlSafe call. The Discourse AI plugin treated LLM output as trusted markup, turning indirect prompt injection into Staff-side XSS. Published March 19, 2026.

2026-05-26//6 min
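
The failure pattern in the Discourse AI entry above, LLM output trusted as HTML, can be sketched in a few lines of Python. The plugin itself is not written in Python, so this is only an analogue of the principle (escape model output before it reaches markup), not the actual patch; `render_untrusted` and the sample string are hypothetical.

```python
import html


def render_untrusted(llm_output: str) -> str:
    # Treat model output as plain text: escape it before it is placed in HTML.
    return html.escape(llm_output, quote=True)


# A model summary that echoes attacker-controlled markup from a flagged post.
summary = 'Looks fine <img src=x onerror="alert(1)">'
print(render_untrusted(summary))
# Looks fine &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

The design point is that escaping happens at the rendering boundary regardless of what the prompt told the model to produce, since an injected post can override those instructions.
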
INDIRECT INJECTION MEDIUM

Indirect prompt injection in the wild: three April 2026 studies converge

Google, Forcepoint and CISPA independently measured indirect prompt injection across the open web in April 2026. The picture: 15K+ validated payloads, 32% growth, organized templates.

2026-05-25//7 min
INDIRECT INJECTION MEDIUM

ShareLeak (CVE-2026-21520): the first CVE Microsoft assigned to a Copilot prompt injection

Published April 15, 2026, Capsule Security's ShareLeak write-up details an indirect prompt injection in Microsoft Copilot Studio. Microsoft assigned CVE-2026-21520 (CVSS 7.5), an industry first that reframes prompt injection as a tracked vulnerability class.

2026-05-25//7 min