Reprompt: one-click Copilot data exfiltration via prefilled-URL prompts
A patched Copilot Personal flaw chained a prefilled-URL prompt, a guardrail that only checked the first request, and server-driven follow-ups into stealthy one-click data exfiltration. The bypass lessons generalise.
What is this?
Reprompt is an attack chain against Microsoft Copilot Personal disclosed by Varonis Threat Labs, with a public technical write-up first published on 14 January 2026 and refreshed on 16 June 2026 alongside a related enterprise finding. It turns a single click on a legitimate-looking Copilot link into a silent data-exfiltration channel: the assistant reads the victim’s session and profile context and leaks it, piece by piece, to an attacker-controlled server. Microsoft confirmed the issue is patched, and the enterprise Microsoft 365 Copilot offering — protected by tenant DLP and Purview governance — was not affected in the same way.
We cover it here not as breaking news but as a clean, fully-remediated case study. Two of its three techniques are bypass patterns that recur far beyond Copilot, and understanding them is more useful than the specific product flaw. No working payload is reproduced below; the value is in the failure modes.
How it works
Reprompt chains three techniques. None of them is a flaw in the underlying language model; each is a design-and-validation gap in the surrounding application.
The first is Parameter-to-Prompt (P2P) injection. Copilot accepts a q URL parameter that prefills the prompt box, so a link like https://copilot.microsoft.com/?q=... causes the assistant to process the embedded text as if the user had typed and submitted it. An attacker puts their instructions in that parameter. The user only has to click; the active, already-authenticated session does the rest, with no re-authentication prompt. The same prefilled-prompt affordance has been documented on other assistants — Tenable described it for ChatGPT and LayerX for Perplexity’s Comet browser — which is the tell that this is a class of weakness, not a one-off.
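One way to close the P2P gap is to parse the prefilled parameter as an inert draft rather than as a submitted prompt. The sketch below is a minimal illustration, not Copilot's actual handling: `PrefilledPrompt`, `prefill_from_url`, and the 500-character cap are hypothetical names and values, and only the `q` parameter itself comes from the write-up.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse

MAX_PREFILL_CHARS = 500  # assumed cap, for illustration only


@dataclass(frozen=True)
class PrefilledPrompt:
    """Text taken from a deep link: a draft for the user to review, never a command."""
    text: str
    source: str                  # provenance label, kept for logging
    requires_confirmation: bool  # True means the UI must not auto-submit


def prefill_from_url(url: str) -> Optional[PrefilledPrompt]:
    """Return the q parameter as an untrusted draft, or None if it is absent or empty."""
    values = parse_qs(urlparse(url).query).get("q")
    if not values:
        return None
    return PrefilledPrompt(
        text=values[0][:MAX_PREFILL_CHARS],  # truncate oversized input
        source="url:q",
        requires_confirmation=True,
    )
```

The design choice is that the link can fill the box but cannot press Enter: the draft carries its provenance with it, so downstream code can refuse to run anything labelled `url:q` until the user has explicitly submitted it.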
The second is the double-request bypass, and it is the most transferable lesson. Copilot does enforce a guardrail that strips sensitive data out of outbound URL fetches — but, in the vulnerable build, that check ran only on the first execution of an instruction. By framing the request so the assistant performs each action twice and “compares” the results, the attacker moved the data-returning step into a second execution that the filter never inspected. The exact instruction is [REDACTED]; the structural point is that the safety check and the action were not enforced symmetrically across iterations.
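The asymmetry can be shown with a toy model. The sketch below is a deliberately simplified illustration, not Copilot's code: `leaks_sensitive_data`, its regular expression, and both fetcher classes are hypothetical. `FirstCallOnlyFetcher` models a check that runs once; `SymmetricFetcher` places the same check inside the only path to the network, so a repeated call cannot skip it.

```python
import re
from typing import Callable

# Toy stand-in for a real data-loss check; the pattern is an assumption.
SENSITIVE_PARAM = re.compile(r"[?&](user|location|memory)=", re.IGNORECASE)


def leaks_sensitive_data(url: str) -> bool:
    return SENSITIVE_PARAM.search(url) is not None


class FirstCallOnlyFetcher:
    """Models the flaw: the check runs on the first fetch and never again."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._checked_once = False

    def fetch(self, url: str) -> str:
        if not self._checked_once:
            self._checked_once = True
            if leaks_sensitive_data(url):
                raise PermissionError("blocked by egress filter")
        return self._fetch(url)


class SymmetricFetcher:
    """The check guards every call, including retries and repeats."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch

    def fetch(self, url: str) -> str:
        if leaks_sensitive_data(url):
            raise PermissionError("blocked by egress filter")
        return self._fetch(url)
```

With the first class, a blocked request simply succeeds when it is issued a second time; with the second, the repeat is rejected exactly like the original. Keeping the check and the network call in one function, with no flag that can switch the check off, is what makes the enforcement symmetric.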
The third is chain-request orchestration. Once the first prompt runs, the attacker’s server replies with follow-up instructions that depend on what Copilot returned — username, then location, then a list of recently accessed files, then conversation memory, fragment by fragment. Because every subsequent instruction is delivered by the server after the initial click, inspecting the starting prompt reveals nothing about what is actually being exfiltrated, and client-side monitoring sees only a stream of small, ordinary-looking outbound requests.
Why it matters
The combination is exactly the “lethal trifecta” that the OWASP State of Agentic AI Security and Governance (v2.01, 11 June 2026) puts at the centre of agentic risk: an assistant with access to private data, exposure to attacker-controlled content, and the ability to reach the network can be turned into an exfiltration tool by one injected instruction. Copilot has all three by design, because being helpful means holding the user’s context and being able to fetch things.
The reason the bypasses generalise is architectural. As OWASP and Simon Willison both stress, a language model receives the system prompt, the user request, and any external text as one undifferentiated token stream — there is no reliable boundary between “command” and “data.” A prefilled URL is just another way to slip attacker text into that stream while wearing the user’s authority. And the double-request trick is a reminder that a guardrail applied at one point in the flow is not a guardrail applied to the flow: anywhere enforcement is uneven across retries, regenerations, or follow-ups, there is a bypass.
The chain-request pattern is the part defenders should sit with longest. It defeats the intuitive control — “inspect the prompt before it runs” — because the dangerous instructions do not exist at click time. They are authored dynamically by a remote server in response to the model’s own output. Detection has to move from the prompt to the behaviour.
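Behaviour-based detection can be as simple as counting assistant-initiated requests per session and destination. The sketch below is one possible shape for that, under stated assumptions: `EgressBurstDetector`, its thresholds, and the session and host identifiers are all hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class EgressBurstDetector:
    """Flags many outbound requests to one host from one session in a short window."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests      # assumed threshold
        self.window_seconds = window_seconds  # assumed window
        self._seen: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def record(self, session_id: str, host: str, timestamp: float) -> bool:
        """Record one request; return True once the count in the window exceeds the limit."""
        times = self._seen[(session_id, host)]
        times.append(timestamp)
        # Drop entries that have fallen out of the sliding window.
        while times and times[0] < timestamp - self.window_seconds:
            times.popleft()
        return len(times) > self.max_requests
```

Because the counter is keyed on the session rather than on any single prompt, it still fires when each individual request looks small and ordinary, which is the situation the chain-request pattern creates.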
Defenses
Varonis’s own recommendations, generalised to any LLM assistant or agent that ingests external input:
- Treat every externally supplied string as untrusted data, end to end. Deep links, prefilled prompts, retrieved documents, and tool outputs should never be promoted to “instruction” without an explicit, logged conversion step. Validate prefilled prompt parameters rather than executing them on page load.
- Enforce guardrails symmetrically across all execution paths. Whatever data-loss check runs on the first request must run identically on the second, the retry, the regeneration, and every chained follow-up. Differential enforcement is the bug, not the prompt.
- Make policy server-side and stateful, not client-side and per-prompt. Because chain-request exfiltration is invisible at the starting prompt, controls have to persist across the whole interaction and live where the model acts, not in the browser tab.
- Apply the lethal-trifecta / Agents Rule of Two budget. If an assistant can read private context, is exposed to untrusted content, and can send data out to the network without approval, remove one leg: gate outbound fetches to allow-listed destinations, or require human confirmation before the model retrieves attacker-controllable URLs.
- Hunt on behaviour, not payloads. Alert on assistant-initiated egress to atypical endpoints and on many small, sequential outbound requests from an AI client — the signature of incremental exfiltration — rather than relying on inspecting the prompt the user submitted.
- For users: treat a link that opens an AI tool with a prompt already filled in as you would any unsolicited link, and read the pre-populated prompt before letting it run.
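The "remove one leg" recommendation above can be sketched as a small decision function in front of every assistant-initiated fetch. This is an illustration under assumptions, not a vendor implementation: `Capabilities`, `fetch_decision`, and the placeholder allow-list entry `docs.example.com` are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

ALLOWED_HOSTS = frozenset({"docs.example.com"})  # placeholder allow-list


@dataclass(frozen=True)
class Capabilities:
    reads_private_context: bool
    sees_untrusted_content: bool
    can_egress: bool


def fetch_decision(url: str, caps: Capabilities, user_confirmed: bool = False) -> str:
    """Return "allow", "confirm", or "deny" for one assistant-initiated fetch."""
    if not caps.can_egress:
        return "deny"  # the network leg has been removed entirely
    host = urlparse(url).hostname or ""
    if host in ALLOWED_HOSTS:
        return "allow"
    # All three legs are present and the destination is unknown:
    # a human has to approve before anything leaves.
    if caps.reads_private_context and caps.sees_untrusted_content:
        return "allow" if user_confirmed else "confirm"
    return "allow"
```

For example, `fetch_decision("https://attacker.example/x", Capabilities(True, True, True))` returns `"confirm"`, while the same call against `https://docs.example.com/x` returns `"allow"`. Running this server-side, on every fetch, also covers the symmetric-enforcement and stateful-policy bullets, since no chained follow-up can reach the network by another route.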
Status
| Item | Detail |
|---|---|
| Attack | Reprompt (Varonis Threat Labs, researcher Dolev Taler) |
| Target | Microsoft Copilot Personal (consumer) |
| Not affected | Microsoft 365 Copilot (enterprise, tenant DLP/Purview) |
| Techniques | P2P (q-parameter) injection · double-request bypass · chain-request exfiltration |
| Public write-up | 14 January 2026; updated 16 June 2026 |
| Patch status | Patched by Microsoft (January 2026 update cycle); vendor-confirmed remediated |
| In-the-wild | No confirmed mass exploitation reported at disclosure; research demonstration |
| Related | SearchLeak — analogous chain in M365 Copilot Enterprise (Varonis, June 2026) |
| Nature | Defensive case study — no exploit payload reproduced |