PRAC: hijacking a computer-use agent's choice through its attention
An April 2026 Tübingen paper shows one imperceptibly perturbed product image can concentrate a computer-use agent's visual attention and steer 82% of its selections — without ever touching the output.
What is this?
Computer-use agents (CUAs) increasingly act on a graphical interface on a user’s behalf — browsing the web, filling forms, and making purchasing decisions. They are built on vision-language models (VLMs) that “look” at the screen and decide what to click. A paper by Dominik Seip and Matthias Hein of the Tübingen AI Center (University of Tübingen), published on arXiv as arXiv:2604.08005, introduces PRAC — Preference Redirection via Attention Concentration — an attack that quietly steers which option such an agent picks.
The distinctive idea is that PRAC does not try to corrupt the model’s output, the way a prompt injection or a malicious pop-up would. It manipulates the model’s internal preferences by “redirecting its attention toward a stealthy adversarial patch.” In a shopping case study, a single perturbed product image makes the agent disproportionately “see” — and therefore select — the attacker’s product, even though the image still shows the real product and the perturbation is barely perceptible to a human.
How it works
PRAC targets the attention scores inside the language-model decoder rather than the grounding coordinates or the selection string the agent emits. Conceptually, the adversarial product image is optimised so that, across the model’s layers, it “attract[s] disproportionally high attention scores” relative to the other images in context — the paper frames the objective as maximising the share of visual attention landing on the target image. When the agent reaches the moment of choice, that image dominates what it attends to, and it is selected.
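To make the "share of visual attention" idea concrete, here is a minimal NumPy sketch of how such a quantity could be measured. It is an illustration only, not the authors' objective: the tensor layout `(layers, heads, num_tokens)`, the function name `attention_share`, and the plain averaging over layers and heads are all assumptions.

```python
import numpy as np

def attention_share(attn, image_spans, target):
    """Fraction of visual attention mass that lands on the target image.

    attn: array of shape (layers, heads, num_tokens) holding attention from
          the decision-time query token to every context token (assumed layout).
    image_spans: list of (start, end) token ranges, one per product image.
    target: index into image_spans of the image of interest.
    """
    attn = np.asarray(attn, dtype=float)
    # Attention mass per image, summed over that image's tokens.
    per_image = np.stack(
        [attn[..., start:end].sum(axis=-1) for start, end in image_spans],
        axis=-1,
    )  # shape: (layers, heads, num_images)
    total_visual = per_image.sum(axis=-1)
    share = per_image[..., target] / np.maximum(total_visual, 1e-12)
    # Simple mean over layers and heads (an assumption, not the paper's weighting).
    return float(share.mean())
```

With five images of equal size and uniform attention, this returns roughly 0.2 for any target. An attack in the spirit of PRAC would push that number toward 1 for the attacker's image.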
Because the manipulation lives in internal attention and not in the produced text or action, it is highly transferable: it does not need to optimise for a fixed target output or a known grid position. The perturbation is constrained to a small ‖δ‖∞ ≤ 8/255 budget, “small enough that humans either do not notice it at all or perceive it as a low-quality image at most.” No instruction text is injected, so nothing in the page's text looks suspicious.
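The ‖δ‖∞ ≤ 8/255 budget means no pixel channel may move more than 8/255 from its original value when pixels are scaled to [0, 1]. The sketch below shows only that constraint, as a projection step. The helper name `project_linf` is hypothetical, and nothing here reflects how the authors optimise the perturbation.

```python
import numpy as np

EPS = 8 / 255  # budget quoted above, for pixel values scaled to [0, 1]

def project_linf(clean, perturbed, eps=EPS):
    """Clip `perturbed` so it stays within eps of `clean` and inside [0, 1]."""
    clean = np.asarray(clean, dtype=float)
    # Limit the per-pixel change to the L-infinity budget.
    delta = np.clip(np.asarray(perturbed, dtype=float) - clean, -eps, eps)
    # Keep the result a valid image.
    return np.clip(clean + delta, 0.0, 1.0)
```

On an 8-bit image this caps the change at 8 intensity levels per channel, which is why the result can pass as the original photo or as a slightly degraded copy.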
The realistic threat model is what makes this notable. The attacker is modelled as a malicious third-party seller who “can manipulate the product image on the website but has no control over the website itself,” cannot choose where their product lands in the grid, and cannot fix the agent’s output. One important constraint cuts the other way: the authors “assume to have white-box access to the CUA” (black-box only for fine-tuned variants), and they note this access requirement as a current limitation.
Why it matters
Tested against four open-weight VLM agents (Qwen3-VL-8B, GLM-4.6V-Flash, Kimi-VL-A3B, and EvoCUA-8B), PRAC reaches a mean selection-success rate of 82.3%, versus a 20.8% clean baseline (with five products, random choice is ~20%), and “≥ 15% higher selection than the next-best baseline.” It transfers to fine-tuned descendants of those models with only a 0–40% drop, because “susceptibility to our attack is inherited from the base architecture.”
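The headline numbers can be put side by side with a few lines of arithmetic. The helper name `selection_stats` is hypothetical; the inputs are the figures quoted above.

```python
def selection_stats(attacked_ssr, clean_ssr, num_products):
    """Compare an attacked selection-success rate (in %) with its baselines."""
    return {
        # Rate expected if the agent picked uniformly at random.
        "chance_pct": 100.0 / num_products,
        # Absolute gain over the clean baseline, in percentage points.
        "gain_points": attacked_ssr - clean_ssr,
        # How many times more often the target is picked than in the clean run.
        "ratio_to_clean": attacked_ssr / clean_ssr,
    }

stats = selection_stats(attacked_ssr=82.3, clean_ssr=20.8, num_products=5)
```

The clean baseline sits almost exactly at the 20% chance level, so the attack accounts for a gain of about 61.5 percentage points, close to four times the clean rate.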
This is an integrity attack on agent decision-making, not a data leak — but its consequences are commercial and adversarial. It quietly turns “which product did the agent buy?” into something a third party can bias, and the same selection-steering generalises to “any task requiring autonomous selection of the CUA based on visual information.” It sits alongside earlier visual threats to CUAs such as adversarial pop-ups, but evades the text-centric defenses those prompted.
Defenses
The uncomfortable finding is that the usual guardrails miss this entirely, because the agent’s behaviour stays inside “expected user interactions” and its output is benign:
- Don’t rely on output/guard filters. Models that “monitor and filter model outputs for security violations are ineffective” here — there is no malicious string to catch. Input prompt-injection filters also miss it, since nothing textual is injected.
- Prompt-level defenses are not enough. The authors tested Instruction Hierarchy and a Reflection Prompt; both were “deem[ed] ineffective in defending against our attack,” with PRAC still succeeding 58–97% of the time depending on model.
- Treat visual inputs as adversarial. The realistic path forward the paper points to is model-level robustness (“adversarial training or other techniques”), so that base VLMs used as CUAs resist attention manipulation themselves instead of depending on downstream patches.
- Add non-visual selection checks. Where a CUA makes consequential choices (purchases, approvals), gate them on structured, out-of-band data (price, seller reputation, product IDs) rather than the rendered image alone, and keep a human in the loop for high-value actions.
- Constrain the trust you place in single sellers’ assets. A perturbed image from one uncontrolled third party should not be able to dominate a selection; diversify the signals that drive the decision.
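The non-visual selection check suggested above could look something like the following sketch. It is not from the paper: the `Offer` fields, the `needs_review` helper, and both thresholds are illustrative assumptions, and a real deployment would choose its own signals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Offer:
    product_id: str
    price: float
    seller_rating: float  # hypothetical reputation score, 0.0 to 5.0

def needs_review(agent_choice, offers, max_price_ratio=1.25, min_rating=4.0):
    """Return reasons to escalate the agent's visually driven pick to a human.

    An empty list means the structured data does not contradict the choice.
    """
    by_id = {offer.product_id: offer for offer in offers}
    if agent_choice not in by_id:
        return ["chosen product is not among the listed offers"]
    chosen = by_id[agent_choice]
    cheapest = min(offer.price for offer in offers)
    reasons = []
    # Thresholds are illustrative defaults, not recommendations.
    if chosen.price > cheapest * max_price_ratio:
        reasons.append("price well above the cheapest listed offer")
    if chosen.seller_rating < min_rating:
        reasons.append("seller rating below threshold")
    return reasons
```

The point of the design is that none of these inputs come from the rendered image, so a perturbed picture alone cannot make the check pass.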
Status
| Item | Reference | Notes |
|---|---|---|
| Paper | arXiv:2604.08005 | Seip & Hein, Tübingen AI Center |
| Attack | PRAC — preference redirection via attention concentration | Targets decoder attention, not output |
| Models tested | Qwen3-VL-8B, GLM-4.6V-Flash, Kimi-VL-A3B, EvoCUA-8B | Mean SSR 82.3% vs 20.8% clean |
| Threat model | Single perturbed product image, ‖δ‖∞ ≤ 8/255, white-box | Black-box for fine-tuned variants |
| Code | “published latest when the paper gets accepted” | Not yet released at time of writing |
The takeaway: PRAC is a reminder that an agent’s attention is an attack surface, not just its prompt or its output. As long as a perturbed image can dominate what a VLM agent attends to, defenses that only inspect text or outputs will not see the manipulation coming — and the durable fix lives in the model’s robustness, not in a downstream filter.