PROMPT INJECTION MEDIUM NEW

Prompt injection in the wild: hidden attacks in LLM resume screening

A USENIX Security 2026 study of 196,682 real resumes found that about 1% carry hidden prompt injections, and that over 90% of those are 'data injections' rather than the explicit instructions current detectors look for.

2026-06-01 // 6 min affects: llm-resume-screening, applicant-tracking-systems, pdf-text-extraction

What is this?

On May 27, 2026, researchers from Duke, UNC, UC Berkeley and the recruiting platform hireEZ posted Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening (arXiv:2605.28999, to appear at USENIX Security 2026). It is, to the authors’ knowledge, the first large-scale measurement of prompt injection in a deployed LLM application — not a lab demonstration, but a count of how often the attack actually occurs in production.

Prompt injection has been the top item on the OWASP LLM Top 10 since 2023, yet almost all evidence for it has been conceptual or anecdotal. This study fills that gap with data: roughly 1% of 196,682 real resumes contained hidden instructions or keywords aimed at manipulating automated screening. The threat model is mundane and worth stating plainly: the attacker is a job applicant trying to get their own resume ranked higher, and the payload is invisible to a human reading the PDF.

How it works

The study analysed two de-identified datasets from hireEZ: 83,277 resumes from a candidate-matching product (July 2024 – November 2025) and 113,405 from enterprise applicant-tracking systems (July 2019 – December 2025). A document-aware Hybrid Cascade Detector (rule-based font/color analysis followed by LLM verification) and a Visual Discrepancy Analyzer (a vision-language model comparing the rendered page against the machine-extracted text) flagged hidden content. Both are now running in hireEZ’s production pipeline.

The hiding techniques are old typesetting tricks, not novel exploits, so no payload is reproduced here. Applicants embed text the human eye cannot see but a PDF parser extracts: white text on a white background (color-based), font sizes around 1 pt (size-based), text placed outside the visible page (position-based), or PDF layers that parsers read but renderers do not draw.
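The four hiding techniques map onto simple per-span checks. The sketch below is illustrative only and is not the study's detector: the `Span` structure, the 4 pt size threshold, the colour tolerance and the `layer_visible` flag are all assumptions, standing in for whatever font, colour and geometry metadata a PDF parser exposes.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]

@dataclass(frozen=True)
class Span:
    """Hypothetical text span as a PDF parser might report it."""
    text: str
    font_size: float                          # in points
    color: RGB                                # glyph fill colour
    background: RGB                           # colour behind the glyphs
    bbox: Tuple[float, float, float, float]   # x0, y0, x1, y1 in page points
    layer_visible: bool = True                # False if the layer is never drawn

MIN_FONT_PT = 4.0      # assumed threshold, tune per corpus
COLOR_TOLERANCE = 16   # assumed per-channel difference treated as "same colour"

def hidden_reasons(span: Span, page_width: float, page_height: float) -> List[str]:
    """Return the hiding techniques a span matches (empty list if none)."""
    reasons = []
    if span.font_size < MIN_FONT_PT:
        reasons.append("size")
    if all(abs(a - b) <= COLOR_TOLERANCE
           for a, b in zip(span.color, span.background)):
        reasons.append("color")
    x0, y0, x1, y1 = span.bbox
    # Entirely outside the page rectangle on any side.
    if x1 <= 0 or y1 <= 0 or x0 >= page_width or y0 >= page_height:
        reasons.append("position")
    if not span.layer_visible:
        reasons.append("layer")
    return reasons

def split_spans(spans, page_width, page_height):
    """Separate spans a human would see from spans flagged as hidden."""
    visible, hidden = [], []
    for span in spans:
        reasons = hidden_reasons(span, page_width, page_height)
        if reasons:
            hidden.append((span, reasons))
        else:
            visible.append(span)
    return visible, hidden
```

Rules like these are cheap but brittle (near-white text on a cream background, or text hidden behind an image, would slip past them), which is presumably why the study pairs rule-based analysis with a second verification stage.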

The headline finding overturns the research community’s assumptions. Over 90% of detected injections — 90.5% in the recent dataset, 95.7% in the historical one — are not instructions at all. They are data injections: hidden blocks of fabricated skills, keywords and experience meant to game keyword matching and embedding similarity. Explicit “ignore previous instructions” payloads, the kind benchmarks obsess over, are the rare minority.

That distribution explains why existing text-based detectors fail on this surface. The study reports DataSentinel at 87.0% recall but 0.9% precision (it flags nearly everything), while PromptArmor and PromptGuard reach 58.3% and 45.5% precision respectively but collapse to 7.0% and 5.0% recall, because they hunt for instruction patterns that over 90% of real attacks simply don’t use. A hidden list of keywords is semantically indistinguishable from legitimate resume text; the only reliable signal is the visual discrepancy between what a human sees and what the machine extracts.
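To see what those precision and recall figures imply in practice, the share of all documents a detector flags can be derived from the two numbers plus the base rate. The calculation below assumes a 1% prevalence in the evaluated set, borrowed from the study's headline figure; the actual evaluation set may have been balanced differently, so treat the outputs as illustrative.

```python
def flagged_fraction(prevalence: float, recall: float, precision: float) -> float:
    """Fraction of all documents flagged, from prevalence, recall and precision.

    true positives per document = prevalence * recall
    precision = TP / flagged, so flagged = TP / precision
    """
    return prevalence * recall / precision

PREVALENCE = 0.01  # assumption: about 1 in 100 resumes is injected

# (name, recall, precision) as quoted in the text above
detectors = [
    ("DataSentinel", 0.870, 0.009),
    ("PromptArmor", 0.070, 0.583),
    ("PromptGuard", 0.050, 0.455),
]

for name, recall, precision in detectors:
    share = flagged_fraction(PREVALENCE, recall, precision)
    print(f"{name}: flags {share:.2%} of all resumes")
```

Under that assumption DataSentinel would flag roughly 97% of all resumes, which makes it useless as a filter, while the other two flag about 0.1% each and miss the large majority of real injections.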

Why it matters

This is the first hard prevalence number for prompt injection in the wild, and it is not negligible: about 1 in 100 resumes, which the authors call a conservative lower bound. The temporal trend is the other tell. The 6.5-year dataset is flat at 0.6–0.8% from 2019 through 2023, then spikes to roughly 1.2% in 2024 — the moment LLM screening became widely known to applicants. Prompt injection here behaves like an emerging social behaviour, not a fixed background rate.

The broader lesson generalises past hiring. A companion benchmark study, AI Security Beyond Core Domains (arXiv:2512.20164, updated April 26, 2026), measured attack success rates above 80% for some injection types against resume-screening prompts and noted that defenses common in mature domains like code review are simply absent in screening, peer review and similar specialised pipelines. Any workflow that feeds an LLM untrusted documents and acts on the output (resumes, invoices, support tickets, scientific submissions) inherits the same exposure.

Defenses

Validate across modalities, not just text. The dominant attack is invisible to text-only filters. Render the document to an image, extract the machine-readable text separately, and flag content that appears in the extraction but not in the human-visible render. This visual-discrepancy check is the single most effective signal the study identifies.
Strip or normalise hidden content before the LLM sees it. Drop sub-threshold font sizes (e.g. below 4 pt), text whose color matches its background, off-page elements and non-rendered PDF layers during ingestion.
Don’t rely on instruction-pattern detectors alone. Tools tuned for “ignore previous instructions” miss the 90%+ of attacks that carry no instruction. Treat them as one layer, not the control.
Prefer training-time defenses where stakes are high. The benchmark study found that prompt-based mitigation reduced attacks by only 10.1% (at a 12.5% false-rejection cost), while a LoRA-tuned Foreign Instruction Detection through Separation (FIDS) approach reached a 15.4% reduction, and the two combined reached 26.3%; training-time methods outperformed inference-time prompts on both security and utility. Even the best combined reduction is partial, so layer defenses rather than expecting one to close the gap.
Keep the model advisory, not decisive. Where an injected resume could change a hiring outcome, the LLM should surface and rank, with a human making the call — and screening logs should record the extracted-vs-visible discrepancy for audit.
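The cross-modal check from the first defense can be approximated with a token-level comparison. This is a minimal sketch, not the study's Visual Discrepancy Analyzer (which uses a vision-language model): it assumes two strings are already available, `extracted_text` from a PDF parser and `visible_text` from OCR or a similar reading of the rendered page, and the 5% review threshold is an arbitrary placeholder.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9]+")
MAX_HIDDEN_RATIO = 0.05  # assumed threshold; must absorb ordinary OCR noise

def tokens(text: str) -> Counter:
    """Lowercased alphanumeric tokens with their counts."""
    return Counter(TOKEN_RE.findall(text.lower()))

def discrepancy(extracted_text: str, visible_text: str) -> dict:
    """Compare machine-extracted text against human-visible text.

    Returns a record suitable for an audit log: the tokens present in the
    extraction but missing from the render, and their share of the extraction.
    """
    extracted = tokens(extracted_text)
    visible = tokens(visible_text)
    hidden = extracted - visible  # Counter subtraction keeps positive counts only
    total = sum(extracted.values())
    ratio = sum(hidden.values()) / total if total else 0.0
    return {
        "hidden_tokens": sorted(hidden),
        "hidden_ratio": ratio,
        "needs_review": ratio > MAX_HIDDEN_RATIO,
    }

if __name__ == "__main__":
    record = discrepancy(
        extracted_text="Python developer kubernetes terraform rust",
        visible_text="Python developer",
    )
    print(record)
```

Returning a record rather than a bare boolean keeps the model advisory: the discrepancy can be logged and shown to the human reviewer instead of silently rejecting the applicant.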

Status

Item	Reference	Date	Notes
Measurement study published	Zhang et al., arXiv:2605.28999	2026-05-27	USENIX Security 2026; ~196,682 resumes, ~1% injected
Data injection share	same	2026-05-27	90.5% (recent) / 95.7% (historical) carry no explicit instruction
In-the-wild trend	same	2019–2025	Stable ~0.6–0.8% through 2023, spikes to ~1.2% in 2024
Detection comparison	same	2026-05-27	General-purpose detectors fail on data injection
Benchmark + FIDS defense	Mu et al., arXiv:2512.20164	2026-04-26	>80% ASR for some types; combined defense ~26.3% reduction

The takeaway is not that resume screening is uniquely broken — it is that prompt injection has quietly moved from proof-of-concept to a measurable, rising real-world behaviour, and that the detectors built for the textbook version of the attack miss the version people actually use.