system: OPERATIONAL
← back to all hacks
DATA LEAK MEDIUM NEW

Credential leakage in LLM agent skills: a 17,000-skill empirical study

An April 3, 2026 arXiv study analyzed 17,022 agent skills and found 520 leaking credentials — 73.5% of the leaks flow through debug logging that pipes secrets straight into the model's context.

2026-06-12 // 6 min affects: llm-agents, agent-skills, coding-agents, skill-marketplaces

What is this?

On April 3, 2026, a team published Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study on arXiv — the first large-scale measurement of how third-party “skills” leak the secrets they are trusted with. Skills are packaged bundles of natural-language instructions plus helper scripts that extend an agent’s capabilities, and they typically run in a privileged environment with access to API keys, tokens and database credentials.

The study sampled 17,022 skills (drawn from 170,226 catalogued on the marketplace the authors call SkillsMP, snapshotted in early 2026) and combined static analysis, sandbox execution and manual inspection. It found 520 vulnerable skills carrying 1,708 distinct issues, and derived a taxonomy of 10 leakage patterns — 4 accidental and 6 adversarial. It is a defensive, disclosure-driven piece of research, not an attack toolkit: no working payloads are reproduced here.

How it works

The central finding is that credential leakage in skills is cross-modal. A skill is code and prose, and 76.3% of the leaks the authors found could only be caught by analyzing the script and its natural-language instructions together; a smaller 3.1% arose purely from prompt injection in the instruction text. Tools that scan only source code, or only the prompt, miss most of the surface.

The single largest vector is mundane: debug logging. Plain print and console.log statements accounted for 73.5% of leaks, because a skill’s standard output is piped directly into the agent’s conversational context. A secret written to stdout for debugging is effectively handed to the model — and to anything downstream that can read that context.

Leak pathway (conceptual)

  skill script  --prints secret to stdout-->  LLM context window
                                                     |
                                          logs / memory / transcript
                                                     |
                                       readable by downstream tools

The credentials at stake span the full range: cloud and API keys (AWS, GCP service-account JSON, third-party gsk-/AKIA-style keys), OAuth and platform tokens, database connection strings, SMTP and admin passwords, SSH/TLS private-key material, webhook signing secrets, and even crypto-wallet keys. Crucially, 89.6% of the leaked credentials were exploitable without any special privileges, and the exposure is persistent: secrets removed from an upstream repository survived in independent forks that retained the old material even after the original was fixed.

Why it matters

Skills are becoming the default way to extend agents, and a marketplace model means a developer installs code written by a stranger that then runs with the developer’s own credentials. This study turns a long-suspected risk into measured ground truth, alongside parallel April 2026 work on malicious skills in the wild (last revised June 10, 2026) and on skill-ecosystem supply-chain poisoning.

Two numbers should reshape threat models. First, the stdout-to-context pipeline is a leak channel that traditional secret-scanning never had to worry about — your CI secret scanner does not watch what a tool prints at runtime. Second, forks retain secrets after upstream fixes, so “we rotated and patched the original” is not containment. The same architecture that makes the safety failure — a skill that prints too much — is the one an attacker abuses for a security failure.

Defenses

The authors propose concrete, layered mitigations. For the people who run agents and write skills:

  1. Treat stdout as untrusted output, not a private log. Anything a skill prints can reach the model’s context. Strip recognized credential patterns from the stdout stream before it enters conversational memory, and never debug with raw secrets.
  2. Apply least privilege at the architecture level. Scope and minimize each skill’s credential exposure up front rather than sanitizing after the fact. A skill should hold only the narrowest token it needs, for the shortest time.
  3. Isolate the reasoning engine from the execution engine. The paper argues for capability-based isolation — the LLM and the skill running with separate memory and network access — so a leak in execution does not become a leak in context.
  4. Make pre-publication secret scanning a mandatory gate. Integrate secret scanning into the skill development lifecycle as a required step before publishing, not an optional hardening pass. Note that joint code-plus-language scanning is needed to catch the cross-modal majority.
  5. Rotate, and treat forks as compromised. Because secrets persist across forks, rotate any credential that ever touched a published skill and assume forked copies still carry the old value. Removing it from main is not enough.

Status

ItemReferenceDateNotes
Study publishedarXiv 2604.030702026-04-0317,022 skills analyzed; 520 vulnerable; 1,708 issues
Leakage taxonomyarXiv 2604.030702026-04-0310 patterns (4 accidental, 6 adversarial)
Dominant vectorarXiv 2604.030702026-04-03Debug logging (print/console.log) = 73.5% of leaks
ExploitabilityarXiv 2604.030702026-04-0389.6% exploitable without privileges; persists across forks
Disclosure outcomearXiv 2604.030702026-04Malicious skills removed; 91.6% of hardcoded credentials fixed
Related workarXiv 2602.06547rev. 2026-06-10Malicious agent skills measured across two registries

The takeaway for builders is not “skills are unsafe” but “skills run with your credentials, and your existing secret-management assumptions do not cover them.” The leak channel is the model’s own context window, the fix has to live in skill architecture and the publishing pipeline, and rotation — not deletion — is what actually contains an exposed secret.

Sources