SUPPLY CHAIN MEDIUM NEW

When #1 trending is malware: the Open-OSS/privacy-filter Hugging Face typosquat

On May 7, 2026, HiddenLayer disclosed Open-OSS/privacy-filter, a typosquat of OpenAI's privacy-filter model that reached #1 trending on Hugging Face with ~244K downloads in under 18 hours while delivering a Rust infostealer.

2026-06-15 // 6 min affects: huggingface-hub, transformers, gguf

What is this?

On May 7, 2026, the HiddenLayer research team disclosed a malicious Hugging Face repository named Open-OSS/privacy-filter. It typosquatted OpenAI’s legitimate openai/privacy-filter release (a PII-redaction model OpenAI unveiled in April 2026), copied the model card nearly verbatim, and shipped a loader.py script that fetched and ran an infostealer on the victim’s machine.

The detail that makes this incident worth studying is not the malware but the distribution. Before Hugging Face removed it, the repo reached the #1 trending position with roughly 244,000 downloads and 667 likes in under 18 hours, numbers that HiddenLayer assessed as almost certainly inflated by inauthentic accounts to manufacture trust. The Hacker News and CSO Online reported the same finding the following week. This is the social-engineering counterpart to the silent code path we covered in Transformers config-injection RCE: there, loading a model was enough; here, the attacker needed you to run a script, and the repo still recorded roughly a quarter-million downloads, inflated or not.

How it works

Unlike a deserialization bug, this attack required user action: the README told users to clone the repo and run start.bat (Windows) or python loader.py. The kill chain HiddenLayer documented unfolds in stages, and we describe it at the level of defensive indicators, not reproduction:

Decoy loader. loader.py first ran benign-looking model-loading code (a dummy class, fake training output) to appear legitimate.
Dead-drop resolver. It then disabled SSL verification, decoded a Base64 URL pointing at a public JSON paste service (jsonkeeper[.]com), and pulled a command from it. Using a paste service as the command channel let the operator swap payloads without touching the repo.
Hidden PowerShell. The retrieved command ran via PowerShell with -WindowStyle Hidden and CREATE_NO_WINDOW, so nothing was visible. This stage was Windows-only; on Linux/macOS the call simply failed silently.
Second-stage downloader. A batch file (update.bat) self-elevated through a UAC prompt, added Microsoft Defender exclusions, and fetched the final binary from api.eth-fastscan[.]org.
One-shot SYSTEM launcher. A scheduled task impersonating the Edge updater ran the payload as SYSTEM, then deleted itself after two seconds — no persistence, just privileged execution.
Rust infostealer. The ~1 MB final payload ran anti-VM/anti-debug checks, attempted to disable AMSI and ETW, and harvested browser cookies and credentials, Discord tokens, crypto wallets, SSH/FTP keys and screenshots, exfiltrating them as JSON over HTTPS.
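
The first three stages leave traces that a reviewer can look for in a script before running it. The following is a minimal sketch of such a static check, not HiddenLayer's tooling: the pattern names, the regular expressions, and the scan_script helper are all hypothetical, chosen only to mirror the indicators listed above (disabled SSL verification, a Base64-encoded URL, hidden PowerShell, CREATE_NO_WINDOW).

```python
import base64
import re

# Hypothetical indicator patterns mirroring the stages described above.
# They are illustrative heuristics, not a complete or authoritative rule set.
INDICATORS = {
    "ssl_verification_disabled": re.compile(
        r"verify\s*=\s*False|_create_unverified_context|CERT_NONE"
    ),
    "hidden_powershell": re.compile(r"-WindowStyle\W+Hidden", re.IGNORECASE),
    "no_window_flag": re.compile(r"CREATE_NO_WINDOW"),
    "base64_decode_call": re.compile(r"b64decode\s*\("),
}

# Quoted string literals that look like Base64 (16+ characters, optional padding).
B64_LITERAL = re.compile(r"[\"']([A-Za-z0-9+/]{16,}={0,2})[\"']")


def decoded_urls(source: str) -> list[str]:
    """Return URLs hidden inside Base64 string literals in the source text."""
    urls = []
    for match in B64_LITERAL.finditer(source):
        try:
            text = base64.b64decode(match.group(1), validate=True).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue  # not valid Base64, or not text
        if text.startswith(("http://", "https://")):
            urls.append(text)
    return urls


def scan_script(source: str) -> dict:
    """Report which indicator patterns appear and which encoded URLs decode."""
    hits = sorted(name for name, pat in INDICATORS.items() if pat.search(source))
    return {"indicators": hits, "decoded_urls": decoded_urls(source)}
```

A hit is a reason to stop and read the script, not a verdict. Plain string matching like this is easy to evade with further obfuscation, so an empty result says little about whether a script is safe.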

HiddenLayer linked six further repositories under the account anthfu reusing the same loader and command URL, and tied the infrastructure to a campaign distributing the Winos 4.0 / ValleyRAT implant — evidence of a broader operation against open-source ecosystems, not a one-off.

Why it matters

The lesson is about trust signals, not a single bug. Download counts, likes, and trending placement are the heuristics practitioners use to judge whether a model is safe — and they are exactly what the attacker forged. The platform’s own discovery surface (the trending list and its recommendation logic) did the targeting work, pushing the malicious repo to the top of everyone’s feed.

This also retires a comfortable assumption: that “a malicious model won’t reach mass distribution.” It reached #1 in 18 hours. Combined with the silent config-injection RCE in Transformers and the recurring GGUF parser RCEs, the picture is clear — a model repository is untrusted code and untrusted data at the same time, whether you from_pretrained() it or follow its README. The same dynamic drives skill-registry abuse we covered in hidden triggers in SKILL.md.

Defenses

Never run setup scripts from a model repo. A legitimate model is weights plus a config; it should not ask you to execute start.bat or python loader.py. Treat any “clone and run” instruction in a model card as a red flag.
Verify the namespace, not the name. Pin to the canonical publisher’s org (openai/..., not Open-OSS/...); a model name that looks right proves nothing. The lookalike namespace was the entire lure.
Do not treat trending placement or download counts as trust signals. They are gameable and were gamed here. Provenance and signed releases matter; popularity does not.
Load and evaluate models in isolation. Run any first-touch model in a sandboxed container with no host credentials, no standing cloud tokens, and egress controls — so a loader or a malicious config cannot reach ~/.aws, ~/.ssh, or the internet.
Scan before you load. Use model and repository scanning that checks for executable scripts, suspicious config fields, and known-bad indicators before deployment.
Hunt on the published IOCs. HiddenLayer published domains (api.eth-fastscan[.]org, recargapopular[.]com, jsonkeeper[.]com/b/AVNNE), file hashes, and host artifacts (a scheduled task matching MicrosoftEdgeUpdateTaskCore[a-z0-9]{8}). Block the domains at egress and sweep historically. If a host ran the loader, treat it as fully compromised and reimage rather than clean.
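
Several of the defenses above reduce to simple checks that can run before a download or during a log sweep. The sketch below is one hedged way to express three of them: pinning the publisher namespace, flagging executable files in a repo listing, and matching the indicators quoted above. The allowlist contents, the extension set, and every function name are assumptions made for illustration; only the domains, the paste path, and the scheduled-task pattern come from the article. Indicators are stored defanged and refanged at match time.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: your organisation maintains its own allowlist of publisher orgs.
ALLOWED_ORGS = {"openai"}

# Assumption: file types that warrant manual review before anything is run.
REVIEW_EXTENSIONS = {".bat", ".cmd", ".ps1", ".exe", ".dll", ".sh", ".py"}

# Indicators quoted in the article, kept defanged in source.
DEFANGED_HOSTS = ["api.eth-fastscan[.]org", "recargapopular[.]com"]
DEFANGED_URL_PREFIXES = ["jsonkeeper[.]com/b/AVNNE"]
TASK_IOC = re.compile(r"MicrosoftEdgeUpdateTaskCore[a-z0-9]{8}")


def refang(value: str) -> str:
    """Turn a defanged indicator such as example[.]com back into example.com."""
    return value.replace("[.]", ".")


def publisher_is_allowed(repo_id: str, allowed=ALLOWED_ORGS) -> bool:
    """Exact, case-sensitive match on the namespace part of 'org/name'."""
    namespace, sep, name = repo_id.partition("/")
    return bool(sep and name) and namespace in allowed


def files_needing_review(filenames):
    """Return repo files whose extension suggests executable content."""
    return [
        f for f in filenames
        if PurePosixPath(f).suffix.lower() in REVIEW_EXTENSIONS
    ]


def url_matches_ioc(url: str) -> bool:
    """True if the URL's host or host+path matches a listed indicator."""
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = (parts.hostname or "").lower()
    for blocked in map(refang, DEFANGED_HOSTS):
        if host == blocked or host.endswith("." + blocked):
            return True
    return any(
        (host + parts.path).startswith(refang(prefix))
        for prefix in DEFANGED_URL_PREFIXES
    )


def task_matches_ioc(task_name: str) -> bool:
    """True if a scheduled-task name contains the published pattern."""
    return TASK_IOC.search(task_name) is not None
```

The namespace check is deliberately strict: anything not on the allowlist fails, including names that differ only in case. The file screen will also flag legitimate repos that ship custom Python code, so treat its output as a review queue rather than a block list.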

Status

Item	Detail
Disclosed	May 7, 2026 (HiddenLayer); reported widely May 11, 2026
Repository	Open-OSS/privacy-filter (+ 6 anthfu repos); removed by Hugging Face
Reach	#1 trending, ~244K downloads / 667 likes in ~18 hours (likely inflated)
Delivery	loader.py / start.bat → PowerShell → Rust infostealer (Windows)
Impact	Credential, wallet, Discord and session-cookie theft; SYSTEM execution
Linked activity	Shared infrastructure with a Winos 4.0 / ValleyRAT campaign

The takeaway is not “another bad model got pulled.” It is that the trust heuristics around model hubs — popularity, trending, copied model cards — are an attack surface, and the only durable defenses are provenance verification and isolating model loading from anything worth stealing.