trust_remote_code=False isn't a boundary: vLLM's recurring model-load RCE
CVE-2026-27893 (disclosed March 27, 2026) is vLLM's third trust_remote_code bypass. Two model files hardcode trust_remote_code=True, silently overriding an operator's opt-out and enabling RCE from a malicious model repo.
In brief
vLLM’s --trust-remote-code=False is supposed to stop a model repository from running arbitrary Python on your inference host. CVE-2026-27893, disclosed March 27, 2026 (CVSS 8.8), is the third time that boundary has been bypassed, this time because two model files hardcode trust_remote_code=True. It affects vLLM 0.10.1 through 0.17.x; the fix landed in 0.18.0. The lesson is not one bug but a pattern: per-file opt-in is not a trust boundary.
What is this?
CVE-2026-27893 is a protection-mechanism failure (CWE-693) in vLLM, the inference and serving engine behind a large share of production LLM deployments. Operators can pass --trust-remote-code=False to refuse running custom Python shipped inside a model repository. The GitHub Security Advisory, published March 27, 2026, shows that two model implementation files ignore that setting and pass a literal trust_remote_code=True to Hugging Face Transformers — so the remote code runs anyway.
What makes this worth covering is not the single CVE but its lineage. The advisory itself names two predecessors: CVE-2025-66448 (December 1, 2025, the auto_map config-loading path) and CVE-2026-22807 (the broader auto_map startup path). Each was patched; each time the same trust boundary fell again through a different code path. CVE-2026-27893 is the third in that sequence.
How it works
The operator’s choice is propagated correctly through vLLM’s config hierarchy as self.config.model_config.trust_remote_code. The two vulnerable call sites simply don’t read it. Per the advisory, the offending lines are:
# vllm/model_executor/models/nemotron_vl.py (vision encoder load)
AutoModel.from_config(config, trust_remote_code=True)
# vllm/model_executor/models/kimi_k25.py (image processor load)
cached_get_image_processor(model_name, trust_remote_code=True)
Because the literal True overrides the global setting, Hugging Face Transformers downloads and executes Python from the referenced repository at model-load time, with the privileges of the vLLM process. The trigger path: an attacker publishes a model repo targeting the Nemotron-VL or Kimi-K25 architecture; an operator loads it — believing --trust-remote-code=False protects them; vLLM dispatches into one of these two files; the hardcoded True wins; the repo’s code runs. The bypass is silent — no warning, no log entry signals that the operator’s setting was overridden. CVSS rates it 8.8 (AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H): network-reachable, no attacker privileges, but the operator must initiate the load (UI:R).
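The override can be reproduced in miniature without touching vLLM or Transformers. The sketch below is a toy model only: ModelConfig, fake_loader, load_vulnerable and load_fixed are hypothetical names invented for illustration, and fake_loader merely reports what a real loader would do rather than downloading or executing anything.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Stand-in for the operator's setting; plays the role that
    # self.config.model_config.trust_remote_code plays in the text above.
    trust_remote_code: bool = False


def fake_loader(name: str, trust_remote_code: bool = False) -> str:
    """Hypothetical stand-in for a loader call; it only reports its decision."""
    return "runs repo code" if trust_remote_code else "refuses repo code"


def load_vulnerable(cfg: ModelConfig, name: str) -> str:
    # Bug pattern: the literal True means cfg is never consulted.
    return fake_loader(name, trust_remote_code=True)


def load_fixed(cfg: ModelConfig, name: str) -> str:
    # Fix pattern: the operator's choice is passed through to the call site.
    return fake_loader(name, trust_remote_code=cfg.trust_remote_code)


cfg = ModelConfig(trust_remote_code=False)
print(load_vulnerable(cfg, "attacker/model"))  # runs repo code
print(load_fixed(cfg, "attacker/model"))       # refuses repo code
```

Note that load_vulnerable accepts the config object and then ignores it, which is why nothing in the calling code looks wrong.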
Why it matters
The recurrence is the story. vLLM delegates model-specific behaviour to individual files in model_executor/models/, and each file must independently honour the global trust_remote_code flag. There is no central chokepoint that prevents a file from hardcoding True. That means every new model implementation is a fresh opportunity to reintroduce the same class of bug, which is exactly what has now happened three times: in config.py, in the auto_map startup path, and now in two model files.
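One way to picture the missing chokepoint is a single function that every load path must call to obtain the effective trust value. This is a sketch of the general idea, not vLLM's design or its fix: resolve_trust and TrustPolicyError are hypothetical names, and the choice to raise rather than silently downgrade is an assumption made here so that an offending call site becomes visible.

```python
class TrustPolicyError(RuntimeError):
    """Raised when a call site asks for more trust than the operator granted."""


def resolve_trust(operator_allows: bool, requested: bool) -> bool:
    """Decide the effective trust_remote_code value in one place.

    A call site may ask for less trust than the operator allows,
    but never for more.
    """
    if requested and not operator_allows:
        raise TrustPolicyError(
            "call site requested trust_remote_code=True, "
            "but the operator disabled it"
        )
    return requested and operator_allows


# A model file that hardcodes True now fails loudly instead of silently winning.
try:
    resolve_trust(operator_allows=False, requested=True)
except TrustPolicyError as exc:
    print(f"blocked: {exc}")
```

Failing loudly trades availability for visibility: a model that genuinely needs remote code will not load until the operator opts in, but the override can no longer happen unnoticed.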
For defenders, the practical danger is a false sense of safety. Teams that adopted --trust-remote-code=False as a control may be running models through an affected path with no indication the control is inert. And the broader lesson generalises well beyond vLLM: any framework that scatters enforcement of a security-critical setting across many independent files, without centralised mediation, is structurally prone to this failure. As of disclosure there was no public proof-of-concept and no evidence of in-the-wild exploitation, but the affected versions span a year of releases (0.10.1–0.17.x).
Defenses
- Upgrade to vLLM 0.18.0 or later. The fix in PR #36192 replaces the hardcoded True with self.config.model_config.trust_remote_code at both call sites. Check your version with pip show vllm.
- Don’t treat trust_remote_code=False as your only boundary. Run inference in a minimal container with restricted egress, isolate the serving tier from sensitive data stores, and segment it to limit lateral movement if the process is compromised.
- Verify model provenance. Restrict loading to trusted, signed publishers; checksum or sign model artifacts before loading rather than relying on a runtime flag, particularly for Nemotron-VL and Kimi-K25 architecture models.
- Scan your own builds for the pattern. Because this class has recurred, grep model_executor/models/*.py for trust_remote_code=True literals (and similar from_pretrained/from_config calls that don’t propagate config) in any custom or forked vLLM. Wire that check into CI so a future model file can’t reintroduce it.
- Watch for the silent override. In environments where models are pre-cached, an outbound .py fetch from huggingface.co during load, or an unexpected child process under a vLLM worker, is a useful hunting signal that remote code ran when it shouldn’t have.
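The scanning step can go a little further than grep by parsing the source, which avoids matching comments or strings. The following is a minimal sketch using Python's standard ast module; find_hardcoded_trust and scan_tree are hypothetical helper names. It catches only the keyword form trust_remote_code=True, so a value passed positionally or through a **kwargs dictionary would slip past it.

```python
import ast
from pathlib import Path


def find_hardcoded_trust(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers where a call passes the literal trust_remote_code=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(kw.value.lineno)
    return sorted(hits)


def scan_tree(root: str) -> int:
    """Scan every .py file under root, print findings, and return their count."""
    findings = 0
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        for lineno in find_hardcoded_trust(source, str(path)):
            print(f"{path}:{lineno}: hardcoded trust_remote_code=True")
            findings += 1
    return findings
```

In a CI job, a small wrapper could call scan_tree on the models directory of the checked-out fork and exit non-zero when the count is above zero, so that a new literal fails the build.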
Status
| Item | Reference | Date | Notes |
|---|---|---|---|
| CVE-2026-27893 advisory | GitHub (GHSA-7972-pg2x-xr59) | 2026-03-27 | Hardcoded trust_remote_code=True, CVSS 8.8, CWE-693 |
| Affected versions | vLLM 0.10.1 → 0.17.x | n/a | Nemotron-VL and Kimi-K25 load paths |
| Patched version | vLLM 0.18.0 | n/a | Fix PR #36192 |
| CVE-2025-66448 | GitHub (GHSA-8fr4-5q9j-m8gm) | 2025-12-01 | First bypass, auto_map in config path |
| CVE-2026-22807 | GitHub advisory | 2026 | Second bypass, broader auto_map startup path |
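The version range in the table can also be checked programmatically across a fleet. The sketch below assumes the affected range is exactly 0.10.1 up to but not including 0.18.0, as stated in this article; parse_release, is_affected and installed_vllm_affected are hypothetical helper names. It compares only the leading numeric release, so pre-release and development builds are judged by their base version.

```python
import re
from importlib import metadata
from typing import Optional


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as "0.18" so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def is_affected(version: str) -> bool:
    """True when the version falls in the range 0.10.1 <= v < 0.18.0."""
    return (0, 10, 1) <= parse_release(version) < (0, 18, 0)


def installed_vllm_affected() -> Optional[bool]:
    """Check the locally installed vllm package; None if it is not installed."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

A result of False from this check says only that the installed release is outside the advisory's range; forks and custom builds still need the source scan described under Defenses.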
The right framing isn’t “patch one CVE.” It’s that trust_remote_code in vLLM has been a leaky boundary three times over, and a serving stack that relies on it as a hard control should add isolation and provenance checks that don’t depend on a single per-file flag being set correctly.