
Agent libOS: make the runtime, not the tool wrapper, the authority boundary

A June 2, 2026 arXiv paper argues most agent frameworks conflate tool visibility with resource authority — and proposes a library-OS runtime where capability checks live at primitive boundaries, not in tool wrappers.

2026-06-19 // 6 min // affects: llm-agents, tool-using-agents, coding-agents, agent-runtimes, mcp-servers

In most agent stacks today, the fact that a model can see a write_file tool is the same fact as the runtime having authority to write to disk. A June 2, 2026 paper, Agent libOS (arXiv:2606.03895, cs.OS), argues that this conflation is the structural reason wrapper-level safety keeps failing — and sketches a runtime where the two are deliberately separated.

What is this?

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents was posted to arXiv by Yingqi Zhang on June 2, 2026. It is not a new attack and not a new model. It is a runtime design: a library-OS-inspired substrate that sits above a normal host OS and treats each agent as an AgentProcess — a schedulable subject with a process identity, parent-child lineage, a lifecycle, explicit capabilities, typed object memory, human-approval queues, checkpoints, and audit records.

The paper’s central rule is a single sentence: tools are libc-like wrappers; runtime primitives are the authority boundary. In other words, the model-facing tool schema is just a convenient surface, like a C standard-library call. Whether the underlying operation may actually touch a file, an object, the clock, a human, or the tool registry is decided one layer below, at the primitive, under explicit capabilities and policy.

How it works

Agent libOS splits three decisions that a conventional tool registry silently merges. Visibility — can the process see the tool schema at all — is governed by a per-process tool table. Invocation — may the process submit this call — is checked by a broker. Authority — may the resulting operation reach a protected resource — is checked by the primitive manager. The resulting invariant is blunt: tool visibility does not imply resource authority. A process can see write_text_file and still hold write authority to no path at all.
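The three-way split can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: `AgentProcess`, `broker_call`, `write_primitive`, the `Denied` exception, and the in-memory `disk` dict are all hypothetical names chosen for the example, and authority is modeled as a naive tuple of writable path prefixes.

```python
from dataclasses import dataclass, field


class Denied(Exception):
    """Raised when one of the three checks refuses a call."""


@dataclass
class AgentProcess:
    pid: int
    tool_table: set[str] = field(default_factory=set)  # visibility: schemas the model can see
    invocable: set[str] = field(default_factory=set)   # invocation: calls the broker accepts
    write_prefixes: tuple[str, ...] = ()                # authority: writable path prefixes


def write_primitive(proc, disk, path, text):
    """Primitive manager: the only place that decides whether the write happens."""
    # Plain prefix matching is for illustration only; a real check would normalize paths.
    if not any(path.startswith(prefix) for prefix in proc.write_prefixes):
        raise Denied(f"authority: pid {proc.pid} holds no write capability for {path}")
    disk[path] = text


def write_text_file(proc, disk, path, text):
    """libc-like wrapper: passes arguments through and holds no authority of its own."""
    write_primitive(proc, disk, path, text)


TOOLS = {"write_text_file": write_text_file}


def broker_call(proc, tool, disk, **args):
    """Check visibility, then invocation; authority is left to the primitive."""
    if tool not in proc.tool_table:
        raise Denied(f"visibility: {tool} is not in the tool table of pid {proc.pid}")
    if tool not in proc.invocable:
        raise Denied(f"invocation: pid {proc.pid} may not submit {tool}")
    return TOOLS[tool](proc, disk, **args)
```

A process built with `tool_table={"write_text_file"}` and `invocable={"write_text_file"}` but an empty `write_prefixes` passes the first two checks and is still refused at the primitive, which is the invariant in miniature.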

The OS analogy is carried through to lifecycle operations. spawn creates a child with its own namespace and a goal-only memory view, not a copy of the parent transcript. fork attenuates the child’s memory view and budget and, in the prototype, does not inherit the parent’s filesystem-write authority unless it is explicitly granted. exec swaps an agent’s image and tool table but keeps the process id and does not auto-grant the new image’s capabilities, so it cannot escalate. Object memory follows the object-capability tradition: knowing the name of a stored object does not let a process read it without the right capability. None of this is differentiable model behavior — it is plumbing, enforced deterministically.

The author is explicit about scope. The prototype is a Python substrate with async scheduling, namespace-local object memory, runtime-integrated human approval, one-shot permission grants, Deno/TypeScript just-in-time tools behind a syscall broker, and a 123-test regression suite covering containment, revocation, fork/exec attenuation, and “wrapper purity.” It is not kernel-grade isolation, not formal verification, and not a planner that scores higher on task benchmarks.

Why it matters

The threat model is the part worth reading twice. Agent libOS targets exactly the failure modes the field keeps rediscovering: prompt injection that induces a high-risk tool call, tool-output injection that changes a later decision, path escape outside a workspace, capability leakage through fork, generated tools that import dangerous APIs, and “confusion between tool-table membership and external-resource authority.” That last one is the lethal trifecta and the confused-deputy problem stated as a systems bug rather than a model bug.
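Of the failure modes listed, path escape is the easiest to show concretely. The helper below is a hypothetical sketch of a workspace containment check a file primitive might run before acting; `resolve_in_workspace` is an invented name, and the paper does not specify how its prototype performs this check. It relies on `Path.resolve()` and `Path.is_relative_to()`, which requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Return the absolute target path, or refuse if it leaves the workspace."""
    root = workspace.resolve()
    # Joining an absolute `requested` replaces root entirely, so "/etc/passwd"
    # is caught by the same check as "../" traversal. resolve() also follows
    # symlinks that already exist on disk.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"path escapes workspace: {requested}")
    return target
```

A check like this narrows the target but does not close time-of-check/time-of-use races on its own, so it complements a capability check at the primitive and does not replace one.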

Crucially, the paper does not claim to solve semantic prompt injection. A malicious document — the kind first systematized by Greshake et al. in 2023 — can still persuade the model to request a dangerous action. The claim is narrower and more honest: that request still hits a capability check, a policy, a human approval when required, and an audit record at the primitive boundary. Containers and microVMs protect the host from untrusted code, but, as the paper notes, they do not by themselves decide which in-sandbox action is authorized on behalf of which user. That decision is the gap Agent libOS tries to fill, and it is the gap that benchmarks like AgentDojo keep showing wrapper-level defenses falling through.

Defenses

The design reads as a checklist you can apply without adopting the prototype.

  1. Stop treating the tool registry as an access-control list. Separate “the model can see this tool” from “this operation is authorized.” If your only boundary is a wrapper that calls the host directly, an injected instruction that reaches the planner reaches the host.
  2. Put the policy check at the primitive, not the prompt. File, network, shell, clock, object, and tool-registration operations should each pass through a manager that consults an explicit capability before acting — the same place you would audit, not a confirmation dialog wrapped around a function.
  3. Attenuate authority on fork, spawn, and tool generation. A child or a just-in-time tool should start with less authority than its parent, never the parent’s ambient rights by default. This is the agentic version of dropping privileges.
  4. Do not let names act as capabilities. Discovery and authority are distinct: knowing an object, path, or tool exists must not grant the right to use it.
  5. Make human approval and audit first-class, resumable operations. Approval should be a blocking primitive the scheduler resumes — not a callback bolted onto one demo — and every grant, denial, and side effect should leave a record of which authority allowed it. This pairs naturally with treating tool output as untrusted and with atomicity controls on state-changing calls.
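Item 5 can be sketched with `asyncio`: the agent's coroutine suspends on a future until a human decides, and every step is appended to an audit log. This is an assumption-laden illustration, not the prototype's API. `ApprovalQueue`, `delete_primitive`, the ticket numbering, and the audit record fields are all hypothetical.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ApprovalQueue:
    pending: dict = field(default_factory=dict)  # ticket -> Future awaiting a decision
    audit: list = field(default_factory=list)    # append-only record of events
    next_ticket: int = 0

    async def request(self, pid: int, action: str) -> bool:
        """Suspend the calling agent until a human decides, recording both steps."""
        self.next_ticket += 1
        ticket = self.next_ticket
        future = asyncio.get_running_loop().create_future()
        self.pending[ticket] = future
        self.audit.append({"event": "requested", "ticket": ticket, "pid": pid, "action": action})
        approved = await future  # the scheduler resumes the agent here
        outcome = "approved" if approved else "denied"
        self.audit.append({"event": outcome, "ticket": ticket, "pid": pid, "action": action})
        return approved

    def decide(self, ticket: int, approved: bool) -> None:
        # Popping the ticket makes each decision one-shot.
        self.pending.pop(ticket).set_result(approved)


async def delete_primitive(queue: ApprovalQueue, pid: int, path: str, store: set) -> bool:
    """A state-changing primitive that acts only after approval."""
    if not await queue.request(pid, f"delete {path}"):
        return False
    store.discard(path)
    queue.audit.append({"event": "side_effect", "pid": pid, "action": f"delete {path}"})
    return True


async def demo(approved: bool):
    queue, store = ApprovalQueue(), {"/work/a.txt"}
    task = asyncio.create_task(delete_primitive(queue, 7, "/work/a.txt", store))
    await asyncio.sleep(0)  # let the agent run until it blocks on approval
    queue.decide(1, approved)
    return await task, store, [entry["event"] for entry in queue.audit]
```

Running `asyncio.run(demo(False))` leaves the store untouched and the log at "requested" then "denied"; with `True`, the log gains an "approved" entry and a "side_effect" entry tied to the same action.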

Status

| Item | Reference | Date | Notes |
|---|---|---|---|
| Paper posted | arXiv:2606.03895v1 | 2026-06-02 | cs.OS, CC-BY 4.0, single author (Yingqi Zhang) |
| Artifact | Python prototype | 2026-06-02 | Async scheduler, object memory, syscall broker, JIT tools |
| Evaluation | 123 regression tests | 2026-06-02 | Containment, revocation, fork/exec attenuation, wrapper purity |
| Design rule | Tools = libc; primitives = authority boundary | 2026-06-02 | Visibility ≠ invocation ≠ authority |
| Explicit non-goal | Semantic prompt injection | 2026-06-02 | Runtime contains the effect, not the deception |
| Lineage | Indirect prompt injection (Greshake et al.) | 2023-02 | Why wrapper-level safety is insufficient |

Agent libOS will not stop a model from being fooled. What it offers is a place to stand when the model is fooled: a runtime where the dangerous request still has to clear a capability it was never granted. For teams shipping long-running, tool-wielding agents in 2026, the paper’s most useful contribution is the vocabulary — visibility, invocation, authority — for noticing that today’s frameworks usually collapse all three into one.

Sources