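Shared prefix caching makes LLM APIs faster, and it leaks prompts. A request whose prefix is already cached returns its first token sooner, so by measuring time to first token an attacker can reconstruct another tenant's prompt piece by piece. A March 2026 paper proposes a defense against this that does not kill performance.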
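
To make the timing channel concrete, here is a minimal toy simulation in Python. It is only a sketch under loud assumptions: `ToyPrefixCache`, `recover_prompt`, the per-token cache granularity, the noise-free "latency" counter, and the tiny known vocabulary are all hypothetical and invented for illustration. Real serving stacks typically cache in multi-token blocks and have noisy timings, and nothing here reflects the paper's method or any particular provider's API.

```python
class ToyPrefixCache:
    """Hypothetical shared cache: remembers every token prefix it has served."""

    def __init__(self):
        self._prefixes = set()

    def serve(self, tokens):
        """Return a simulated time to first token: one unit per uncached token."""
        tokens = tuple(tokens)
        hit = 0
        # Find the longest prefix of this request that is already cached.
        for n in range(len(tokens), 0, -1):
            if tokens[:n] in self._prefixes:
                hit = n
                break
        # Serving the request caches all of its prefixes (attacker probes too).
        for n in range(1, len(tokens) + 1):
            self._prefixes.add(tokens[:n])
        return len(tokens) - hit


def recover_prompt(cache, vocab):
    """Extend a guess one token at a time, keeping guesses that come back 'fast'."""
    recovered = []
    while True:
        for candidate in vocab:
            # Latency 0 means the whole probe was already cached by someone else.
            if cache.serve(recovered + [candidate]) == 0:
                recovered.append(candidate)
                break
        else:
            # No candidate produced a full cache hit: the cached prompt ends here.
            return recovered


cache = ToyPrefixCache()
victim = ["my", "secret", "is", "42"]
cache.serve(victim)  # another tenant's request warms the shared cache

vocab = ["42", "is", "my", "secret", "the"]  # assumed known to the attacker
recovered = recover_prompt(cache, vocab)
print(recovered)
```

In this toy model the attacker needs at most one probe per vocabulary entry per token, which is why a cache shared across tenants is the root of the problem: the same loop run against a cache that only the attacker has written to recovers nothing.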