MULTIMODAL
3 hacks.
MULTIMODAL MEDIUM NEW
Sirens' Whisper: inaudible near-ultrasonic jailbreaks of voice LLMs
A March 14, 2026 paper from Huazhong, Tsinghua and Microsoft hides jailbreak prompts in the 17–22 kHz band. The signal is inaudible to humans, but microphone nonlinearity demodulates it back into commands, yielding a non-refusal rate of up to 0.94 on commercial voice LLMs.
2026-06-18 // 7 min
MULTIMODAL MEDIUM
CrossMPI: image-only prompt injection steers what VLMs read and see
A May 15, 2026 Xidian University arXiv paper introduces CrossMPI: imperceptible image perturbations that change how large vision-language models (LVLMs) interpret both the image and the user's text prompt, with a 66% average success rate across five LVLMs.
2026-05-28 // 6 min
MULTIMODAL CRITICAL
AudioHijack: imperceptible audio hijacks voice agents (IEEE S&P 2026)
An April 16, 2026 IEEE S&P paper introduces auditory prompt injection: adversarial reverb hidden in audio drives 13 large audio-language models and commercial voice agents (Mistral AI, Microsoft Azure) into unauthorized actions, with a 79–96% success rate.
2026-05-26 // 7 min