AGENTS CRITICAL NEW

Just ask the bot: Meta's AI support assistant and the Instagram takeovers

Over the May 30–31, 2026 weekend, attackers hijacked high-profile Instagram accounts by asking Meta's AI support bot to relink an account email. No prompt injection required — only excessive agency.

2026-06-02 // affects: meta-ai-support-assistant, instagram

What is this?

Over the weekend of May 30–31, 2026, a series of high-profile Instagram accounts were taken over — including the dormant Obama-era White House handle, the account of U.S. Space Force Chief Master Sergeant John Bentivegna, and the account of security researcher Jane Manchun Wong. Several were briefly defaced with pro-Iranian images. The common thread, first reported by 404 Media on June 1, 2026 and corroborated by TechCrunch and Brian Krebs, was not a software exploit in the classical sense. The attackers simply asked Meta’s AI support assistant to do it.

As Simon Willison put it, the incident “hardly even qualifies as a prompt injection.” There was no jailbreak, no crafted payload, no hidden instruction. The bot was wired to perform a sensitive account action on request, and it did exactly that for the wrong requester. This is a textbook case of excessive agency (LLM06 in the 2025 OWASP Top 10 for LLM Applications), and it is an early, very public example of what happens when an LLM is connected to a privileged customer-support workflow without an authorization layer behind it.

How it works

The root cause is architectural; there was no secret string to type. In March 2026, Meta began rolling out an AI support assistant across Facebook and Instagram, giving it the ability to handle common account-recovery workflows (relinking a lost email address, triggering a password reset, verifying ownership) to reduce the friction of Instagram’s notoriously slow human support. The assistant was granted the capability to mutate account state. What it lacked was a reliable check that the person in the chat was actually the account owner.

According to the public reporting, the flow worked roughly like this:

1. Attacker uses a VPN to appear near the target's usual location
   (to avoid Instagram's automated geo-anomaly checks).
2. Attacker requests account recovery and opens a chat with the
   AI support assistant.
3. Attacker asks the bot to link a NEW email address — the
   attacker's — to the target's account.
4. Bot sends a one-time code to the attacker-controlled email.
5. Attacker relays the code back to the bot, which treats this as
   proof of ownership and surfaces a "Reset Password" action.
6. Attacker sets a new password and locks out the real owner.

The decisive flaw is in steps 3 and 4: at no point did the attacker need control of the email already on the account. The bot accepted a new, unverified email as sufficient to bootstrap recovery, then validated a code it had sent to that same attacker address. That is a circular trust check: it proves only that the requester controls the address the requester supplied. The model behaved helpfully and consistently; the system around it simply had no separation between “a user asking for help” and “an authenticated account owner.”
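The circularity is easy to see in a toy model. The sketch below is not Meta's code and makes no claim about its internals; `Account`, `Mailboxes`, `flawed_recovery`, `sound_recovery`, and `attacker_reader` are all hypothetical names invented for illustration. It only contrasts sending the code to a requester-supplied address with sending it to the factor already on file.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Account:
    handle: str
    verified_email: str  # the recovery factor already on file


@dataclass
class Mailboxes:
    """Toy mail system: remembers the last code delivered to each address."""
    inbox: dict = field(default_factory=dict)

    def send_code(self, address: str) -> str:
        code = f"{secrets.randbelow(10**6):06d}"
        self.inbox[address] = code
        return code


def attacker_reader(mail: Mailboxes, controlled: set):
    """The attacker can read only the mailboxes they control."""
    def read(address: str):
        return mail.inbox.get(address) if address in controlled else None
    return read


def flawed_recovery(account, mail, requested_email, read_inbox) -> bool:
    """Assumed shape of the reported flow: the requester picks the channel."""
    expected = mail.send_code(requested_email)   # code goes to the NEW address
    supplied = read_inbox(requested_email)       # requester reads their own inbox
    if supplied == expected:                     # proves control of the new address only
        account.verified_email = requested_email
        return True
    return False


def sound_recovery(account, mail, requested_email, read_inbox) -> bool:
    """Same flow, but the code goes to the factor already on the account."""
    expected = mail.send_code(account.verified_email)
    supplied = read_inbox(account.verified_email)
    if supplied is not None and secrets.compare_digest(supplied, expected):
        account.verified_email = requested_email
        return True
    return False


if __name__ == "__main__":
    mail = Mailboxes()
    read = attacker_reader(mail, {"attacker@example.net"})
    a = Account("target_handle", "owner@example.org")
    b = Account("target_handle", "owner@example.org")
    print("flawed flow hijacked:", flawed_recovery(a, mail, "attacker@example.net", read))
    print("sound flow hijacked: ", sound_recovery(b, mail, "attacker@example.net", read))
```

In the flawed version the check can never fail for an attacker, because the only thing it tests is something the attacker chose. In the sound version the same attacker gets nowhere without access to the owner's existing mailbox.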

Why it matters

This incident is significant for three reasons beyond the immediate damage, which included the theft of short, high-value usernames with an alleged resale value north of $500,000, per the Telegram channels that circulated the method.

First, it generalizes. Meta is not unique in wiring conversational AI into account recovery. Ian Goldin, a threat researcher at Lumen’s Black Lotus Labs, told Krebs that “AI chatbots create interesting new attack surface, and we’re likely going to see a lot more of these kinds of attacks.” Any platform that lets an LLM execute privileged account actions inherits this risk.

Second, it collapses the cost of social engineering. Human support agents can also be talked into unauthorized resets, but they are slow, rate-limited, and inconsistent. An always-on, infinitely patient bot that follows the same flow every time turns a craft into a script — one that spread across Telegram in hours.

Third, the model was working as designed. There is nothing to “patch” in the LLM itself. The failure is that a non-deterministic, persuadable component was placed in the trust path for an irreversible, high-impact action. That is an authorization design problem that no amount of prompt hardening fully fixes.

Defenses

The lessons are old security principles applied to a new component. Treat any LLM agent as an untrusted, persuadable actor and put real controls around it.

Enforce authorization outside the model. Sensitive state changes (email relink, password reset, MFA reset) must be gated by deterministic checks in application code — proof of control of the existing recovery factor — never by the model’s judgment of whether a request “seems legitimate.”
Apply least privilege to agents. Give the support bot read and triage capabilities; route any irreversible, account-mutating action through a separate, hardened service with its own verification, or to a human with explicit authorization steps.
Don’t let the requester supply the verification channel. Sending a code to a brand-new, attacker-provided address and accepting that code as proof of ownership is a circular trust check. Recovery codes must go to a pre-existing, verified factor.
Keep humans in the loop for high-impact actions. Irreversible operations should require an out-of-band confirmation step the model cannot satisfy on its own.
Turn on strong MFA — and require it. The attackers reported their method failed against any account with MFA enabled. Even SMS one-time codes blocked it; passkeys or hardware security keys are stronger. Platforms should treat MFA as a hard gate on recovery flows, and users should enable the most robust factor offered.
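
Several of these controls can be combined in one deterministic gate that sits between the model and its tools. The following is a minimal sketch under stated assumptions, not a description of any real platform's API: the tool names, the `SessionProof` fields, and the choice to escalate `reset_mfa` to a human are all hypothetical. The point it illustrates is that the decision depends only on facts established by application code, and never on anything the model or the chat transcript says.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Decision(Enum):
    ALLOW = auto()
    DENY = auto()
    ESCALATE_TO_HUMAN = auto()


# Least privilege: tools the model may call directly (hypothetical names).
READ_ONLY_TOOLS = frozenset({"lookup_help_article", "get_ticket_status"})
# Account-mutating tools that require proof from outside the conversation.
SENSITIVE_TOOLS = frozenset({"relink_email", "reset_password", "reset_mfa"})


@dataclass(frozen=True)
class SessionProof:
    """Facts set by application code, never parsed from the chat transcript."""
    existing_factor_verified: bool  # code confirmed via the factor already on file
    mfa_enrolled: bool
    mfa_passed: bool


def authorize(tool: str, proof: SessionProof) -> Decision:
    """Deterministic gate; the model's opinion of the request is not an input."""
    if tool in READ_ONLY_TOOLS:
        return Decision.ALLOW
    if tool not in SENSITIVE_TOOLS:
        return Decision.DENY                  # unknown tools are denied by default
    if proof.mfa_enrolled and not proof.mfa_passed:
        return Decision.DENY                  # MFA is a hard gate on recovery
    if not proof.existing_factor_verified:
        return Decision.DENY                  # no proof of the pre-existing factor
    if tool == "reset_mfa":
        return Decision.ESCALATE_TO_HUMAN     # assumption: needs out-of-band sign-off
    return Decision.ALLOW


if __name__ == "__main__":
    unauthenticated = SessionProof(False, False, False)
    print(authorize("get_ticket_status", unauthenticated))
    print(authorize("relink_email", unauthenticated))
    print(authorize("reset_mfa", SessionProof(True, True, True)))
```

Because `authorize` takes no free-text input, there is nothing in it for a persuasive requester to talk around; a request that lacks proof is denied no matter how the conversation went.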

Status

Item	Reference	Date	Notes
AI support assistant rollout	Multiple reports	2026-03	Bot given password-reset / account-maintenance capability
Method circulated on Telegram	Krebs on Security	2026-05-31	Pro-Iran channels publish video walkthrough
Public reporting	404 Media	2026-06-01	Verified across multiple outlets
Corroboration	TechCrunch	2026-06-01	Confirmed attacker mailbox received the code
Fix acknowledged	Instagram spokesperson Andy Stone	2026-06-01	“Issue is now fixed”; emergency patch over the weekend; no back-end database breach reported

The Meta AI support assistant did not get jailbroken. It was asked, politely, to do something it should never have been allowed to do unauthenticated. As more platforms hand account-recovery flows to conversational agents in 2026, the takeaway is blunt: an LLM is not an authorization boundary, and helpfulness is not identity verification.