V13 Published June 11, 2026

Agents stopped being theoretical this week.

Sysdig published what they believe is the first confirmed in-the-wild cyberattack run entirely by an AI agent. Meta's own AI recovery chatbot got socially engineered into handing over hundreds of Instagram accounts. And while all of that was happening, the people building the defensive side of this equation published some of the clearest thinking I've seen on which security products survive and what identity primitives need to exist for agents. The gap between "agents are coming" and "agents are here" closed this week.

Let's dig in (no pun intended).

This Week's Signals

Meta's AI Chatbot Handed Hackers the Keys to Instagram

Threat actors compromised hundreds of high-profile Instagram accounts last week by simply asking Meta's AI-powered account recovery chatbot to hand them over. The chatbot, which had API access to account management systems, happily linked attacker email addresses to targeted accounts and sent password reset codes. No exploit kit or zero-day. Just a polite request to a bot with too much authority and no ability to verify intent. The compromised accounts included the Obama White House handle, Sephora, and the Chief Master Sergeant of the Space Force.

Why this matters: This is the clearest real-world example of the "confused deputy" problem in AI agents. The bot authenticated the request (sort of) but never verified authorization. FusionAuth's Dan Moore put it well: the industry is focused on keeping AI from saying bad things while overlooking whether AI should be allowed to do what it's trying to do.
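To make the authentication-versus-authorization gap concrete, here is a minimal sketch of the check that was missing. Everything in it is hypothetical (the `RecoveryRequest` type, the `verified_owner_of` field, the function names); it says nothing about how Meta's systems are built. The point is only that the privileged action sits behind a check the model cannot be talked out of.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryRequest:
    account_id: str               # account the requester wants to recover
    new_email: str                # address the requester wants linked
    verified_owner_of: frozenset  # accounts the requester has proven control of


class NotAuthorized(Exception):
    """Raised when the requester has not proven ownership of the target account."""


def is_authorized(req: RecoveryRequest) -> bool:
    # Authorization comes from out-of-band proof (assumed to exist here),
    # never from what the requester tells the chatbot.
    return req.account_id in req.verified_owner_of


def link_recovery_email(req: RecoveryRequest, accounts: dict) -> None:
    # The check runs inside the tool, so a persuasive prompt cannot skip it.
    if not is_authorized(req):
        raise NotAuthorized(req.account_id)
    accounts[req.account_id] = req.new_email
```

A request with an empty `verified_owner_of` set raises `NotAuthorized` no matter how politely it is phrased, and the account mapping stays untouched.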

Sysdig Confirms the First Autonomous AI Cyberattack

Sysdig's Threat Research Team published what they call the first confirmed in-the-wild attack where an AI agent autonomously conducted the entire post-exploitation chain. The initial access came through CVE-2026-39987, a pre-auth RCE in Marimo (a Python notebook platform). From there, the attacker handed control to an LLM agent that extracted cloud credentials, called the AWS Secrets Manager API across 11 distinct IPs in 22 seconds using Cloudflare Workers as exit nodes, and pivoted through an SSH bastion to dump a full PostgreSQL database. Total time from initial access to exfiltration: under 60 minutes. The SSH phase alone took less than two minutes.

The data point worth sitting with: A Chinese-language planning comment (translating to "see what else we can do") leaked into the command stream. That's the agent's internal reasoning breaking through into execution. We are watching attackers replace scripts with agents.
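The 11-IPs-in-22-seconds pattern is the kind of signal you can check mechanically. Here is a rough sketch, assuming your API audit events can be reduced to (timestamp, access key, source IP) tuples sorted by time. The function name, tuple layout, and thresholds are illustrative assumptions, not Sysdig's detection logic.

```python
from collections import defaultdict, deque


def flag_ip_fanout(events, window_seconds=30.0, max_distinct_ips=5):
    """Return access keys seen from more than max_distinct_ips within the window.

    events: iterable of (timestamp_seconds, access_key, source_ip), sorted by time.
    """
    recent = defaultdict(deque)  # access_key -> deque of (timestamp, ip)
    flagged = set()
    for ts, key, ip in events:
        window = recent[key]
        window.append((ts, ip))
        # Drop events that have fallen out of the sliding window.
        while window and ts - window[0][0] > window_seconds:
            window.popleft()
        if len({seen_ip for _, seen_ip in window}) > max_distinct_ips:
            flagged.add(key)
    return flagged
```

A human operator rarely rotates source addresses this quickly with a single credential, so even a crude threshold like this separates agent-driven fan-out from normal use. Tune the window and the IP count to your own traffic before alerting on it.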

Ross Haleliuk: Four Questions That Tell You If Your Security Product Survives AI

Ross Haleliuk published one of the most practical frameworks I've seen for evaluating which security product categories survive the AI transition. It boils down to four questions: (1) Does the product do in-line access or security enforcement? (2) Does solving the problem require runtime data? (3) Does it need a model of a complex system? (4) Can the problem be fixed with a single pull request? If the answer to that last question is yes, AI is going to destroy much of the value the product provides today. Ross argues that products with runtime data, enforcement points, and complex system models will not just survive but get stronger because AI increases the value of proprietary telemetry and in-line control.

Practical takeaway: If you're evaluating vendors or building a security product in 2026, run it through these four questions before you write the check. Categories like EDR, browser security, and proactive application security pass.
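The four questions reduce to a small checklist. This is a sketch of one way to encode it; the field names and the output shape are my reading of the framework as summarized above, not Ross's own formulation, and the answers for any real product are yours to supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductProfile:
    inline_enforcement: bool      # Q1: in-line access or security enforcement?
    needs_runtime_data: bool      # Q2: does solving the problem require runtime data?
    needs_system_model: bool      # Q3: does it need a model of a complex system?
    fixable_with_single_pr: bool  # Q4: can the problem be fixed with one pull request?


def assess(profile):
    traits = [
        ("in-line enforcement", profile.inline_enforcement),
        ("runtime data", profile.needs_runtime_data),
        ("complex system model", profile.needs_system_model),
    ]
    return {
        # Traits the framework says gain value as AI matures.
        "durable_traits": [name for name, present in traits if present],
        # A "yes" on Q4 is the warning sign.
        "at_risk": profile.fixable_with_single_pr,
    }
```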

Identity Is the Agentic AI Problem Nobody Has Solved Yet

Chris Hughes at Resilient Cyber published a deep dive on why the entire IAM stack breaks when agents enter the picture. The core issue: traditional IAM answers "is this human who they claim to be?" Agents are not humans, but they act with autonomy that service accounts never had. Karl McGuinness (former Okta chief product architect) frames it sharply: agents don't need identity passports telling the world who they are; they need authority grants telling the world what they can do. CoSAI's Workstream 4 published nine imperatives for agentic IAM, including eliminating standing privilege entirely. And the IETF OAuth Working Group is processing at least seven agent-related drafts simultaneously, which tells you how unsettled this space still is.

What to watch: CoSAI's three-phase adoption model (visibility, contextual access control, full runtime enforcement) is the most realistic roadmap I've seen. Most enterprises are somewhere between phase one and phase two. The protocol scramble at IETF will likely consolidate in 6 to 12 months, but agents are already deployed and operating with permissions most security teams can't enumerate.
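McGuinness's "authority grant" framing, combined with the no-standing-privilege imperative, can be sketched as a short-lived, task-scoped grant that is checked on every action. Everything here is hypothetical (the `Grant` type, the action strings, the five-minute lifetime); it is not drawn from CoSAI's text or any IETF draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    agent_id: str
    actions: frozenset  # what the agent may do, e.g. {"tickets:read"}
    expires_at: float   # epoch seconds; no grant lives forever


def issue_grant(agent_id, actions, now, ttl_seconds=300.0):
    # Grants are minted per task with a short lifetime instead of
    # attaching standing permissions to the agent.
    return Grant(agent_id, frozenset(actions), now + ttl_seconds)


def is_allowed(grant, agent_id, action, now):
    # Checked at call time: right agent, action in scope, grant still live.
    return (
        grant.agent_id == agent_id
        and action in grant.actions
        and now < grant.expires_at
    )
```

The grant describes what the agent can do and for how long, not who the agent is. A side effect is that the set of live grants becomes an enumerable record of agent permissions, which is the visibility most teams are missing.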

Claude Code Skills Are Now an Attack Surface

Reversec's James Henderson and Trail of Bits both published research the same week showing that Claude Code skills and sub-agents are viable initial access vectors, comparable to installing an untrusted pip package. Henderson demonstrated two attack paths: one abuses skill frontmatter to execute shell commands before the LLM even processes the prompt, and the other uses sub-agents with bypassed permission modes to run npm install against a backdoored local registry. Trail of Bits took the next step and tried to bypass every major skill scanner (ClawHub, Cisco's agent scanner, all three on skills.sh). They broke all of them in a few hours using techniques like prepending 100,000 newlines to hide payloads, embedding malicious code in .docx archives and poisoned .pyc bytecode files, and prompt-injecting the guard models.

What to do with this: Trail of Bits' recommendation is blunt: avoid public skill marketplaces entirely. Curate internal skill repositories from trusted sources. The scanners failed because they truncated files, ignored binaries, and let attackers iterate against static targets. If your team is adopting coding agents, treat skill distribution like you treat package management. Because it is.
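Two of the three scanner failures (truncating files and ignoring binaries) suggest cheap checks you can run before a skill ever reaches a scanner or a model. This is a rough sketch, assuming a skill is a directory of files. The thresholds and the extension allowlist are illustrative assumptions, and passing these checks does not make a skill safe; it only closes the specific gaps described above.

```python
from pathlib import Path

MAX_BYTES = 256 * 1024            # anything larger gets rejected, not truncated
MAX_LEADING_NEWLINE_BYTES = 50    # padding beyond this is treated as hiding
TEXT_SUFFIXES = {".md", ".txt", ".py", ".sh", ".json", ".yaml", ".yml"}


def precheck_skill(skill_dir):
    """Return a list of (relative_path, reason) findings for a skill directory."""
    findings = []
    root = Path(skill_dir)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = str(path.relative_to(root))
        data = path.read_bytes()
        if path.suffix.lower() not in TEXT_SUFFIXES or b"\x00" in data:
            # Reject instead of skipping: ignored binaries were one bypass.
            findings.append((rel, "non-text or unexpected file type"))
            continue
        if len(data) > MAX_BYTES:
            # Reject instead of truncating: truncated files were another.
            findings.append((rel, "file too large to review in full"))
        leading = len(data) - len(data.lstrip(b"\r\n"))
        if leading > MAX_LEADING_NEWLINE_BYTES:
            findings.append((rel, "suspicious run of leading newlines"))
    return findings
```

An empty findings list means "nothing obviously hidden," not "approved." The design choice is to fail closed: a file the check cannot fully read is a finding, never a silent pass.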

My Take:
The Same Technology, Both Sides of the Line

Here's what this week made clear: the same AI capabilities that defenders are building into their security stack are now actively being used by attackers in production. The defensive side is moving too. Trail of Bits is building code graph infrastructure that lets AI reason about blast radius and taint propagation. CoSAI published nine imperatives for agentic IAM. The IETF OAuth Working Group is processing at least seven agent-related drafts. Ross Haleliuk's framework tells us which product categories will matter (runtime data, enforcement, system models) and which won't (anything fixable with a single PR). The pieces exist. They just aren't assembled yet.

What worries me is the asymmetry. Attackers don't need standards, governance frameworks, or enterprise procurement cycles. They need a model, a prompt, and a target. Defenders need all of those things plus organizational buy-in and budget. The median enterprise security team is still debating whether to put agents in a pilot.

I don't know whether the defensive tooling will catch up before the next Sysdig-scale incident hits a Fortune 500. But I know the organizations that are already building agent identity, scoping agent permissions, and instrumenting agent behavior will be the ones that survive it. If you're reading this and your team doesn't have a position on how you govern autonomous agents, this is the week to start. What's your plan? Reply and tell me.

Until Next Week

This was a dense one. The agent era is here, which means threat models from six months ago are stale. If your team hasn't mapped how autonomous agents move through your environment, start now.

If this changed how you're thinking about agent security, forward it to someone on your team who needs to see it.

One more thing before you go: if you'll be at Black Hat this year, we're co-hosting The CISO Roast with C1 on August 5 in Las Vegas. Invite-only, open bar, and a room full of CISOs talking straight for once. Save the date, seats are limited: Register in Luma

-Amir


Thanks for reading The AppSec Signal, DevArmor’s newsletter for security professionals.
Have feedback or ideas for what we should cover next?
Feel free to reach out - [email protected]
