V12 Published May 20, 2026

Your security stack is built on assumptions that no longer hold.

There's a pattern forming across a handful of unrelated research drops this week. Chris Hughes published a deep piece on why IAM systems can't govern AI agents. TrustedSec showed that LLMs can reverse engineer your EDR in days instead of weeks. Trail of Bits demonstrated that agentic browsers are replaying XSS and CSRF vulnerabilities the web community spent a decade fixing. And Datadog tore apart the full source code of TeamPCP's Shai-Hulud supply chain framework after it briefly hit GitHub. None of these stories share a codebase or a vendor. They share a root cause: security architectures built on assumptions about who (or what) is on the other end of the connection.

Let’s tune in.

This Week's Signals

IAM Systems Were Never Built to Govern AI Agents

Chris Hughes at Resilient Cyber published a comprehensive analysis of why the entire identity and access management stack breaks down when AI agents enter the picture. The core problem is that IAM was designed to answer "is this human who they claim to be?" Agents aren't humans, but they aren't traditional service accounts either; they reason about which tools to invoke at runtime, spawn sub-agents, and shift behavior with every interaction. Karl McGuinness, former chief product architect at Okta, framed it clearly: agents don't need identity passports telling the world who they are; they need authority grants telling the world what they can do. CoSAI's March 2026 framework proposes nine principles for agentic IAM, starting with treating agents as a new identity primitive and eliminating standing privileges entirely.

Why this matters: If your IAM model still forces a binary choice between "human" and "service account," you have no way to govern agents that authenticate to SaaS APIs, retrieve sensitive data, and take actions with real business consequences at machine speed. The OWASP Non-Human Identity Top 10 catalogs the risks. Agents inherit all of them and add non-deterministic behavior on top.
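To make the "authority grants, not identity passports" idea concrete, here is a minimal sketch of a short-lived, task-scoped grant. It is not taken from CoSAI's framework or from Hughes' piece; the `AuthorityGrant` class, the `issue_grant` and `is_allowed` helpers, the five-minute default lifetime, and the `crm:read` style action names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AuthorityGrant:
    """Hypothetical grant: what an agent may do, on what, and until when."""
    agent_id: str
    actions: frozenset   # e.g. {"crm:read"}
    resource: str        # the single resource this grant covers
    expires_at: datetime # every grant expires, so nothing is standing


def issue_grant(agent_id, actions, resource, ttl_seconds=300, now=None):
    """Issue a short-lived grant scoped to one task (TTL is an assumption)."""
    now = now or datetime.now(timezone.utc)
    return AuthorityGrant(agent_id, frozenset(actions), resource,
                          now + timedelta(seconds=ttl_seconds))


def is_allowed(grant, agent_id, action, resource, now=None):
    """Authorize on the grant's scope and lifetime, not on the agent's type."""
    now = now or datetime.now(timezone.utc)
    return (
        grant.agent_id == agent_id
        and action in grant.actions
        and resource == grant.resource
        and now < grant.expires_at
    )
```

The point of the shape is that the check never asks "human or service account?". It only asks whether this specific action on this specific resource is covered right now, and the answer turns to "no" on its own once the grant expires.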

LLMs Can Now Reverse Engineer Your EDR in Days

TrustedSec's Justin Elze published research showing that across five commercial endpoint products, workflows that previously took skilled reverse engineers weeks now take days with LLM assistance. The models handled mapping, summarization, and cross-version comparison while humans focused on validation. Detection rules, scoring thresholds, allowlists, exclusion paths, and even local ML classifiers shipped by EDR vendors all became inspectable. Elze's conclusion: the assumption that defensive tool internals would remain opaque to attackers is no longer valid.

The takeaway: Elze recommends defenders assume attackers have already reverse-engineered their EDR and know which techniques it misses. His practical response is to layer host hardening (WDAC, ASR rules, LSA Protection), build SIEM detections on raw telemetry instead of EDR verdicts, and lean hard on identity-layer detection where the logic isn't sitting on the endpoint waiting to be studied.
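As a rough illustration of "detections on raw telemetry instead of EDR verdicts," the sketch below flags process-access events that target LSASS based only on the raw event fields. The event schema (`event_type`, `source_image`, `target_image`, `edr_verdict`), the `lsass_access_alerts` function, and the idea of passing in an `expected_sources` allowlist are assumptions for the example, not something from Elze's research or any specific SIEM.

```python
def lsass_access_alerts(events, expected_sources):
    """Return raw process-access events that touch lsass.exe.

    The rule deliberately never reads ev["edr_verdict"], so an event the
    endpoint agent scored as benign can still raise an alert.
    """
    expected = {s.lower() for s in expected_sources}
    alerts = []
    for ev in events:
        if ev.get("event_type") != "process_access":
            continue
        target = ev.get("target_image", "").lower()
        source = ev.get("source_image", "").lower()
        if target.endswith("\\lsass.exe") and source not in expected:
            alerts.append(ev)
    return alerts
```

Because the logic lives in the SIEM and keys off raw fields, reverse engineering the endpoint agent does not reveal it, which is the property Elze's recommendation is after.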

Agentic Browsers Are Replaying XSS and CSRF From a Decade Ago

Trail of Bits published research showing that AI agents embedded in browsers are vulnerable to attacks that are functionally identical to cross-site scripting and cross-site request forgery. The root cause is inadequate isolation: agentic browsers automatically reuse cookies for agent-initiated requests, and LLMs cannot distinguish between data and instructions. The researchers demonstrated exploits ranging from false information injection to full cross-site data exfiltration across multiple agentic browser products. Their threat model identifies four trust zones and four violation classes, and their proposed fix is extending the Same-Origin Policy to AI agents rather than inventing something new.

Source: Trail of Bits

Practical takeaway: The web security community spent years building effective defenses against XSS and CSRF. Those defenses don't carry over automatically when you bolt an AI agent onto the browser. Trail of Bits' recommendation is to extend proven isolation principles to the agent layer instead of hoping alignment training will hold.
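For a sense of what "extending the Same-Origin Policy to the agent" could look like in practice, here is a minimal sketch of the origin comparison itself. This is not Trail of Bits' implementation: the `origin` and `may_attach_cookies` helpers and the notion of a "task origin" (the site the user asked the agent to work on) are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Default ports so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def may_attach_cookies(task_origin_url, request_url):
    """Allow ambient credentials only when the agent stays on the task's origin."""
    return origin(task_origin_url) == origin(request_url)
```

Under this rule, an agent summarizing a page on one site could still fetch a cross-origin URL that injected text told it to visit, but the request would go out without the user's cookies, which removes most of the CSRF-style impact.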

TeamPCP's Full Attack Framework Just Leaked on GitHub

Datadog's security research team published a teardown of the complete source code of Shai-Hulud, the TeamPCP offensive framework, which was briefly posted to GitHub before being removed. The TypeScript/Bun-based toolkit harvests credentials from over 100 file paths, extracts GitHub Actions Runner memory via /proc/pid/mem, enumerates AWS Secrets Manager and SSM across 17 regions, and exfiltrates data using hybrid encryption. The framework forges complete Sigstore provenance bundles (Fulcio certificates plus Rekor transparency logs), establishes persistence through VSCode tasks and Claude Code SessionStart hooks, and includes a destructive deadman switch (rm -rf ~/) that triggers on GitHub token revocation. Nineteen of the 22 previously documented TeamPCP TTPs matched the codebase.

Shai-Hulud architecture diagram. Source: Datadog.

Why I'm flagging it: This is the first time we've seen the full tooling behind a major supply chain attack group laid bare. It is admittedly a sophisticated system involving forging software supply chain provenance, poisoning npm packages via stolen OIDC tokens, and building persistence into developer tools. If you're running CI/CD pipelines on GitHub Actions, the IOCs in Datadog's report are worth checking against your environment today.
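Alongside the published IOCs, the two persistence locations named above are easy to eyeball. The sketch below is a review aid, not a detector and not a substitute for Datadog's indicators. It assumes the relevant VS Code tasks are the ones configured to run automatically on folder open (`runOptions.runOn` set to `folderOpen` in `tasks.json`) and that Claude Code hooks sit under `hooks.SessionStart` in a settings file; the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Return parsed JSON, or None if the file is missing or not strict JSON.

    VS Code allows comments in tasks.json, so a None here means
    "inspect by hand", not "clean".
    """
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None


def auto_run_tasks(tasks_config):
    """List VS Code tasks set to run as soon as the folder is opened."""
    return [
        task for task in tasks_config.get("tasks", [])
        if task.get("runOptions", {}).get("runOn") == "folderOpen"
    ]


def session_start_hooks(settings):
    """List whatever is registered under the SessionStart hook event."""
    return settings.get("hooks", {}).get("SessionStart", [])
```

A typical use would be to call `load_json` on each repository's `.vscode/tasks.json` and on any Claude Code settings files you find, then review every entry the two helpers return. Anything you did not put there yourself deserves a closer look.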

Mythos Scanned curl and Found Mostly Noise

Daniel Stenberg, the maintainer of curl, published a detailed review of what happened when the Mythos AI vulnerability scanner analyzed curl's 178,000 lines of C. Mythos flagged five "confirmed" vulnerabilities. After the curl security team's review, three turned out to be documented API behavior, one was an ordinary bug, and exactly one became a low-severity CVE. This matches what curl has seen from AISLE, Zeropath, and OpenAI Codex Security, which together triggered 200 to 300 bugfixes over the past 8 to 10 months.

AI tools are good at finding established kinds of errors but are not yet finding novel kinds of bugs.

Daniel Stenberg

What to watch: The hype around AI vulnerability scanning is running ahead of the evidence. One real CVE out of five "confirmed" findings is a useful data point for anyone evaluating these tools. The 200+ bugfixes triggered by multiple AI scanners across 8 to 10 months are genuinely impressive for code quality, but the gap between "finds bugs" and "finds novel vulnerabilities" matters when you're pricing these tools against your existing scanners.

Here is what I think
The Assumptions Underneath Your Security Stack Are Breaking

Here's what connects five otherwise unrelated stories this week: every one of them is about a security system that worked correctly under its original design assumptions and then failed when those assumptions changed. IAM assumed identities are either human or machine. EDR assumed its detection logic would stay opaque. Browser security assumed the entity making requests would respect origin boundaries. Supply chain provenance assumed Sigstore attestations couldn't be forged. Vulnerability scanners assumed AI would find what humans miss.

None of these were bad assumptions when they were made. The practical question isn't whether the systems built on them will eventually get fixed (spoiler: they will). CoSAI's framework gives IAM a roadmap. Trail of Bits' extended SOP proposal gives browsers one. Elze's layered defense recommendations give EDR-dependent teams a bridge. The question is what you do in the gap between the old assumption breaking and the new architecture arriving. For most teams, the answer is the boring one: layer your controls so that no single assumption's failure is catastrophic, and don't bet your security posture on any tool's internals staying hidden.

I don't think we're in a crisis. I think we're in a transition where the abstractions we built security on are getting re-examined under pressure from AI on both sides. The teams that come through it well will be the ones who noticed which assumptions they were depending on before those assumptions broke. If you've done that audit recently, I'd like to hear what you found. Reply and tell me.

Source: Greg Rakozy (Unsplash)

See you at FwdCloudSec Conference?

DevArmor is sponsoring FwdCloudSec in Bellevue on June 1st and 2nd, and we're co-hosting a happy hour on June 1st with the folks at C1. If you're going to be in the area, whether you're attending the conference or just happen to be in the Pacific Northwest, come grab a drink and talk shop. No slides, no pitches, just good conversation with people who care about this stuff. Here is the registration link. We'd love to see some of you there.

-Amir


Thanks for reading The AppSec Signal, DevArmor’s newsletter for security professionals.
Have feedback or ideas for what we should cover next?
Feel free to reach out - [email protected]
