The Prompt Layer Vulnerability: How an Autonomous AI Hacked McKinsey’s Lilli Platform

The Silent Compromise of McKinsey’s Crown Jewel AI Assets

In a stark demonstration of the rapidly evolving cyber threat landscape, an autonomous offensive AI agent developed by CodeWall breached McKinsey & Company’s internal AI platform, Lilli, in just two hours. By exploiting an unauthenticated SQL injection, the machine-speed attacker gained full read and write access to the firm’s highly sensitive proprietary data and underlying system prompts without any human intervention. This unprecedented infiltration exposes a critical oversight in modern enterprise security: the virtually unguarded prompt layer, which fundamentally controls how trusted enterprise AI systems behave and interact with sensitive corporate intelligence.

Key Points

CodeWall’s autonomous offensive agent targeted McKinsey & Company’s internal AI platform, Lilli, which serves over 70% of the firm’s 43,000+ employees.
Within two hours, the AI agent secured full read and write access to the production database without requiring credentials, insider knowledge, or a human-in-the-loop.
The breach began with an unauthenticated SQL injection in one of twenty-two unprotected API endpoints; that endpoint reflected JSON keys directly into SQL error messages.
Over 46.5 million plaintext chat messages containing strategic discussions, client engagements, M&A activity, and financials were exposed.
The agent extracted 728,000 files, including 192,000 PDFs, 93,000 Excel spreadsheets, and 93,000 PowerPoint decks, via direct download URLs.
The hack revealed the entire organizational structure of McKinsey’s internal AI usage by compromising 57,000 user accounts, 384,000 AI assistants, and 94,000 workspaces.
A staggering 3.68 million RAG document chunks, representing decades of proprietary McKinsey research, were discovered lying unprotected in the database.
The most damaging finding was write access to Lilli’s prompt layer: the system’s behavioral guardrails could be silently rewritten with simple UPDATE statements.
On March 2, 2026, McKinsey’s CISO officially acknowledged the responsible disclosure and fully patched all unauthenticated endpoints to secure the environment.
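
The injection path described above, JSON keys ending up in SQL, can be closed with an allowlist. The sketch below is a minimal illustration, not Lilli's actual code: it assumes the flaw arose because client-supplied JSON keys reached the SQL text, and the table name searches, the column names, and the functions build_insert and save_search are all hypothetical. Keys are checked against a fixed set, values go through bound parameters, and the error message never echoes the offending key back to the caller.

```python
import sqlite3

# Hypothetical column names for an illustrative "searches" table.
ALLOWED_COLUMNS = {"user_id", "query", "workspace_id"}


def build_insert(payload: dict) -> tuple[str, list]:
    """Build a parameterized INSERT, rejecting any key not on the allowlist."""
    if not payload or set(payload) - ALLOWED_COLUMNS:
        # Deliberately generic: reflecting the bad key would aid an attacker.
        raise ValueError("payload contains unsupported fields")
    columns = sorted(payload)  # only allowlisted names reach the SQL text
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO searches ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, [payload[c] for c in columns]


def save_search(conn: sqlite3.Connection, payload: dict) -> None:
    """Insert one row; values are bound by the driver, never concatenated."""
    sql, params = build_insert(payload)
    conn.execute(sql, params)
```

Parameterizing values alone would not be enough in this scenario, because identifiers such as column names cannot be bound as parameters; that is why the keys are validated separately.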

Key Takeaway

If you think locking the front door to your corporate server room is enough to deter modern hackers, it is probably time to wake up and smell the autonomous AI. To prevent an army of machines from broadcasting your most sensitive strategic conversations, start by verifying that your multimillion-dollar AI infrastructure isn’t vulnerable to a basic SQL injection from the 1990s. Most importantly, start treating your AI instructions like the digital crown jewels they actually are, because right now, your supposedly “unhackable” system is surprisingly open to suggestions from literal robots.
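
One way to treat AI instructions as crown jewels is to refuse any system prompt that has changed since it was last reviewed. The sketch below is only an illustration under stated assumptions: the signing key lives outside the database (for example in a secrets manager), each reviewed prompt's signature is recorded at approval time, and the function names sign_prompt and load_verified_prompt are hypothetical. An attacker with database write access could still alter the prompt text, but without the key the alteration would fail verification instead of silently taking effect.

```python
import hashlib
import hmac


def sign_prompt(prompt: str, key: bytes) -> str:
    """Return an HMAC-SHA256 signature for a reviewed system prompt."""
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


def load_verified_prompt(stored_prompt: str, expected_signature: str, key: bytes) -> str:
    """Return the prompt only if it still matches its recorded signature."""
    actual = sign_prompt(stored_prompt, key)
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(actual, expected_signature):
        raise RuntimeError("system prompt failed integrity check")
    return stored_prompt
```

A plain hash stored next to the prompt would not help here, since the same UPDATE access could rewrite both; the keyed signature is what ties the check to something the database cannot change.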
