CodeWall hacked McKinsey's AI Platform Lilli Through Unprotected API Endpoints

CodeWall’s autonomous AI agent hacked Lilli, McKinsey's internal AI platform, in less than 2 hours. It found 22 unauthenticated API endpoints and gained access to 46.5 million chat messages and 728 000 files. The attack used SQL injection, one of the oldest bug classes in existence. That's the part that should worry you.
____________________________________________________________________________

The Incident

On February 28, 2026, a security company called CodeWall pointed an autonomous offensive AI agent at the internet and let it pick a target. It chose McKinsey & Company’s Lilli, the consulting firm's internal AI platform serving more than 43,000 employees worldwide.

By the time the two-hour operation was over, the agent had full read and write access to Lilli's production database– 46.5 million chat messages, 728,000 files, 57,000 user accounts, and 95 system prompts.

McKinsey's security team received CodeWall's disclosure email on March 1. The CISO acknowledged it the next day and immediately patched the unauthenticated endpoints. Shortly after, McKinsey stated publicly that there was no evidence that client data had been accessed by unauthorized parties.

What was this exotic hack that breached the consulting giant? A regular SQL injection. It was a simple exploit in the 1990s and, apparently, it still works in 2026. All because of 22 unauthenticated API endpoints.

Inside Lilli: What Was at Stake

McKinsey launched Lilli in 2023, naming it after Lillian Dombrowski, the firm's first professional female hire in 1945. By the time of the breach, more than 70% of the firm's employees used it. The platform indexed over 728,000 internal documents, ran retrieval-augmented generation (RAG) over decades of proprietary research, and processed more than 500,000 prompts a month.

What made Lilli a high-value target: The database contained user data and AI configuration which included system prompts, RAG knowledge base, and behavioral guardrails. Anyone with write access to that database could silently change how the AI behaved for all 43,000+ users.

This is what separates an AI platform breach from a conventional database breach. A traditional breach steals data. An AI platform breach like Lilli’s case corrupts the advice that thousands of consultants receive without leaving any trace in application logs.

The consultants get wrong information from their internal (and trusted) tool about strategy, mergers, acquisitions, and client engagements, without even realizing it’s wrong.

The Attack Chain, Step by Step

CodeWall’s agent followed a fairly standard inspection procedure, but at machine speed:

Step 1: Map the attack surface. The agent found that Lilli's API documentation was publicly accessible, exposing more than 200 endpoints. Among those, 22 required no authentication. One of those unauthenticated endpoints accepted user search queries and wrote them to the database.

Step 2: Find the injection point. The search endpoint correctly parameterized its query values, which is a standard protection against SQL injection. But the JSON keys, the field names themselves, were stringed directly into the SQL query. When the agent sent malformed key names, the database reflected them verbatim in its error messages. That reflection told the agent everything it needed: the database was talking back.

In just 15 blind iterations, the agent refined its injections, reading more from each error response until production data started returning.

Step 3: Chain with IDOR. Once inside, the agent combined the SQL injection with an Insecure Direct Object Reference vulnerability. IDOR enabled access to individual user records by ID, exposing each employee’s search history and revealing what McKinsey’s consultants were actively researching.

The combination of these two vulnerabilities gave the agent unrestricted read-and-write database access.

What Got Exposed

Category	Volume
Chat messages	46.5 million
Files (Excel, PPT, PDF, Word)	728,000
User accounts	57,000
System prompts	95
RAG document chunks	3.68 million
AI workspaces	94,000

Those 3.68 million RAG chunks represent McKinsey's intellectual base: proprietary frameworks, internal research, and client-engagement methodologies built over decades. The chunks came with their S3 storage paths and internal metadata intact, so anyone who had extracted them would know exactly where the originals lived.

The 95 system prompts carried the AI's behavioral rules: how to answer questions, which guardrails to apply, and how to cite its sources. They were stored in the same database as everything else.

Because the SQL injection granted write privileges, an attacker could have issued a single UPDATE statement via an HTTP request and changed how Lilli answered every question from that point forward. No code deployment or detectable system change. Just silent drift in how 43,000 employees' AI assistant behaved.

OWASP ZAP, one of the most widely used web application security scanners, did not flag this vulnerability. The SQL injection was in the key names, not the values—and that particular pattern sits outside most scanner rulesets.

Why System Prompt Access Changes the Calculus

In traditional applications, database write access is contained within a defined scope. An attacker who can write to your users' table can create accounts or change passwords. An attacker who can write to your session table can hijack logins. The damage is real but contained.

When an AI platform stores its configuration in a SQL database, write access becomes a different matter. The system prompt defines how the AI behaves in every interaction with a user:

Modify a prompt to suppress certain types of responses and the system silently stops providing them.
Add an instruction to include a URL in every financial recommendation, and the AI will do it reliably for every consultant who asks.
Change the citation behavior and years of accumulated professional trust in the system start pointing in a direction you choose.

None of these changes show up in a traditional audit trail, because no code changed. The application behaves exactly as designed—it reads its instructions from the database and follows them. The instructions just aren't the original ones anymore.

This is not a hypothetical. The McKinsey breach showed that production AI systems at major enterprises are storing behavioral configuration alongside user data in standard relational databases, behind the same authentication (or lack of it) that protects everything else.

What API Intelligence Would Have Caught: The AI Platform Security Gap

The attack left a trail, but no one was looking for it.

The 22 unauthenticated endpoints were not hidden—they were documented in publicly accessible API documentation. Any system tracking authentication status across all observed endpoints would have surfaced them as an immediate gap. Treblle captures authentication context as one of 40+ data points per API request, meaning the authentication status of every endpoint shows up in aggregate analysis from the moment traffic flows through it.

The blind SQL injection itself generated a distinctive pattern: 15 iterative requests to the same endpoint, each with slightly modified key names, each producing database error responses. That sequence—repeated error-generating calls escalating in specificity—doesn't look like legitimate user behavior. Treblle's security scanning covers SQL injection as one of more than 20 threat categories it checks on every request, and the error-response pattern provides a verification signal.

According to Treblle's analysis of over 1 billion API requests per month, unauthenticated endpoints remain one of the most common critical findings across enterprise APIs. The McKinsey case—22 of 200+ endpoints requiring no authentication—fits a pattern Treblle regularly sees at production scale. (Source: Treblle, Anatomy of an API 2025)

IDOR vulnerabilities are hard to detect in real time, while the behavior used to exploit them is easier to identify. Sequential ID-based requests across user accounts produce recognizable access patterns. Monitoring these patterns helps you detect an attack as it happens, instead of waiting for a scanner to find the vulnerability.

The point isn't that monitoring would have prevented the vulnerability from existing. It's that visibility into API behavior in production gives you a second line of defense when static analysis and scanners miss something, as they did here.

The Broader Pattern

This is the third high-profile incident in recent months in which an API security failure in an AI system led to consequences beyond typical data breach outcomes. In each case—the Anthropic distillation attacks, the Moltbook session takeover, and now McKinsey—the technical vulnerability was not novel.

The novelty was what that vulnerability enabled in an AI-specific context. Treblle's security scanning data shows that SQL injection remains one of the top three most frequently detected threat types across monitored APIs, decades after the exploit class was first documented.

Treblle's data across 1 billion monthly API requests shows that enterprises building AI-native systems are deploying them with the same security debt that plagued their pre-AI APIs, plus new configuration surfaces—system prompts, RAG pipelines, model parameters—that traditional security tooling wasn't built to monitor.

The McKinsey breach was an illustration of that gap. The SQL injection was discoverable by any competent security review. What made it dangerous was the target: a production database that also contained the behavioral instructions for a widely used AI system.

If your AI platform stores its configuration in a relational database—and most do—then your AI security posture is only as strong as your database security posture. An old problem with new consequences.

What This Means for Your API Stack

The McKinsey breach will get filed under "enterprise AI security incident" in most post-mortems, but the root cause is older and simpler than that framing suggests. An authenticated production system had 22 endpoints that skipped authentication. One of those endpoints had a SQL injection flaw in a place scanners don't usually check. The database it touched also contained the AI system's behavioral configuration.

If you're building or operating an AI platform and want to see your authentication gaps across all active endpoints, Treblle provides that picture in real time from a single instrumentation point—without requiring separate tooling for API monitoring, security scanning, and governance.
____________________________________________________________________________

The Incident Inside Lilli: What Was at Stake The Attack Chain, Step by Step What Got Exposed Why System Prompt Access Changes the Calculus What API Intelligence Would Have Caught: The AI Platform Security Gap The Broader Pattern What This Means for Your API Stack

Frequently Asked Questions

What is McKinsey's Lilli platform?

Lilli is McKinsey & Company's internal AI platform, launched in 2023 and used by more than 70% of the firm's 43,000+ employees. It provides AI-powered chat, document analysis, and retrieval-augmented generation over a knowledge base of more than 728,000 internal documents, processing over 500,000 prompts per month.

How did the SQL injection in the McKinsey breach work?

The vulnerability was in how the platform's search endpoint processed JSON-formatted queries. While the query values were safely parameterized, the JSON key names—the field identifiers themselves—were concatenated directly into the SQL statement. When the agent sent malformed key names, the database returned error messages containing those keys verbatim, confirming the injection point. The agent then used blind SQL injection techniques across 15 iterations to extract production data from those error responses.

What makes AI platform breaches different from standard data breaches?

When AI systems store their behavioral configuration—system prompts, model parameters, RAG pipeline settings—in the same databases as user data, a standard data breach can also constitute an AI integrity attack. An attacker with write access can alter how the AI behaves for all users without making any code changes, leaving no trace in traditional deployment or code audit logs.

Why didn't standard security scanners catch this vulnerability?

The SQL injection resided in JSON key names rather than query values—a pattern that falls outside the signature sets of most automated scanners, including OWASP ZAP, which was reported to have missed this specific flaw. Traditional scanners look for injection in values because that's where parameterization typically fails. Key-name injection requires the scanner to test field identifiers as injection vectors, which not all tools do by default.

What should teams building AI platforms prioritize after this breach?

Three things matter most. First, all API endpoints should require authentication—22 unauthenticated endpoints in a system handling confidential client data is a governance failure that any API inventory review would surface. Second, AI configuration (system prompts, RAG settings) should be stored separately from user data, with stricter write controls and change logging. Third, production API monitoring should be in place to detect behavioral anomalies—repeated error-generating requests, sequential ID enumeration, unusual endpoint access patterns—before they complete an attack chain.

How quickly did McKinsey respond after disclosure?

[CodeWall](https://codewall.ai/blog/how-we-hacked-mckinseys-ai-platform) sent its disclosure email to McKinsey's security team on March 1, 2026. The CISO acknowledged receipt and requested detailed evidence the following day. Patches for the unauthenticated endpoints were applied shortly after, the development environment was taken offline, and public API documentation was restricted. McKinsey stated there was no evidence that unauthorized parties had accessed client data.