OWASP LLM Top 10: What It Means for API Security

The OWASP Top 10 is an awareness document that represents a consensus on the most critical security risks facing web apps today.

We had the OWASP API Security Top 10 and the OWASP Web Application Top 10. Then, in 2023, we got the first OWASP LLM Top 10. It focuses on the ten vulnerabilities that aren't (really) covered in the other OWASP models. This is relevant for API teams because the vulnerabilities manifest through APIs.

See, most LLM applications aren't a standalone model. They're exposed through an API, with clients sending inputs and receiving outputs over HTTP, so their attack surface IS the API layer.

Prompt injection arrives as a request payload. Sensitive information disclosure leaves as a response. Supply chain risks live in the services the API calls downstream.

Understanding the OWASP LLM Top 10 as an API security problem determines where the controls go and what monitoring catches.

What the OWASP LLM Top 10 covers

The ten categories address risks that arise from three distinct properties of LLM applications, where the model:

Accepts natural language input (which is harder to validate than structured input),
Generates natural language output (which can contain anything, including data it shouldn't reveal), and
Behavior emerges from training data and prompting rather than deterministic code (which makes it harder to audit and predict).

The ten categories are:

Prompt Injection (LLM01),
Insecure Output Handling (LLM02),
Training Data Poisoning (LLM03),
Model Denial of Service (LLM04),
Supply Chain Vulnerabilities (LLM05),
Sensitive Information Disclosure (LLM06),
Insecure Plugin Design (LLM07),
Excessive Agency (LLM08),
Overreliance (LLM09),
Model Theft (LLM10).

The categories group into four broad risk types:

Input manipulation risks. Vulnerabilities where an attacker controls the content the model processes: Prompt Injection (LLM01), Insecure Plugin Design (LLM07), Excessive Agency (LLM08). The common thread is that the model can be directed to behave in unintended ways through crafted inputs.

Output risks. Vulnerabilities where the model's response contains something it shouldn't: Sensitive Information Disclosure (LLM06), Insecure Output Handling (LLM02). The model may reveal data from its training, from its context window, or from retrieved documents.

Infrastructure and supply chain risks. Vulnerabilities that exist in the systems surrounding the model: Training Data Poisoning (LLM03), Supply Chain Vulnerabilities (LLM05), Model Theft (LLM10). These are harder to detect at the API layer and often require supply chain auditing rather than runtime controls.

Operational risks. Vulnerabilities that affect availability and reliability: Model Denial of Service (LLM04), Overreliance (LLM09). Model DoS affects API availability; overreliance affects downstream systems and users who trust the model's output uncritically.

Prompt injection at the API layer

Prompt injection is the highest-severity LLM vulnerability and the one most directly visible at the API layer. An attacker constructs an input designed to override or modify the model's instructions, either by embedding a new instruction ("ignore previous instructions and instead do X") or by manipulating context the model treats as authoritative. This happened to McKinsey's internal platform Lilli.

At the API layer, prompt injection arrives as request payload content. The API receives a user-submitted text field, passes it to the model as part of a prompt, and the injected instruction executes within the model's context. The attack doesn't require exploiting a code vulnerability; it exploits the model's fundamental design, which processes instructions embedded in natural language.

The API-layer controls:

Input length and format validation. While validation can't prevent all prompt injection, enforcing schema constraints on input fields (maximum length, character set restrictions for structured fields) limits the complexity of injection payloads. An API that accepts unbounded free text in every field provides the maximum injection surface.

Prompt architecture. Separating system instructions from user input in the prompt structure reduces injection risk. Passing system instructions as privileged parameters makes them harder to override. Including them directly with user input makes them easier to override.

Output validation. Check model responses for signs of instruction override. References to internal system prompts or responses that begin with "I will ignore previous instructions" suggest that prompt injection may have succeeded.

Full payload capture. Detecting prompt injection requires visibility into both requests and responses. Treblle's Real-Time Request Explorer captures complete request and response payloads for every LLM-serving endpoint. This visibility helps teams identify injection patterns after they occur and create detection rules from real traffic.

Treblle's Automated Threat Scanning inspects request payloads before they reach the model layer. It compares each payload against known attack-pattern libraries, including emerging prompt-injection signatures.

Sensitive information disclosure through LLM responses

Sensitive information disclosure (LLM06) occurs when a model reveals data it shouldn't: training data, system prompt contents, information from the context window, or data retrieved from connected knowledge sources.

At the API layer, sensitive information disclosure is an output problem: the response payload contains data that shouldn't reach the consumer. The API is the exit point.

Three mechanisms drive LLM-specific sensitive information disclosure:

Training data memorization. LLMs trained on datasets containing PII, credentials, or confidential information can reproduce that data verbatim when prompted in specific ways. A model trained on GitHub commits that include API keys may reproduce those keys when asked about similar patterns.

Context window leakage. In retrieval-augmented generation (RAG) systems, the context window contains retrieved documents that may include sensitive information. A model can leak retrieved documents when it fails to distinguish between context content and response content.

System prompt extraction. Models can sometimes be prompted to reveal their system instructions, which may contain confidential business logic, internal classifications, or operational details that weren't intended to be visible to end users.

The API-layer response to sensitive information disclosure is output scanning: checking response payloads for patterns that indicate sensitive data before the response reaches the consumer. Treblle's Sensitive Data Masking detects and strips PII, credentials, and secrets from response payloads. It operates at the API layer where the response exits, regardless of whether the sensitive data originated from training data, context retrieval, or application logic.

Sensitive information disclosure in LLM APIs is structurally different from data leakage in standard APIs. In a standard API, a developer controls what data the response serializer includes. In an LLM API, the model generates the response content and can include things the developer never anticipated exposing. Output scanning is the only reliable control.

Supply chain risks in AI-powered APIs

Supply chain vulnerabilities (LLM05) cover risks introduced by the external components an LLM application depends on: the model itself (obtained from a third-party provider or downloaded from a model hub), the training data used to fine-tune or adapt the model, the plugins and tools the model can call, and the retrieval infrastructure that provides the model's knowledge context.

At the API layer, supply chain risks are harder to detect at runtime because they're embedded in the application's construction rather than in individual requests. The controls are primarily design-time:

Model provenance verification. Confirming that models obtained from third-party sources haven't been tampered with between training and deployment. For models downloaded from model hubs (Hugging Face, Ollama), this means verifying checksums and reviewing the model's training and fine-tuning history before deploying.

Plugin and tool surface area. Every plugin or function tool the model can invoke is an extension of the attack surface. A model that can call an internal database, send emails, and execute code has a far larger blast radius for prompt injection than a model that can only return text. Auditing and minimizing the tool surface is a design-time control.

Retrieval source auditing. In RAG systems, the quality and integrity of the retrieval corpus directly affects the model's outputs. A retrieval source that an attacker can write to (a shared knowledge base, an editable wiki) is a vector for indirect prompt injection: the attacker doesn't need to manipulate the request, only the documents the model retrieves.

For teams evaluating their AI API readiness, including supply chain posture, Treblle's AI Readiness Score assesses the structural properties of AI-powered APIs: parameter descriptions, schema completeness, operation IDs, and response examples that make the API's behavior auditable and predictable.

How standard API security monitoring applies to LLM traffic

Several OWASP LLM Top 10 categories aren't unique to LLM applications. They're standard API security risks that manifest in specific ways when the API calls a model.

Model Denial of Service (LLM04) is functionally equivalent to API abuse: a consumer sends requests designed to consume disproportionate computational resources, whether by sending extremely long inputs, crafting prompts that trigger long inference chains, or exploiting model behaviors that generate unusually large outputs. The countermeasure is the same as for standard API DoS: rate limiting per consumer, input length limits, and response size caps. An API without rate limiting on its LLM-calling endpoints is as exposed as any other unrate-limited API, but the resource cost per request is higher.

Insecure Output Handling (LLM02) occurs when model output is passed to downstream systems without validation or sanitization. An LLM that generates HTML, SQL, or shell commands that are then executed without sanitization creates injection vulnerabilities downstream: SQL injection from model-generated queries, XSS from model-generated HTML. The vulnerability is standard injection; the novelty is that the injection payload originates from the model rather than directly from the user.

Excessive Agency (LLM08) occurs when an LLM is given capabilities beyond what the use case requires: the ability to read and write files, send network requests, or execute code when the use case only requires generating text. The principle of least privilege applies as directly to model capabilities as it does to API permissions.

For all three, the controls are the same controls that apply to any API: rate limiting, input validation, output sanitization, and minimal surface area. The API security best practices pillar and the OWASP API Security Top 10 cover the detection and prevention framework; LLM applications require the same controls plus the model-specific layers above.

What changes when your API is an LLM wrapper

An LLM wrapper API accepts user input, passes it to a model, and returns the model's output with minimal transformation on either side. Most consumer-facing AI applications follow this pattern, and it creates a specific security profile.

Authentication matters more. An unauthenticated LLM wrapper endpoint is both an API vulnerability and a cost vulnerability: every unauthenticated request triggers model inference, which has a real compute cost. Authentication on LLM endpoints isn't optional.

Rate limiting matters more. The compute cost per request for LLM inference is orders of magnitude higher than for most API endpoints. An LLM endpoint without rate limiting is a cost center with unlimited exposure.

Payload inspection matters more. The request payload is the attack vector. Standard API security monitors response codes and headers; LLM API security requires monitoring payload content: both request inputs (for injection patterns) and response outputs (for sensitive data leakage).

Output is non-deterministic. The same request can produce different responses on different calls. Testing-based security assurance is insufficient: you can test a specific input and confirm the output is acceptable, but the same input may produce a different output in production. Continuous monitoring of production responses is the only reliable control for non-deterministic output.

Treblle's full-payload capture and Automated Threat Scanning apply directly to LLM wrapper APIs: every request-response pair is logged in full and scanned against threat patterns. This continuous monitoring compensates for non-deterministic model behavior.

For the broader context of AI-powered API governance and documentation requirements, the AI governance for APIs article covers the governance layer.

To start scanning LLM API traffic against threat patterns and masking sensitive data in model responses, Treblle connects from a single SDK integration.

4 ways how Treblle helps

Automated Threat Scanning scans every request payload against 20+ threat categories, including emerging prompt injection signatures. It applies at the API layer before the payload reaches the model and operates directly on full request content.

Sensitive Data Masking detects and strips PII, credentials, and secrets from request and response payloads before storage. Applied to LLM response outputs, it catches sensitive information disclosure at the exit point, regardless of whether the sensitive data originated from the model's training, its context window, or application logic.

Real-Time Request Explorer captures complete request and response payloads for LLM-serving endpoints. It provides the forensic record needed to identify prompt injection patterns retrospectively, investigate sensitive data incidents, and build detection rules from observed production payloads.

AI Readiness Score evaluates AI-powered API structural properties: parameter descriptions, schema completeness, operation IDs, and response examples. It surfaces design-time gaps, including insufficient parameter documentation and missing response schemas, that make API behavior harder to audit and predict.

The 2025 API Security Checklist

Stay ahead of emerging threats with our 2025 API Security Checklist.

Download Ebook

What the OWASP LLM Top 10 covers Prompt injection at the API layer Sensitive information disclosure through LLM responses Supply chain risks in AI-powered APIs How standard API security monitoring applies to LLM traffic What changes when your API is an LLM wrapper 4 ways how Treblle helps

Frequently Asked Questions

What is the OWASP LLM Top 10?

The OWASP LLM Top 10 is a list of the ten highest-risk vulnerability categories in applications built on large language models, published by the Open Web Application Security Project in 2023. The ten categories are: Prompt Injection (LLM01), Insecure Output Handling (LLM02), Training Data Poisoning (LLM03), Model Denial of Service (LLM04), Supply Chain Vulnerabilities (LLM05), Sensitive Information Disclosure (LLM06), Insecure Plugin Design (LLM07), Excessive Agency (LLM08), Overreliance (LLM09), and Model Theft (LLM10). Each maps to risks that arise from how LLMs process natural language input, generate natural language output, and interact with surrounding infrastructure.

What is prompt injection in LLM APIs?

Prompt injection is an attack where an adversary embeds instructions in user-controlled input that override or modify the LLM's intended behavior. In an API context, the injection arrives as request payload content: a user-submitted text field passed to the model as part of a prompt. The injected instruction executes within the model's context because the model processes instructions embedded in natural language. API-layer controls include input length and format validation, architectural separation of system instructions from user input, output validation for injection indicators, and full payload capture for retrospective detection.

How does the OWASP LLM Top 10 differ from the OWASP API Security Top 10?

The OWASP API Security Top 10 covers vulnerabilities in API design and implementation: broken authentication, excessive data exposure, lack of rate limiting, and injection vulnerabilities in structured inputs. The OWASP LLM Top 10 covers vulnerabilities that arise from the specific properties of LLM applications: natural language input (which is harder to validate than structured input), non-deterministic output (which can contain anything, including data that should remain private), and emergent behavior from training and prompting. Several LLM Top 10 categories (denial of service, output injection) overlap with API Security Top 10 categories; others (prompt injection, sensitive information disclosure from model memory, training data poisoning) are specific to LLM applications.

How do you secure an LLM wrapper API?

An LLM wrapper API, one that accepts user input, passes it to a model, and returns the model's output, requires several controls. Every endpoint needs authentication, since unauthenticated requests create both security and cost exposure. Per-consumer rate limiting is essential because LLM inference costs are high and unlimited requests create unbounded cost. The API also requires input scanning for prompt injection patterns, output scanning for sensitive data disclosure, and full payload capture for monitoring and incident response. Because LLM output is non-deterministic, testing-based security assurance must be supplemented by continuous production monitoring.

What is sensitive information disclosure in LLM applications?

Sensitive information disclosure (LLM06) occurs when an LLM reveals data it shouldn't, including memorized training data (PII or credentials present in the training corpus), system prompt contents (revealing internal instructions or business logic), or context window contents (documents retrieved for RAG that contain sensitive information). At the API layer, this manifests as response payloads that contain sensitive data. The control is output scanning: checking response content for sensitive data patterns before the response is returned to the consumer.

OWASP LLM Top 10: What It Means for API Security

What the OWASP LLM Top 10 covers

Prompt injection at the API layer

Sensitive information disclosure through LLM responses

Supply chain risks in AI-powered APIs

How standard API security monitoring applies to LLM traffic

What changes when your API is an LLM wrapper

4 ways how Treblle helps

Frequently Asked Questions

Related Articles

API Testing: A Practical Guide for Backend Teams

Shift Left Security: Applying It to API Development

API Security Posture Management: What It Is and Why It Matters