neural-bridge.dev
/ AI Security · Working Paper · v0.2 · 8 min read

OWASP for AI: What It Is, How to Use It, Why It Matters

By Andy Herman

If you’ve worked in cybersecurity for more than five minutes, you’ve heard of OWASP. The OWASP Top 10, the list of the most critical web application security risks, has been the de-facto starting point for a generation of security work. Pen test reports cite it. Compliance frameworks reference it. Job descriptions list it.

When AI started shipping in production, OWASP didn’t pretend it was a web app. They built a separate project, a separate list, and they’re already on the second iteration. If you’re going to work in AI security, or even just build with AI, this is the framework you should know first.

What OWASP is, in 60 seconds

The Open Worldwide Application Security Project is a non-profit foundation that publishes free, vendor-neutral security guidance. It’s where the industry centralizes its best knowledge about how things break. The OWASP Top 10 for Web Applications, originally published in 2003 and updated periodically, became the closest thing the field had to a shared vocabulary.

OWASP doesn’t tell you HOW to fix things in detail. It tells you what to worry about and roughly why. Specific implementation is left to teams, books, vendors, and consultants. That looseness is the point. The Top 10 is a starting position, not a checklist.

The OWASP Generative AI Security Project

In 2023, when LLMs began landing in production, OWASP launched a dedicated initiative: the OWASP Generative AI Security Project. It’s grown fast. As of 2025-2026, the project covers:

  • Top 10 for LLM Applications, risks specific to systems that wrap LLMs
  • Top 10 for Agentic Applications, newer, focused on autonomous AI agents
  • AI Security Solutions Landscape, quarterly reports on the vendor ecosystem
  • Adversarial red-teaming methods for GenAI
  • Best practices for governance and oversight

The first list, the OWASP Top 10 for LLM Applications 2025, is what most practitioners encounter first. Let me walk through it.

The 2025 Top 10 for LLM Apps

Reading the Top 10 numerically is the wrong way to absorb it. The list isn’t a priority queue, it’s a taxonomy, and the items make a lot more sense once you see how they cluster. Walk it as a story instead, following content as it moves from input through processing to output, and the geography of the threat surface becomes obvious.

The story starts with input. LLM01: Prompt Injection is the gateway risk and the reason the rest of the list exists. Someone gets adversarial text in front of the model, either by typing it directly or by hiding it inside a document, web page, or tool response the model decides to read, and the model’s behavior pivots to follow the attacker’s instructions instead of yours. Indirect injection (the document case) is harder to catch than direct injection, because the model reads adversarial content the same way it reads anything else. This is the most cited category for a reason. I cover it at depth in Memory Poisoning in Personal Agentic AI Substrates.

Once injection is on the table, the next question is what the attacker can corrupt that survives a single session. That’s the LLM04: Data and Model Poisoning and LLM08: Vector and Embedding Weaknesses cluster. LLM04 is poisoning at the source (training data, fine-tuning sets, and increasingly the wikis and knowledge bases agents compile from raw input). LLM08 is the RAG-specific variant: poisoning the vector index, exploiting embedding similarity to exfiltrate context, or pulling in malicious documents through retrieval. The papers AgentPoison (NeurIPS 2024) and PoisonedRAG (USENIX Security 2025) showed that the data needed to flip an agent’s behavior is shockingly small, often single-digit documents in corpora of millions.

From there the story shifts to what the model gives away. LLM02: Sensitive Information Disclosure got promoted from #6 to #2 in the 2025 update because real incidents kept finding their way into the news, training data leaks, system prompts read back verbatim, RAG contexts bleeding across tenants. Its specialist sibling, LLM07: System Prompt Leakage, is its own category because system prompts have become a product moat: knowing the prompt makes it dramatically easier to plan future attacks or replicate the application’s behavior elsewhere.

What about the model’s output? Two different worries here. LLM05: Improper Output Handling is the trust failure: piping model output straight into a SQL query, a shell command, or a downstream API without treating it as untrusted. LLM09: Misinformation is the truth failure: the model says something confidently wrong, and humans or systems downstream treat the assertion as fact. (LLM09 replaced the older “Overreliance” entry in 2025, which felt a bit too hand-wavy to be actionable.) The two together make the case that LLM output has the same security posture as any other untrusted input.

Then there’s the pair that determines blast radius. LLM06: Excessive Agency is what turns a small compromise into a big one. Give an agent broad tool access, broad permissions, or broad autonomy, and a successful injection at LLM01 doesn’t just produce a bad answer, it sends an email, deletes a file, or moves money. LLM10: Unbounded Consumption is the same pattern aimed at your wallet: prompt-driven cost attacks, runaway token use, infinite tool loops, denial-of-service via your own quota. Together they ask the same question from different angles: what can the adversary make this thing do once it’s been compromised?

Finally, the question of where it all came from. LLM03: Supply Chain got promoted to #3 because it’s where many of the other risks originate. The base model, the embeddings, the third-party plugins, the open-source tools the agent calls — every dependency is attack surface, and “we trusted the upstream” is a common root cause when an LLM02 or LLM04 issue hits production.

That’s the whole list, and the clustering is the actual mental model: input attack (01) → source pollution (04, 08) → disclosure (02, 07) → output trust (05, 09) → amplifier (06, 10) → pedigree (03). If you internalize nothing else, that walk will get you 80% of the way to thinking sensibly about LLM security.

The 2026 Top 10 for Agentic Applications

In December 2024, OWASP launched the Agentic AI Security Initiative to address what was clearly a different shape of problem: autonomous AI systems running multi-step workflows, holding persistent memory, and reaching out to tools on their own. The result, the OWASP Top 10 for Agentic Applications 2026, is a separate list and not a subset of the LLM list.

The reason it had to be separate is that agency changes the threat model. Once an LLM has tools, the classic confused-deputy problem reappears in a new form: indirect injection convinces the agent to invoke a sensitive capability the user never asked for, and the agent’s own permissions are what cause the damage. Once it has long-term memory, you get persistence: a single bad day’s input becomes a permanent belief, which is the entire premise of memory poisoning. Once it can plan multi-step actions, an attacker can subvert its goal without ever touching the system prompt — just nudge intermediate decisions until the agent walks itself somewhere bad. Multi-agent setups make all of this worse, because contamination in one agent leaks through shared memory or message-passing to others. And resource exhaustion stops being a token-count problem and becomes runaway delegation: agents calling agents calling agents, no obvious circuit breaker.

For build-in-public people like me building Neural Bridge, the agentic list is the more relevant one. Most of my Layer 4 and Layer 5 threats in the Neural Bridge security architecture map directly into it.

How to actually use OWASP for AI

There are three increasingly serious ways to use this material, and most teams will end up doing all three.

The lightest use is just vocabulary. Once “LLM05” means “improper output handling” to everyone in the room, conversations get noticeably faster and more honest. Half the value of the Top 10 is replacing fuzzy phrases like “the AI does sketchy things sometimes” with shared shorthand the whole team understands. Don’t underestimate that.

The next step up is self-assessment. Walk each item against your system and answer two questions: are we exposed, and if so, what stops it? The answers will be uneven — most early-stage systems have strong controls in two or three categories and effectively nothing in the rest. That uneven answer is the value of the exercise. A personal project will burn about thirty minutes on this; a small SaaS runs half a day; an enterprise system takes a sprint and turns up things people had been quietly worrying about. A workable template looks like this:

LLM01 — Prompt Injection
  Exposure: [yes/no/partial]
  Vector(s): [where adversarial input could enter]
  Current control: [what stops it]
  Residual risk: [what remains; rate L/M/H]

Run it down the list, write the answers down, and revisit quarterly. The writing down part matters more than it sounds — it’s how you notice that the residual-risk column is identical to last quarter’s, and someone needs to actually do the work.

The heaviest use is regulatory backstop, and this is where OWASP earns its keep in 2026. A growing number of laws now reference “recognized” AI risk frameworks as the basis of reasonable care. The Colorado AI Act explicitly accepts the NIST AI RMF, ISO/IEC 42001, or other internationally recognized frameworks as an affirmative defense, and OWASP fits the shape of “framework an auditor will recognize” even though it isn’t a management system in the ISO sense. If you’re a security team responding to “demonstrate your AI risk program,” you almost certainly want OWASP cited somewhere in the answer. The companion piece, AI Security Regulation in 2026, walks the regulatory side in detail.

What OWASP for AI doesn’t cover well

Honest limitations worth knowing:

  • Deep technical mitigations. The Top 10 tells you to worry about prompt injection. It doesn’t tell you the best filter. That’s left to vendors, papers, and your judgment.
  • Adversarial machine learning specifics. Pre-LLM AI security research (model inversion, membership inference, evasion attacks) is alive and well, mostly outside OWASP’s scope. For that, look to the NIST Adversarial Machine Learning Taxonomy and academic literature.
  • Sector-specific compliance. Healthcare, finance, government: OWASP doesn’t replace HIPAA, DORA, or FedRAMP requirements. It complements them.
  • Operational specifics. Incident response, SOC integration, threat intelligence: OWASP is high-level guidance, not a runbook.

The Top 10 is a starting position. Treat it that way.

How to keep up

OWASP moves fast in AI. Practical sources:

  • OWASP GenAI Security Project blog for official updates
  • The two Top 10 lists, both with version histories on GitHub
  • The Solutions Landscape reports, quarterly
  • Local OWASP chapters often run AI security meetups; useful for both networking and seeing what’s actually getting deployed

If you’re going to focus your reading time, prioritize the LLM Top 10 → the Agentic Top 10 → the Solutions Landscape, in that order.

Further reading

See also

  • 01 - Memory Poisoning in Personal Agentic AI Substrates — companion paper digging into LLM01 / LLM04 / LLM08
  • 01 - AI Security Regulation in 2026 — companion piece on the regulatory side
  • Security Architecture — Neural Bridge threat model, mapped to agentic risks