Blog
Deterministic security for AI agents — prompt-injection failure modes, middleware patterns, and the engineering behind ActPass.
How an LLM agent escapes a system prompt — and how deterministic middleware blocks it
Adaptive prompt injection beats prompt-only defenses because the attacker moves after your system prompt. The fix is not a better instruction; it is an execution boundary that verifies every risky tool call before it runs.
Your prompt-injection eval is lying if the attacker cannot adapt
Static prompt-injection benchmarks reward defenses that recognize yesterday’s strings. Real attackers optimize against your exact guard, so the runtime gate must be deterministic even when the detector loses.
Why did the agent call that tool? Causal attribution for LLM actions
Input-level prompt-injection detection asks whether text looks malicious. Action-level attribution asks the better question: was this tool call caused by the user’s task or by untrusted content?
How to secure web agents that read hostile HTML and screenshots
Web agents ingest HTML, screenshots, forms, ads, comments, and invisible text. Treat every observation as hostile, then block the action that would move private data or state outside the boundary.
The plan/policy/enforcer pattern for secure AI agents
Secure agents need a planner, a policy, and an enforcement boundary. If the same model writes the plan and self-authorizes every tool call, prompt injection wins at runtime.
Your agent just read a file, hit an API, and got user input — now what?
Static exposure scans tell you which agents could form the Lethal Trifecta. Runtime coloring catches the call that actually completes it and blocks that call before the prompt injection exfiltrates anything.
An AI chatbot sold a $58,000 SUV for $1 — your bot can too
A customer told a dealership’s AI chatbot to agree to everything, then got it to accept a $1 offer for a 2024 Chevy Tahoe as “a legally binding offer, no takesies backsies.” It took two sentences. Here’s why every customer-facing bot is one prompt away from the same mistake.
A stranger's Reddit comment can drain your inbox through your AI browser
You asked your AI browser to "summarize this page." Hidden in the page was an instruction telling it to open Gmail, read your verification code, and post it publicly. It obeyed. This is indirect injection — and it needs nothing from you but a click.
The friendly MCP tool that quietly reads your SSH keys
An "add two numbers" MCP tool had a hidden instruction in its description telling the model to read your config files and ship them off — silently. You install MCP servers all the time. Do you read their tool descriptions?
The Lethal Trifecta is hiding in your AI agents — find it in 60 seconds
Prompt injection has no patch. The only durable defense is removing dangerous capability combinations. ActPass scans your agents and tells you exactly which ones are exposed, with no runtime component and nothing blocked.
Govern Claude Code and Codex from inside the chat
A seatbelt that also teaches. Pair your laptop once, and risky native-tool calls (rm -rf, force-push, secret edits) get a deterministic, explainable nudge — with policy you can relax from the chat itself.