March 16, 2026 · Edition #6
Built-in AI Security Is a Sensor, Not a Solution
OpenAI buying Promptfoo is genuinely good news for the industry. Embedding automated red-teaming directly into the platform raises the security floor for every application built on it. More model-layer testing, available to more developers by default, is a net positive.
But the story that dominated this week tells a different tale.
The McKinsey breach didn't happen at the model layer. There was no jailbreak. No prompt injection against the LLM. The vulnerability was a combination of exposed API documentation, 22 unauthenticated endpoints, and a SQL injection hidden in JSON key names. An AI agent found it in two hours, but a human pentester with a weekend could have done the same. And the most dangerous finding, 95 writable system prompts, sits at the intersection of infrastructure and AI: a configuration management failure that could have silently corrupted every answer the platform delivered.
Promptfoo wouldn't have caught this. No model-layer testing would. The breach happened in the deployment infrastructure: the APIs, the databases, the access controls, and the system prompt management.
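To make "SQL injection hidden in JSON key names" concrete, here is a minimal Python sketch of the general bug class. It is not a reconstruction of the actual breach, whose queries I haven't seen. The `users` table, its columns, and the helper names (`make_db`, `build_filter_unsafe`, `build_filter_safe`) are all hypothetical. The point is that binding values as parameters does nothing if the keys are pasted into the SQL text.

```python
import json
import sqlite3

# Hypothetical allow-list of filterable columns (illustration only).
ALLOWED_COLUMNS = {"name", "email"}


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a@example.com"), ("bob", "b@example.com")],
    )
    return conn


def build_filter_unsafe(payload: dict) -> tuple[str, list]:
    """Values are bound as parameters, but keys go straight into the SQL."""
    clauses = [f"{key} = ?" for key in payload]
    sql = "SELECT name FROM users WHERE " + " AND ".join(clauses)
    return sql, list(payload.values())


def build_filter_safe(payload: dict) -> tuple[str, list]:
    """Same query shape, but keys must match a fixed allow-list."""
    if not payload:
        raise ValueError("at least one filter is required")
    unknown = set(payload) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    clauses = [f"{key} = ?" for key in payload]
    sql = "SELECT name FROM users WHERE " + " AND ".join(clauses)
    return sql, list(payload.values())


if __name__ == "__main__":
    conn = make_db()
    # The attacker controls the key, not the value.
    body = json.loads('{"1 = 1 OR name": "nobody"}')
    sql, params = build_filter_unsafe(body)
    print(sql)
    print(conn.execute(sql, params).fetchall())  # every row comes back
    try:
        build_filter_safe(body)
    except ValueError as exc:
        print("rejected:", exc)
```

A scanner that only fuzzes JSON values would pass the unsafe version, which is one reason this class of bug can sit unnoticed. None of it involves the model.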
This is the pattern I keep coming back to: built-in AI security is a welcome addition, but treating it as "done" is like treating EDR as your entire security program. EDR is essential. It's also one sensor in a stack that includes identity, network, posture, data protection, and more. The same is true for model-layer security. It's one signal in a much bigger picture.
Securing AI means seeing across the full stack (model, infrastructure, data, identity, integrations), not just the layer your provider happens to own.