What Launched
OpenAI confirmed on March 6, 2026 that Codex Security is in research preview. The tool functions as an agentic AI system: it reads a code repository, builds an internal model of what the application does, identifies security vulnerabilities, and generates fix suggestions. It isn’t a static analysis scanner. It doesn’t just pattern-match against a known vulnerability database. It reasons about what the code actually does.
Access is available now to ChatGPT Enterprise, Business, and Education subscribers. The first month is free. Pricing beyond that hasn’t been announced publicly.
The Numbers, and Why They Require Context
AI Business and The Hacker News both report OpenAI’s claimed figures: 1.2 million commits scanned, 792 critical-severity issues identified, 10,561 high-severity problems flagged, and 14 CVEs surfaced across major open-source repositories.
Start with the 14 CVEs. CVEs (Common Vulnerabilities and Exposures) live in an external registry. They’re assigned by a network of CVE Numbering Authorities, not self-designated by the company reporting a vulnerability. Legal and technology commentary firm Harper Foley acknowledged these 14 CVEs as real. That matters. It means at least the CVE portion of OpenAI’s benchmark has a verification path that doesn’t end at OpenAI’s press release.
The other numbers don’t have the same standing. “792 critical issues” is a metric OpenAI defines, counts, and reports. What counts as “critical” is a classification decision. Different tools, different methodologies, different thresholds produce different counts from identical codebases. Until an independent security research team runs Codex Security against a published test corpus and compares results, these aggregate figures are data points about OpenAI’s methodology, not benchmarks against an external standard.
None of that means the tool doesn’t work. It means the numbers tell you less than they appear to.
The Agentic Security Context
Codex Security doesn’t arrive in isolation. In the same week, OpenAI also acquired Promptfoo, an open-source platform for LLM and agent red-teaming and evaluation. Together, the two moves sketch a pattern: OpenAI is building security infrastructure around its agentic AI ecosystem, not just capability.
That’s a meaningful shift. The dominant story about AI security until recently was external: how do you secure your systems against AI-powered attacks? The emerging story is internal: how do you use AI to secure your own software pipelines? Codex Security is OpenAI’s opening position on the internal question.
For enterprise security teams already using ChatGPT Enterprise, this creates a specific evaluation decision. The tool is available. The first month is free. Running it against a well-understood internal repository, one where your team already knows the vulnerability profile, gives you a ground-truth comparison that vendor benchmarks can’t provide.
What DevSecOps Teams Should Do With the Free Month
Three concrete uses worth prioritizing during the research preview window.
First: run Codex Security against a repository your team has already audited through a traditional SAST tool. Note what it finds that the SAST missed, and what the SAST found that Codex Security missed. That comparison is more valuable than any vendor claim.
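Neither tool’s real output schema is public, so any comparison harness is an assumption. As a sketch, if you normalize both tools’ findings down to a shared identity (file, rule, line) — all hypothetical field names here — the overlap math is a pair of set differences:

```python
import json

# Hypothetical, normalized finding records. Neither Codex Security's nor
# your SAST's real export format is assumed here; the premise is only that
# each finding can be reduced to a (file, rule, line) identity.
sast_findings = [
    {"file": "auth/session.py", "rule": "hardcoded-secret", "line": 42},
    {"file": "api/upload.py", "rule": "path-traversal", "line": 108},
]
codex_findings = [
    {"file": "api/upload.py", "rule": "path-traversal", "line": 108},
    {"file": "db/query.py", "rule": "sql-injection", "line": 15},
]

def to_keys(findings):
    """Collapse each finding to a hashable identity for set comparison."""
    return {(f["file"], f["rule"], f["line"]) for f in findings}

sast, codex = to_keys(sast_findings), to_keys(codex_findings)

report = {
    "both": sorted(sast & codex),        # agreement between the two tools
    "sast_only": sorted(sast - codex),   # what Codex Security missed
    "codex_only": sorted(codex - sast),  # what the SAST missed
}
print(json.dumps(report, indent=2))
```

The `sast_only` and `codex_only` buckets are where the evaluation value lives: each entry is a concrete claim one tool makes that the other doesn’t, which your team can adjudicate against the repository you already understand.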
Second: evaluate fix quality, not just detection. A tool that finds vulnerabilities but generates fixes that don’t compile, or that introduce new issues, has limited production utility. Detection and remediation are separate capabilities. Test both.
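A minimal sketch of the cheapest fix-quality gate, assuming a Python codebase and suggested fixes delivered as replacement source text (both assumptions, not anything the product documents): reject any suggestion that doesn’t even parse before spending reviewer time on it. The real checks are compiling the project and running its test suite; this only filters obvious breakage.

```python
import ast

def fix_parses(patched_source: str) -> bool:
    """Cheapest fix-quality gate: is the suggested fix syntactically
    valid Python at all? A fix that fails here never reaches review."""
    try:
        ast.parse(patched_source)
        return True
    except SyntaxError:
        return False

# Two hypothetical suggested fixes for the same finding.
good_fix = "def safe_path(p):\n    return p.replace('..', '')\n"
bad_fix = "def safe_path(p)\n    return p.replace('..', '')\n"  # missing colon

print(fix_parses(good_fix))  # True
print(fix_parses(bad_fix))   # False
```

The same staged approach extends naturally: parse, then type-check, then run the tests the vulnerable code already had. Each stage is cheaper than the human review it saves.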
Third: assess the workflow integration. Does the tool fit into your existing CI/CD pipeline? Does it produce output in a format your team can act on without manual translation? Enterprise security tooling lives or dies on integration friction.
The Pricing Question
No public pricing has been announced for the period after the free month. Enterprise AI security tooling is a high-value segment, and OpenAI will price accordingly. Teams evaluating Codex Security should use the free window as a genuine technical evaluation, not just exploration, because the decision about whether it earns a budget line will follow shortly after.
The Bottom Line
Codex Security is a real product with a real launch date, real CVE findings, and real access for enterprise subscribers starting now. The benchmark numbers are vendor-reported and need independent validation before they become comparative baselines. The free month is a concrete evaluation opportunity. For ChatGPT Enterprise teams with active security programs, the calculation is simple: run it on something you know, see what it does, and form a judgment before the commercial pricing conversation starts.