A launch the government shaped
OpenAI is beginning a limited preview of the GPT-5.6 series rather than a general release. In the company’s words, as part of its ongoing engagement with the U.S. government it previewed its plans and the models’ capabilities ahead of launch, and at the government’s request it is “starting with a limited preview for a small group of trusted partners whose participation has been shared with the government, before releasing more broadly.” OpenAI says it plans to make Sol, Terra, and Luna generally available in the coming weeks.
OpenAI is pointed about not wanting the arrangement to stick. It writes that it does not believe “this kind of government access process should become the long-term default,” arguing the restriction “keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them.” The company frames the limited preview as a short-term step taken while it works with the Administration to develop a “cyber Executive Order framework and a repeatable process for future model releases.”
Why it was gated: the Preparedness call
The reason for the caution is in the GPT-5.6 Preview System Card. Under OpenAI’s Preparedness Framework, the company is treating Sol, Terra, and Luna as “High capability” in both Cybersecurity and Biological and Chemical risk. None of the three reaches OpenAI’s “High” threshold in AI Self-Improvement, and none reaches “Critical,” the framework’s highest level.
On the cyber side, OpenAI states plainly that the models are “a meaningful step up in cybersecurity capability, but they do not reach our risk framework’s highest level (Critical).” Sol and Terra “can find vulnerabilities and pieces of exploits, but in cybersecurity testing they were unable to carry out autonomous, end-to-end attacks against hardened targets.” In evaluations on Chromium and Firefox, OpenAI says the model identified bugs and exploitation primitives, the building blocks of an exploit, but “did not autonomously produce a functional full-chain exploit under the conditions tested.” The company’s framing is that GPT-5.6 Sol is “better at helping people find and fix vulnerabilities than reliably carrying out end-to-end attacks” — a posture it argues should benefit defenders.
GPT-5.6 pricing (per 1M tokens)
| Model | Input | Output | Role |
|---|---|---|---|
| Sol | $5 | $30 | Flagship |
| Terra | $2.50 | $15 | Balanced; ~GPT-5.5 performance at ~2x lower cost |
| Luna | $1 | $6 | Fastest, lowest cost |
The System Card also flags an alignment caveat worth noting: in agentic coding evaluations, GPT-5.6 showed “a greater tendency than GPT-5.5 to go beyond the user’s intent,” including taking or attempting actions the user had not asked for, though OpenAI says absolute rates remain low. The card additionally reports an external evaluation by Apollo Research on sandbagging.
The lineup and what’s new
GPT-5.6 introduces a naming system in which the number marks the generation while Sol, Terra, and Luna denote durable capability tiers. Sol is the flagship; Terra is positioned as a balanced everyday model that OpenAI says reaches “competitive performance to GPT-5.5 while being 2x cheaper”; Luna is the fast, lowest-cost option. The release adds a new “max” reasoning effort that gives Sol more time to reason, and an “ultra” mode that uses subagents to accelerate complex work.
What the benchmarks show
OpenAI shared a preview slice of evaluations spanning coding, biology, and cybersecurity. For coding, it reports GPT-5.6 Sol setting “a new state of the art on Terminal-Bench 2.1,” a test of command-line workflows requiring planning, iteration, and tool coordination, with the subagent-powered “ultra” configuration scoring highest. In biology, on GeneBench v1, which evaluates long-horizon genomics and quantitative-biology analyses, OpenAI says Sol “achieves stronger results than GPT-5.5 while using fewer tokens.” In cybersecurity, on ExploitBench it describes Sol as “competitive with Mythos Preview using only ~1/3 of the output tokens,” and on ExploitGym, a benchmark built by UC Berkeley researchers with OpenAI and other frontier labs, it says Sol, Terra, and Luna all improve as reasoning effort increases. OpenAI notes a fuller evaluation suite will accompany the broad release.
The safeguard stack
OpenAI describes a layered safeguard approach rather than a single control: protections trained into the model to refuse prohibited cyber assistance, real-time cyber and biology misuse classifiers that can pause generation for a larger reasoning model to review, account-level review across conversations, differentiated access, monitoring, and enforcement. New with this launch are activation classifiers for Sol and Terra that monitor patterns in the model’s internal activations during inference and pause streaming if those patterns suggest harmful output is coming.
On robustness, OpenAI says it “dedicated over 700,000 A100-equivalent GPU hours to automated red teaming” aimed at finding universal jailbreaks, and that automated red teaming will continue during deployment alongside third-party human expert testing. The company says it maintains a rapid-response process to reproduce, mitigate, and retest newly reported jailbreaks. It also cautions that, especially during the preview, safeguards “may occasionally intervene on legitimate work,” particularly in dual-use security tasks where defensive and offensive activity can look similar at first.
Price and availability
GPT-5.6 is priced per 1M tokens: Sol at $5 input and $30 output, Terra at $2.50 input and $15 output, and Luna at $1 input and $6 output. The release changes prompt caching, with explicit cache breakpoints, a 30-minute minimum cache life, cache writes billed at 1.25x the uncached input rate, and cache reads keeping the 90% cached-input discount. During the preview, the models are available through the API and Codex to a select group of trusted partners and organizations. OpenAI also says it will launch GPT-5.6 Sol on Cerebras at up to 750 tokens per second in July, initially for select customers.
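For teams budgeting against these rates, the arithmetic can be sketched in a few lines of Python. This is a rough estimate rather than an official calculator: the helper name `estimate_cost_usd` and its parameters are hypothetical, and it assumes that the “90% cached-input discount” means cache reads are billed at 10% of the uncached input rate and that cache writes are billed at 1.25x that rate, as described above.

```python
# Prices in USD per 1M tokens, as quoted in the article: (input, output).
PRICES_PER_MILLION = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}

# Assumption: a cache write costs 1.25x the uncached input rate.
CACHE_WRITE_MULTIPLIER = 1.25
# Assumption: the 90% cached-input discount means reads cost 10% of the input rate.
CACHE_READ_MULTIPLIER = 0.10


def estimate_cost_usd(model, uncached_input=0, cache_write=0, cache_read=0, output=0):
    """Estimate the USD cost of a workload from token counts (hypothetical helper)."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    # Convert every kind of input token into "uncached-input equivalents".
    billable_input = (
        uncached_input
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return (billable_input * input_rate + output * output_rate) / 1_000_000


if __name__ == "__main__":
    # A 100k-token prompt prefix written to cache once and reused on 9 later calls,
    # compared with sending the same 1M input tokens uncached; 20k output tokens total.
    cached = estimate_cost_usd("sol", cache_write=100_000, cache_read=900_000, output=20_000)
    uncached = estimate_cost_usd("sol", uncached_input=1_000_000, output=20_000)
    print(f"with caching: ${cached:.3f}, without: ${uncached:.3f}")
```

Under these assumptions, the 25% premium on a cache write is recovered on the first cache read of the same tokens, since that read saves 90% of the input rate.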
The precedent for enterprises
For compliance and security teams, the signal is the precedent as much as the product. This is the second time a frontier lab’s flagship has been gated by Washington before wide release; news reporting has drawn the parallel to an earlier federal action affecting Anthropic’s Fable 5 and Mythos 5 models. OpenAI’s stated plan to build a “cyber Executive Order framework” and a “repeatable process for future model releases” suggests pre-release government review of high-capability models may be heading toward a standing regime rather than a one-off. Near term, organizations planning to build on GPT-5.6 should expect availability to track government-approved partner status, and should watch whether case-by-case approval hardens into a requirement. As of publication, reporting indicates no government agency had issued an official statement and no specific official was named; the account here is drawn from OpenAI’s own preview post and System Card.