OpenAI’s GPT-5.4 crossed a threshold its predecessors hadn’t. According to OpenAI’s own announcement, GPT-5.4 is “our first general-purpose model with native computer-use capabilities”, meaning the model doesn’t call an external tool to interact with a computer interface. It operates the interface directly. That’s not a marginal upgrade. It’s a different category of system.
The model ships in two variants, Thinking and Pro, and is available via ChatGPT and the API. Access starts at $20 per month on the Plus tier, with Team and Pro tiers above that, according to industry coverage of the release. The same coverage reports a context window of up to 1,050,000 input tokens and 128,000 output tokens, figures that have partial secondary corroboration but haven’t been confirmed directly from OpenAI’s documentation at time of publication.
On benchmarks, the picture is contested. AI industry coverage reports that GPT-5.4 Pro leads Artificial Analysis' Coding and Agentic sub-indices, though Epoch AI's independent evaluation is pending. The overall Intelligence Index position is less clear: available data from Artificial Analysis shows Gemini 3.1 Pro leading among reasoning models, not a tie. Practitioners making deployment decisions based on benchmark comparisons should treat current figures as directional until Epoch completes its evaluation.
The more consequential question isn't where GPT-5.4 lands on a leaderboard. It's what native computer use means for teams building or evaluating agentic systems. Previous architectures routed agent actions through defined function calls: the model requested an action, the host system executed it, and results returned. Native computer use collapses that boundary. The model sees a screen and acts on it. The attack surface is different. The authorization model has to be different too.
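The shift in authorization boundary can be made concrete with a minimal sketch. In a function-calling design, the host is the gate by construction, because it only exposes the tools it chooses. With screen-level actions, the host has to gate each raw UI action explicitly. All names here (`UIAction`, `authorize`, the action vocabulary) are hypothetical illustrations, not OpenAI's API:

```python
from dataclasses import dataclass

# Hypothetical sketch of a host-side gate for model-emitted UI actions.
# In a function-calling architecture this check is implicit: the host
# dispatcher simply never exposes dangerous tools. With native computer
# use, the model can attempt any screen action, so an explicit
# allowlist takes the dispatcher's place.

ALLOWED_ACTIONS = {"screenshot", "scroll", "click"}  # read-mostly defaults
# Text input, key presses, and drags change state less visibly,
# so this sketch refuses them outright.

@dataclass
class UIAction:
    kind: str    # e.g. "click", "type", "key"
    target: str  # e.g. a coordinate or an element description

def authorize(action: UIAction) -> bool:
    """Return True only for actions on the explicit allowlist."""
    return action.kind in ALLOWED_ACTIONS
```

The point of the sketch is not the specific allowlist but where the check lives: once the model operates the interface directly, the host's authorization logic has to sit between model output and the screen, rather than being baked into which tools exist.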
This isn’t the first time a frontier lab has shipped this capability. Anthropic introduced computer use for Claude in late 2024. GPT-5.4 represents the second major general-purpose model with native computer use, a pattern, not an outlier. The practical implication for enterprises evaluating agentic AI is that computer-use capability is becoming a standard feature of flagship models, not a specialized add-on. Security and integration teams that have been planning for “eventually” are now planning for “now.”
The workforce angle is worth naming plainly. Native computer use means an AI agent can execute sustained, multi-step workflows on a real computer: the same workflows that currently require a human operator. That's a direct automation path for categories of knowledge work that previous model architectures couldn't reach. The implications for teams evaluating AI-related workforce changes are covered in depth on the Job Displacement Hub.
One data point to watch: OpenAI hasn’t yet published detailed technical documentation on GPT-5.4’s computer-use architecture, specifically around sandboxing, permission scoping, and what actions the model can and cannot initiate without explicit instruction. Those details matter for enterprise deployment decisions. Until they’re public, architecture choices about how broadly to deploy computer-use capabilities should remain conservative.
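One way to stay conservative while the vendor documentation is pending is a default-deny policy: grant nothing that isn't explicitly listed, and route state-changing actions through human confirmation. The following is a sketch under stated assumptions; the action names and `Decision` modes are illustrative, not drawn from any published GPT-5.4 interface:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "require_human_confirmation"
    DENY = "deny"

# Explicit grants only. Anything not listed here (file writes, shell
# commands, network configuration, ...) falls through to DENY until
# the vendor documents its sandboxing and permission-scoping model.
POLICY = {
    "screenshot": Decision.ALLOW,    # read-only observation
    "scroll": Decision.ALLOW,
    "click": Decision.CONFIRM,       # state-changing, gated on a human
    "type_text": Decision.CONFIRM,
}

def decide(action: str) -> Decision:
    """Default-deny lookup: unlisted actions are refused."""
    return POLICY.get(action, Decision.DENY)
```

The design choice worth noting is the default: a deny-by-default table fails closed when the model attempts an action the deployment team never anticipated, which is exactly the failure mode that matters before the scoping guarantees are public.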