Three data points arrived in the same week. They explain each other.
OpenAI spent approximately $34 billion in FY2025. SemiAnalysis found that a fully-utilized $200/month subscription delivers up to $14,000 in API-equivalent compute. And enterprise teams that shifted to token billing two weeks ago hit costs they hadn’t modeled. Put them together and the picture isn’t a coincidence, it’s a structural dynamic that the subscription pricing model was always going to encounter, and agentic workloads are the catalyst that’s bringing it forward.
What SemiAnalysis Actually Tested
The methodology matters here. SemiAnalysis didn’t run projections from vendor documentation. They bought the plans and ran them, extended coding sessions, agentic tasks, the workloads that run continuously rather than the back-and-forth of consumer chat. Their findings: a fully-utilized $200/month ChatGPT Pro plan delivers the equivalent of up to $14,000 in API-priced tokens. Claude Max at the same price delivers up to $8,000.
The subsidy ratio runs 40 to 70 times the subscription price. For context: the informal rule of thumb in AI pricing circles had been that subscriptions run at roughly 10x API value for average users. SemiAnalysis’s figures, at the high-utilization end, are four to seven times worse than that rule of thumb.
One framing correction before going further: these are API-equivalent figures, not the labs’ actual compute cost. API pricing includes margin. What it costs OpenAI or Anthropic to serve a maximally active subscriber is likely lower than $14,000 in raw infrastructure terms, but the economic exposure is the same direction. Heavy users are expensive. The subsidy is real even if the exact dollar figure is a ceiling rather than a floor.
The Distribution Problem
The flat-fee model has always relied on usage distribution. Light users subsidize heavy ones. The lab prices the plan at a point where the blended cost of all subscribers, some who open the app twice a week, some who run all-day coding sessions, produces acceptable unit economics.
That math held when the heavy users were researchers and developers running occasional exploratory sessions. It held less well as the product improved. It’s breaking now because agentic workloads are qualitatively different from chat workloads.
Chat sessions are bounded. A conversation ends. An agent doesn’t. A coding agent running a large refactor, a research agent processing a document corpus, a workflow agent handling asynchronous tasks, these consume tokens continuously, in the background, often without the user actively watching. The user experience of “I set it and forgot it” maps to “the plan ran at maximum utilization all night.”
This is the structural shift the SemiAnalysis research captures. It’s not that power users have always been expensive, it’s that the definition of a power user just expanded to include anyone running production agentic workflows on a flat-fee plan.
Who This Affects
How Agentic Workloads Changed the Flat-Fee Model
The Enterprise Billing Connection
Two weeks into token billing, enterprise AI teams reported hitting costs they hadn’t budgeted for. That brief documented the buyer’s experience: surprise invoices, usage patterns that didn’t match the models teams had built, procurement teams asking whether token billing was actually cheaper than the flat-fee enterprise plans they’d left.
The SemiAnalysis research is the explanation. If enterprise teams on $200/month consumer plans were subsidized at 40-70x, their usage patterns were never priced at the real cost. When they moved to token billing, where actual consumption is priced, the gap materialized as sticker shock. The shock isn’t a billing error. It’s the first time these teams saw what their agentic workflows actually cost.
Inference costs have been declining significantly, which means the API-equivalent value gap is narrowing over time from the supply side. But it’s narrowing slowly. The SemiAnalysis research reflects current pricing. Inference cost compression over the next 12-18 months will change the arithmetic, but it won’t eliminate the structural tension between flat-fee and consumption-based pricing.
Why the Labs Can’t Just Fix This
The obvious response, raise the $200 price or introduce hard caps, runs into a competitive lock-in problem.
OpenAI and Anthropic are fighting for developer mindshare and enterprise adoption simultaneously. The $200 flat-fee subscription is a competitive instrument, not just a product. It keeps the most sophisticated users, the ones building agentic workflows, the ones recommending tools to their engineering teams, the ones who influence enterprise procurement, on the platform and off the competitor’s.
Cutting the subsidy by raising prices or gating model access risks exactly the users who drive developer ecosystem stickiness. SemiAnalysis infers that the economic pressure may push labs toward restricting their most compute-intensive future models to API access only, the logic being that new models are introduced at API pricing while older, cheaper-to-serve models continue on flat-fee subscriptions. No lab has announced this. It’s an analytical projection. But it’s the most logical structural resolution to the math.
What This Means for OpenAI’s $34B Spend
The OpenAI financial disclosure provides the macro frame. OpenAI’s leaked FY2025 audited financials show approximately $34 billion in total expenditure against $13 billion in revenue. The $19 billion R&D line is the model development cost. The $6 billion sales, marketing, and G&A line is the enterprise growth cost. What the financials don’t break out, and what the SemiAnalysis research helps contextualize, is the ongoing cost of serving existing subscribers at prices set before agentic workloads existed.
What to Watch
Evidence
The subscription subsidy isn’t a line item in the financials. It’s embedded in the cost of revenue. As agentic adoption grows, that embedded cost grows with it, unless the pricing model changes.
What to Watch
Two signals will confirm whether the labs are moving to address the math.
First: any changes to the $200 tier structure. Rate limits introduced, model access tiers created, agentic usage priced separately from conversational usage. These would be direct responses to the SemiAnalysis dynamic, even if the companies don’t say so.
Second: the OpenAI S-1 cost of revenue breakdown. If the S-1 separates subscription cost of revenue from API cost of revenue, investors will be able to see the subsidy directly. If it doesn’t, that’s a disclosure choice worth noting.
TJS synthesis
The $200 flat-fee subscription made sense when the heavy user was a developer running occasional experimental sessions. It doesn’t hold the same way when the heavy user is an enterprise team running production agentic pipelines 24 hours a day. SemiAnalysis quantified the ceiling; the enterprise billing data confirmed the floor is rising. The labs are caught between competitive necessity and unit economics: cutting the subsidy risks losing the power users who drive platform adoption, but absorbing it indefinitely isn’t viable either. The most likely resolution isn’t a price increase, it’s a structural bifurcation where agentic and API workloads move to consumption pricing while casual chat stays flat-fee. Watch for the first lab to announce that split before year-end. The one that does it first sets the new market norm.