Two weeks after the Grok Build CLI beta drew attention for its plan-review-approve loop, xAI brought the model to the open API as a public beta on May 29, 2026. Developers who want to evaluate it don’t need the CLI anymore. They can reach it through xAI’s API directly, and reportedly through OpenRouter and Vercel AI Gateway as well, though those platform listings haven’t been independently confirmed in this reporting cycle.
What’s confirmed: the Grok Build program exists and targets agentic software engineering tasks. The API public beta is the next step after the CLI beta covered here in May. Developer testing is active. What isn’t confirmed: the specific technical specifications xAI is claiming for the model.
According to xAI’s announcement materials, the model supports a 256,000-token context window. xAI states it’s optimized for Model Context Protocol server integration and multi-step tool calling. Early API monitoring reportedly recorded average throughput above 100 tokens per second, though that figure hasn’t been independently confirmed. The primary xAI blog post for this release returned a 404 at the time of this report; readers should check x.ai/blog directly for current documentation.
Disputed Claim
Pricing is where the picture gets murkier. xAI reportedly priced the API at $1.00 per million input tokens and $2.00 per million output tokens, consistent with competitive coding model tiers, but this hasn’t been confirmed from an accessible primary source. Developer reviews report the Grok Build CLI requires a SuperGrok Heavy subscription at approximately $299 per month, though that figure also hasn’t been confirmed on xAI’s official pricing page. Don’t build a budget on either number until you’ve verified it directly.
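To see what the reported rates would mean for a given workload, the arithmetic can be sketched as below. This is a minimal illustration, not xAI’s billing logic: the two prices are the unconfirmed figures quoted above, and the function name and the monthly token volumes are hypothetical placeholders to replace with your own numbers.

```python
from decimal import Decimal

# Reported but UNCONFIRMED prices, in USD per million tokens.
# Replace these once xAI's official pricing page can be checked.
REPORTED_INPUT_PER_M = Decimal("1.00")
REPORTED_OUTPUT_PER_M = Decimal("2.00")

MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens,
                  input_per_m=REPORTED_INPUT_PER_M,
                  output_per_m=REPORTED_OUTPUT_PER_M):
    """Return the estimated USD cost for the given token counts."""
    input_cost = Decimal(input_tokens) / MILLION * input_per_m
    output_cost = Decimal(output_tokens) / MILLION * output_per_m
    return input_cost + output_cost


# Hypothetical monthly volume: 200M input tokens, 40M output tokens.
monthly = estimate_cost(200_000_000, 40_000_000)
print(f"${monthly:.2f}")  # $280.00 under the reported prices
```

Because both rates are passed as parameters, the same function can be rerun with corrected prices once a primary source confirms them.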
The broader context: Grok Build entered a crowded field. Claude Code, OpenAI’s Codex CLI, and Google’s Gemini-based coding tools are all active in the agentic coding space with varying pricing structures and independent benchmark coverage. What Grok Build 0.1 doesn’t have yet, and what matters most for enterprise evaluation, is independent benchmark data. No SWE-bench results have been confirmed. No Epoch AI evaluation is listed. The model’s performance claims come from xAI alone.
Don’t expect that gap to close quickly. Independent evaluations of coding-specialized models take time, and xAI has historically been selective about third-party access. The throughput claim is interesting if it holds up: 100+ tokens per second is competitive for an agentic coding context. But it requires confirmation before it influences a deployment decision.
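Teams that would rather check the throughput figure themselves can time a streamed response and divide tokens received by elapsed seconds. The sketch below assumes a hypothetical `stream_fn` that wraps whatever client you use and yields the token count of each streamed chunk. It makes no calls to xAI’s API, and nothing here reflects how the reported figure was measured.

```python
import time
from typing import Callable, Iterable


def measure_throughput(stream_fn: Callable[[], Iterable[int]],
                       clock: Callable[[], float] = time.perf_counter) -> float:
    """Return end-to-end tokens per second for one streamed response.

    stream_fn is a hypothetical wrapper: it starts a request and yields
    the number of tokens in each chunk as the chunk arrives.
    """
    start = clock()
    total_tokens = 0
    for chunk_tokens in stream_fn():
        total_tokens += chunk_tokens
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return total_tokens / elapsed


# Usage with a stand-in stream and a fake clock (0.0 s at start, 2.0 s at end).
ticks = iter([0.0, 2.0])
rate = measure_throughput(lambda: [100, 150], clock=lambda: next(ticks))
print(rate)  # 125.0
```

The timer starts before the request is issued, so the result includes time to first token. That makes it a conservative end-to-end rate rather than a pure generation rate, and it should be averaged over many requests from your actual workload before comparing it with the 100 tokens per second claim.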
What to Watch
Watch whether OpenRouter’s model listing for grok-build-0.1 surfaces independent latency and throughput data from aggregate API logs. If xAI’s blog post is restored and confirms the technical specifications, those specs become actionable. The trigger for serious enterprise evaluation is a confirmed SWE-bench score from an independent evaluator. Absent that, this is a CLI-to-API expansion worth monitoring, not a deployment decision point.
TJS synthesis: Grok Build 0.1 is on the API. That’s real. The performance and pricing claims around it aren’t confirmed from accessible sources yet, which means the honest recommendation is: access the public beta, run your own workload tests against your actual use case, and hold budget decisions until either xAI restores its documentation or an independent benchmark appears. The agentic coding market rewards tools that can demonstrate performance on real software tasks, not announcement-day claims.