Grok by xAI

What Is Grok 4.3? xAI's Flagship Model, Tiers, and the Controversies (2026)

Q: Is the Grok multi-agent system available through the API?

No. The four-agent and 16-agent Heavy architecture is currently a consumer feature; xAI lists the API version as coming soon. API users get a single-model experience, which means xAI's reported hallucination reduction from 12% to 4.2% may not apply to API integrations.

Last verified: June 9, 2026 · Format: Breakdown

4.3

Grok 4.3 Beta is xAI's current flagship, released April 17, 2026

Source: xAI announcements

$1.25

Per 1M input tokens for Grok 4.3 via API ($2.50 output)

Source: xAI API pricing, verified June 2026

Grok 4.3 context window (4.1 Fast and 4.20 reach 2M)

Source: xAI documentation

$8 to $300

Consumer subscription range, from X Premium to SuperGrok Heavy

Source: xAI pricing, verified June 2026

Dec 2025

Knowledge cutoff for Grok 4.3 (real-time X data extends beyond it)

Source: xAI model card

xAI ships models faster than almost anyone can write about them. Grok 4.3 Beta arrived on April 17, 2026, the same day xAI announced it was being acquired by SpaceX. By the time you read this, a Grok 4.4 or 4.5 may already be in the wild, and Grok 5 is reportedly in training on a 10-trillion-parameter target. That pace is the single most important thing to understand about Grok: the version number matters, the marketing does not, and most of the benchmark numbers you will see online describe an older model than the one you are paying for.

This breakdown covers what Grok 4.3 actually is, the full model lineup behind it, what each tier costs, where it performs well, and the safety and bias controversies that any honest assessment has to include. We have flagged xAI-reported figures separately from independent ones throughout, because the gap between the two is often the story.

What Is Grok 4.3

Grok 4.3 Beta is the current flagship large language model from xAI, Elon Musk's AI company. It is the model that the recommended API alias points to and the default that most paying users now hit. It launched on April 17, 2026 with a December 2025 knowledge cutoff, an improved architecture, native video input, document generation for PDFs, spreadsheets and slides, and better tool-calling than the versions before it.

Grok as a product line is built around one structural advantage no competitor can copy: native, real-time access to the X platform (formerly Twitter), which xAI acquired in March 2025. When you ask Grok about something that happened in the last hour, it draws from the live X firehose rather than from stale indexed pages. For breaking-news and social-sentiment work, that is a genuine differentiator. For everything else, it is one model among several frontier options, and not always the strongest.

The "Beta" label is not decoration. xAI ships fast and labels aggressively, and Grok 4.3 sits in a lineup where four named versions coexist and a fifth is in training. Understanding which version you are actually using, and which version a given statistic refers to, is the first practical skill for evaluating Grok. For where it fits in the wider field, see the AI Tools Hub and the Grok sub-hub.

The Model Lineup

Grok is not a single model. It is a family, and xAI keeps several versions live at once with overlapping capabilities and different price points. The headline flagship is Grok 4.3, but the cheapest and the largest-context options sit on different version numbers. Here is the current state of the lineup as of June 2026.

Model	Released	Context	Position
Grok 4.3 Beta	Apr 2026	1M	Current flagship, recommended alias
Grok 4.20 Beta	Feb 2026	2M	Low-hallucination focus
Grok 4.1 / 4.1 Fast	Nov 2025	2M	Cheapest western frontier input
Grok 4	Jul 2025	256K	Original multi-agent model
Grok 3	Feb 2025	retired	Retired May 15, 2026, now redirects to 4.3

Two things stand out. First, the flagship Grok 4.3 has a smaller context window (1M tokens) than the older Grok 4.20 and 4.1 Fast (2M each). If your work depends on feeding in very long documents, the newest model is not automatically the right one. Second, xAI retires versions quickly: Grok 3 was pulled in May 2026 and its traffic now redirects to 4.3. Anything you read citing Grok 3 benchmarks is describing a model you can no longer directly select.

Behind all of this is Grok 5, reported to be in training on the Colossus 2 supercomputer with a 10-trillion-parameter target and a projected arrival in the second or third quarter of 2026. xAI says it trains seven models at once. Treat any Grok 5 specification as a target, not a shipped fact.

Capabilities

Grok's feature set blends a multi-agent reasoning system, real-time data, and a growing set of multimodal and agentic tools. These are the capabilities that distinguish it from ChatGPT, Claude and Gemini in day-to-day use as of June 2026.

Multi-Agent Reasoning

For demanding queries, Grok deploys multiple specialized agents (named Grok, Harper, Benjamin and Lucas) that work in parallel and cross-check each other before producing a final answer. The "Heavy" configuration scales this to 16 agents. xAI reports that this peer-review approach cut its hallucination rate from roughly 12% to 4.2%. That figure is xAI's own, not independently audited, and it applies to the multi-agent consumer experience. One important caveat: this multi-agent architecture is not yet available through the API, where xAI lists it as "coming soon." API users get a single-model experience.

Real-Time X and DeepSearch

Grok reads the live X firehose, which gives it access to breaking events and social sentiment that index-based models cannot reach. DeepSearch builds on this by running iterative, cited research passes across the web and X for multi-source questions. This is the capability that most clearly has no equivalent among competitors.

Multimodal and Generative Tools

Grok 4.3 adds native video input, so you can hand it video content directly for analysis. On the generation side, the Aurora model produces images and Grok Imagine produces short video clips (roughly 10 seconds, 720p HD, with audio). On the document side, the agentic tooling can run Python, browse the web, and generate downloadable PDFs, spreadsheets and slide decks from a conversation.

12% to 4.2%

Hallucination rate reduction xAI attributes to its multi-agent peer-review system (xAI-reported, consumer experience, not API)

Source: xAI, verified June 2026

Grok Model Evolution

Feb 2025

Grok 3 + DeepSearch

Introduced cited research and Think-style reasoning. Retired May 2026.

Jul 2025

Grok 4: Multi-Agent

First multi-agent collaborative model. 256K context. 75% SWE-bench Verified (independent).

Nov 2025

Grok 4.1 / 4.1 Fast

2M context. Grok 4.1 Fast becomes the cheapest western frontier input at $0.20/M.

Feb 2026

Grok 4.20 Beta

2M context, low-hallucination focus. Record 78% on the AA Omniscience benchmark (independent).

Apr 2026

Grok 4.3 Beta

Current flagship. Native video input, document generation, better tool-calling. 1M context.

Tiers and Pricing

Grok is sold three ways: free and paid consumer tiers, a team-oriented Business plan, and a usage-based API. The consumer tiers bundle Grok inside the X platform and the standalone Grok apps; the API is OpenAI-compatible, so migrating an existing integration usually means changing only the base URL and key.

Free

$0 / month

Basic Grok 4 / 4.1 access
About 10 prompts every 2 hours
Limited DeepSearch and image generation
No credit card required

API Pricing (per 1M tokens)

The API is where Grok's cost story is most competitive. Grok 4.1 Fast is the cheapest input among western frontier models, while the flagship 4.3 sits in the middle and the original Grok 4 is the most expensive of the three.

Model	Input	Output	Context
Grok 4.3	$1.25	$2.50	1M
Grok 4.1 Fast	$0.20	$0.50	2M
Grok 4	$3.00	$15.00	256K

Grok 4 cached input is $0.75/M. Prompt caching is automatic, the batch API offers 50% off async workloads, and live search costs $25 per 1,000 sources. Note that the multi-agent architecture is not available via API. API pricing verified June 9, 2026. For the full tier-by-tier breakdown, see our Grok pricing guide.

Benchmarks

Two warnings before any number. First, most public Grok benchmark scores describe Grok 4 or Grok 4.20, not the 4.3 flagship; independent 4.3-specific scores are still thin. Second, several headline figures come from xAI's own evaluations rather than third-party studies. We label each source below.

Read this first: Where a row says "independent," the score comes from an outside evaluator such as Artificial Analysis. Where it says "xAI," the figure is self-reported. Competitor scores in the source data reference Claude 4.7 and an earlier Gemini, so treat cross-model gaps as directional rather than exact.

Grok Benchmarks in Context

AA Omniscience (factual reliability)

Grok 4.20: 78% (independent)

SWE-bench Verified (coding)

Grok 4: 75.0% (independent) Claude 4.7: 87.6%

GPQA Diamond (graduate science)

Grok 4: ~89% (independent) GPT-5.5: 92.8 / Gemini 3.1: 94.3

Humanity's Last Exam

Grok 4 Heavy: 44 to 50% (xAI)

Independent leaders: Grok 4.20 holds the AA Omniscience record at 78%. Claude 4.7 leads SWE-bench Verified at 87.6%. Gemini 3.1 leads GPQA Diamond at 94.3%. xAI-reported: Grok 4 Heavy 44 to 50% on HLE, 15.9% on ARC-AGI-2 (a record at its launch), and Grok 3 Think 93.3% on AIME 2025. Most figures predate Grok 4.3. Sources: Artificial Analysis (independent), xAI (self-reported). Verified June 2026.

The one place Grok genuinely leads is factual reliability. The 78% AA Omniscience score for Grok 4.20 is an independent record, and it measures the model's willingness to admit uncertainty rather than fabricate. For coding and peak reasoning, though, Grok trails: Claude leads SWE-bench Verified and the latest GPT and Gemini score higher on graduate-level science. If you want a current head-to-head, see Grok vs Gemini and Grok vs Claude.

Who Should Use It

Grok 4.3 is a strong fit for a specific set of users and a weak fit for others. The decision usually comes down to whether you need real-time social data, the cheapest possible API input, or maximum factual caution, versus peak coding and reasoning.

Who Gets the Most Value

📡

Social and News Analysts

Native X access gives Grok live trending topics, public sentiment and breaking news that index-based models cannot reach. For real-time social work, it is effectively the only frontier option.

Best fit: SuperGrok / X Premium+

⚙️

Cost-Conscious API Developers

Grok 4.1 Fast at $0.20/M input with a 2M context window is the cheapest western frontier input. The OpenAI-compatible API makes migration straightforward. Note the multi-agent system is not available over the API.

Best fit: API (Grok 4.1 Fast)

🔬

Reliability-Sensitive Teams

If a confidently wrong answer is expensive in your workflow, Grok 4.20's record 78% Omniscience score (independent) and the multi-agent peer review make it worth testing against your own data.

Best fit: SuperGrok Heavy / Business

🏢

Risk-Managed Enterprises

Grok Business at $30/seat offers SOC 2 controls and no-training-on-your-data by default. But the documented content-safety and bias issues below mean a formal risk review should precede any deployment.

Best fit: Grok Business (with review)

The Controversies

An honest "what is" page cannot stop at features and pricing. Grok carries a documented record of safety, content and bias problems that have drawn regulatory attention. We summarize them here factually and proportionately, with the reasons they matter for anyone choosing a tool.

Documented Concerns

Nonconsensual Deepfakes

From late 2025, deliberately thin image-generation guardrails (including a "Spicy" mode built on the Flux.1 model) were reported to enable the creation of nonconsensual sexualized deepfakes of real people, including minors. The reporting prompted broad criticism, an EU regulatory crackdown, and public calls to ban X in some jurisdictions. This is the most serious item on the list and the one most likely to carry legal and reputational risk.

Political Bias and Sycophancy

Independent reviewers have found that Grok tends to mirror Elon Musk's views, leans right on contested political questions, and at times flatters Musk in unrelated responses. System prompts have been modified in ways that affected the model's political stances. For any use where neutrality matters, this is a material concern.

Grokipedia Content Quality

Grokipedia, xAI's AI-generated encyclopedia, has been reported to promote debunked conspiracy theories (including HIV/AIDS denial, vaccine-autism claims, climate denial and race-IQ pseudoscience) and to rely on low-credibility sources. Treat its entries as unverified rather than authoritative.

Factual Reliability Lapses

Despite its strong Omniscience score, Grok still fails some basic logic puzzles and has, on occasion, treated jokes or false stories circulating on X as breaking news. Real-time data is a strength, but it also means the model can amplify whatever is trending, true or not.

None of this erases Grok's genuine strengths, and xAI has reversed or patched several individual incidents after public pressure. But the pattern is the point: the same design choices that make Grok fast, blunt and real-time also make it more prone to these failures than its more cautious competitors. Buyers should weigh that trade-off deliberately rather than assume it away.

Limitations

Beyond the controversies, Grok 4.3 has practical limitations worth knowing before you commit. These are about capability and product gaps rather than conduct.

Practical Limitations

Not the Strongest at Coding or Reasoning

Claude leads SWE-bench Verified at 87.6% versus Grok 4's 75.0%, and the latest GPT and Gemini score higher on GPQA Diamond. Those competitor figures reference Claude 4.7 and an earlier Gemini (the current flagships are Claude Opus 4.8 and Gemini 3.1 Pro), so read the gaps as directional rather than exact. The direction is consistent, though: for peak coding and advanced reasoning, Grok is not the default choice despite its reliability edge.

Multi-Agent Not on the API

The headline four-agent (and 16-agent Heavy) architecture is consumer-only; xAI lists the API version as "coming soon." If you build on the API today, you get a single-model experience, so the hallucination-reduction figures may not apply to your integration.

Heavy Mode Is Expensive

The full 16-agent Heavy experience sits behind the $300/month SuperGrok Heavy tier. Analysts have described it as overkill for most users, roughly ten times the price of SuperGrok without ten times the value for everyday work.

Flagship Has a Smaller Context Window

Grok 4.3 tops out at 1M tokens, while the older Grok 4.20 and 4.1 Fast reach 2M. If long-document context is your priority, the newest model is not the largest, and you may need to mix versions.

Frequently Asked Questions

What is Grok 4.3?

Grok 4.3 Beta is xAI's current flagship large language model, released April 17, 2026 with a December 2025 knowledge cutoff. It adds native video input, document generation for PDFs, spreadsheets and slides, improved tool-calling, and a 1M token context window. It is the model the recommended API alias points to.

How much does Grok cost?

There is a free tier. Paid consumer plans run from X Premium at $8/month and X Premium+ at $40/month (one source lists $50, so confirm on the official page) through SuperGrok at $30/month and SuperGrok Heavy at $300/month. Grok Business is $30/seat/month. Via API, Grok 4.3 costs $1.25 per 1M input tokens and $2.50 output. Prices verified June 9, 2026.

Is Grok 4.3 better than Grok 4.20?

It depends on the task. Grok 4.3 is newer and adds video input, document generation and better tool-calling. But Grok 4.20 has a larger 2M context window (versus 1M for 4.3) and holds the independent AA Omniscience factual-reliability record at 78%. For very long documents or maximum factual caution, the older 4.20 may still be the better pick.

Why is Grok controversial?

Grok has a documented record of content-safety and bias problems: reported nonconsensual sexualized deepfakes (including of minors) that drew an EU regulatory crackdown, political bias that mirrors Elon Musk's views, and a Grokipedia encyclopedia that has promoted debunked conspiracy theories using low-credibility sources. xAI has reversed some incidents after public pressure, but the pattern warrants a formal risk review before enterprise use.

Is the Grok multi-agent system available through the API?

No. The four-agent (and 16-agent Heavy) architecture is currently a consumer feature; xAI lists the API version as "coming soon." API users get a single-model experience, which means xAI's reported hallucination reduction from 12% to 4.2% may not apply to API integrations.

Video Resources

▶

Grok 4.3 Explained: What Changed

YouTube Search

▶

Grok Pricing and Tiers Walkthrough

YouTube Search

▶

Grok Safety and Bias Controversies

YouTube Search

Go Deeper

Resources from across Tech Jacks Solutions

What Is Agentic AI?

Understand the multi-agent architecture behind tools like Grok

Prompt Engineering Library

Prompting techniques that get better results from any AI

FREEAI Governance Charter

Set the principles that govern risky AI deployments

AI Glossary

Definitions for AI terms used in this article

Fact-checked against vendor documentation and official sources, June 2026. Verify current pricing at x.ai before purchasing.

Grok, Grokipedia, Aurora, and xAI are trademarks of xAI Corp. X is a trademark of X Corp. SpaceX and Starlink are trademarks of Space Exploration Technologies Corp. Tesla is a trademark of Tesla, Inc. Claude is a trademark of Anthropic. Gemini is a trademark of Google LLC. ChatGPT is a trademark of OpenAI. This article is editorially independent and not affiliated with or endorsed by xAI.

Gallery

Contacts

What Is Grok 4.3? xAI's Flagship Model, Tiers, and the Controversies (2026)

What Is Grok 4.3

The Model Lineup

Capabilities

Multi-Agent Reasoning

Real-Time X and DeepSearch

Multimodal and Generative Tools

Tiers and Pricing

API Pricing (per 1M tokens)

Benchmarks

Who Should Use It

The Controversies

Limitations

Go Deeper

Services

Learn

Company