NVIDIA Cosmos 3 Checkpoints Are Live: The Technical Report, License Terms, and What Developers Can Build Now

June 5, 2026 | 3 min read | NVIDIA Research | Partial | Moderate

Tech Jacks Solutions AI News Coverage

NVIDIA has moved Cosmos 3 from announcement to deployment: the full technical report is published, open-weight checkpoints are live on Hugging Face and GitHub, and the licensing is designed to permit commercial use, which means the question for developers shifts from "what is this?" to "can we build on it?"


Cosmos 3 Super parameters: 64B

Key Takeaways

Cosmos 3 open-weight checkpoints are now live on Hugging Face and GitHub, and the full technical report is published. The model is deployable in two sizes: Super (64B total: 32B Reasoner + 32B Generator) and Nano (8B for workstation-grade hardware).
The MoT architecture integrates vision-language reasoning, image generation, audio-visual generation, and robot policy into a single omnimodal model, as confirmed directly from NVIDIA's research page.
Licensing is described as OpenMDW-1.1 under a Linux Foundation framework permitting commercial customization, but terms require direct confirmation against official license documentation before commercial deployment.
Performance rankings (#1 open-source on Artificial Analysis and RoboArena) are vendor-stated from the technical report; no independent evaluation has been published.

Model Release

Cosmos 3 (Super / Nano)

Organization: NVIDIA

Type: World Model

Parameters: Super: 64B total (32B Reasoner + 32B Generator) | Nano: 8B

Benchmark: [SELF-REPORTED] Artificial Analysis: #1 open-source (text-to-image/image-to-video) | RoboArena: #1 open-source (policy models). Vendor-stated; no independent verification available.

Availability: Open weights, Hugging Face (nvidia/cosmos3) + GitHub (github.com/nvidia/cosmos)

Verification

Partial. Sources: NVIDIA Research page (primary, text confirmed); Hugging Face collection (secondary, URL resolves). MoT architecture confirmed from primary source text; license terms (OpenMDW-1.1/Linux Foundation) and benchmark rankings are vendor-stated; independent evaluation pending.

Checkpoints are live. That changes the conversation.

When NVIDIA announced Cosmos 3 on June 3, the story was architectural: a Mixture-of-Transformers (MoT) model that integrates vision-language reasoning, image generation, audio-visual generation, robot policy, forward dynamics, and inverse dynamics into a single omnimodal world model. No chaining multiple specialized models. One forward pass across modalities. The NVIDIA Research page confirms this directly: “Cosmos 3 connects understanding, generation, simulation, and action through a shared omnimodal world model that moves fluidly across text, images, video, audio, and actions.”

Now the weights are downloadable. The code is at github.com/nvidia/cosmos. The collection is on Hugging Face. That’s the difference between a research announcement and a deployment decision.

Cosmos 3 ships in two sizes. The Super configuration runs 64 billion total parameters: a 32B Reasoner and a 32B Generator working in tandem. The Nano is 8 billion parameters, sized for workstation-grade hardware. For robotics teams running inference on edge nodes, Nano is the relevant variant. For teams building simulation pipelines or training policy models in the cloud, Super is the configuration to evaluate.
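To make the sizing concrete, here is a back-of-the-envelope sketch of how much memory the weights alone would occupy at a few common numeric precisions. The precisions are assumptions for illustration; NVIDIA's published checkpoint formats aren't stated here, and the figures exclude activations, caches, and framework overhead, so real requirements will be higher.

```python
# Rough weight-only memory estimate. The dtype table is an assumption for
# illustration, not a statement about how the Cosmos 3 checkpoints ship.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts (in billions) as reported for the two configurations.
MODEL_SIZES_B = {"Super": 64, "Nano": 8}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Return weight storage in GB (1 GB = 1e9 bytes) for a given precision."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


if __name__ == "__main__":
    for name, size in MODEL_SIZES_B.items():
        for dtype in BYTES_PER_PARAM:
            print(f"{name:5s} {dtype:4s} {weight_memory_gb(size, dtype):7.1f} GB")
```

At 16-bit precision that works out to roughly 128 GB for Super and 16 GB for Nano before any runtime overhead, which is consistent with the workstation-versus-cloud split described above.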

Architecture approach: Cosmos 3 MoT vs. prior pipeline

Cosmos 3 (MoT unified): Single forward pass; text, image, video, audio, and action in one model

Prior pipeline approach: Chained specialized models; VLM + video gen + policy model run separately

The licensing matters and requires careful reading. NVIDIA describes the release as using the OpenMDW-1.1 License, which it characterizes as a Linux Foundation framework designed to permit commercial customization. That framing is plausible: the Linux Foundation’s Open Model, Data and Weights (OpenMDW) license is a known licensing framework. But the specific terms haven’t been confirmed from independent documentation in this reporting cycle. Verify the license directly before building commercial workflows on top of Cosmos 3 weights.

The part nobody mentions in the announcement materials: NVIDIA’s performance rankings (first among open-source models on Artificial Analysis for text-to-image and image-to-video, first on RoboArena for policy benchmarks) are vendor-stated claims from the technical report. Independent evaluation hasn’t been published. Don’t make deployment decisions based on those rankings until a third party reproduces them. The model’s architecture is confirmed and credible; the benchmark position has not yet been independently confirmed.

Where this lands in the competitive picture: Google’s Gemma 4 12B local agentic stack (covered by TJS earlier this week) targets a different problem: local, on-device agentic workflows with language capability. Cosmos 3 targets physical world understanding and robot policy. These aren’t competing for the same use case. But enterprise teams evaluating “open-source AI infrastructure” as a category need to distinguish between them clearly.

What to Watch

First independent Cosmos 3 evaluation on Artificial Analysis or equivalent: 4-8 weeks

Third-party RoboArena policy model benchmark reproduction: 4-8 weeks

Official OpenMDW-1.1 license documentation confirmation: Immediate, verify before commercial deployment


The first independent benchmark reproduction of Cosmos 3 on Artificial Analysis or a comparable platform will either confirm or complicate NVIDIA’s rankings claim. That’s the inflection point. For robotics teams, the RoboArena position matters most: watch for third-party policy model evaluations over the next four to eight weeks.

The TJS read: Cosmos 3 is now deployable, not just announced. If your team works in robotics, simulation, or multi-modal physical AI, download the checkpoints and run your own task-specific evaluation against Nano first. The license framing suggests commercial use is intended, but confirm the terms yourself before committing a production pipeline to the weights. Independent benchmarks aren’t here yet. Build a proof of concept; don’t build infrastructure.
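For teams that do run their own task-specific evaluation, a small proof of concept rarely has many trials, so it helps to report a success rate with an uncertainty range rather than a single number. The sketch below is one way to do that using a Wilson score interval. The `run_episode` callable is hypothetical: it stands in for whatever wrapper your team writes around the model and is not part of any Cosmos 3 API.

```python
import math
from typing import Callable, Iterable


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a success rate (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point drift at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


def evaluate(run_episode: Callable[[object], bool], tasks: Iterable[object]) -> dict:
    """Run each task once and summarize task success.

    run_episode is a hypothetical user-supplied function that executes one
    task with the model under test and returns True on success.
    """
    outcomes = [bool(run_episode(task)) for task in tasks]
    successes, trials = sum(outcomes), len(outcomes)
    low, high = wilson_interval(successes, trials)
    return {
        "successes": successes,
        "trials": trials,
        "rate": successes / trials,
        "ci_low": low,
        "ci_high": high,
    }


if __name__ == "__main__":
    # Stand-in policy for demonstration only: succeeds on tasks 0 through 17.
    summary = evaluate(lambda task: task < 18, range(20))
    print(summary)
```

With only 20 trials the interval around a 90% success rate is wide, which is a useful reminder that a proof of concept can justify further testing but not an infrastructure commitment.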