Open Source AI News: Cosmos 3 Is Official, Nvidia Confirms MoT Architecture, Not MoE, for Its Open-Weights World Model

June 3, 2026 | 2 min read | NVIDIA Cosmos Lab | Partial | Strong

Tech Jacks Solutions AI News Coverage

Nvidia officially launched Cosmos 3 at GTC Taipei on June 1 and confirmed that the open-weights world foundation model, previously reported as an MoE release, actually uses a Mixture-of-Transformers architecture, a material correction to earlier coverage. The model processes text, images, video, audio, and action trajectories within a unified architecture and ships under the OpenMDW-1.1 license on Hugging Face.


Alpamayo downloads (per Nvidia): 400,000+

Key Takeaways

Cosmos 3 uses a Mixture-of-Transformers (MoT) architecture, not MoE as previously reported; this is a material correction with practical implications for developers evaluating the model
The model is omnimodal across text, image, video, audio, and action trajectories within a single architecture, confirmed from Nvidia's live Cosmos Lab page
Open weights are available now on Hugging Face, via NIM microservices, and on GitHub under OpenMDW-1.1 (Linux Foundation); verify license terms before production use, and don't assume Apache 2.0 equivalence
Leaderboard rankings (Artificial Analysis #1 open-source T2I/I2V; RoboArena #1 policy model) are vendor-attributed, not independently SVR-verified; Epoch AI evaluation pending

Model Release

Cosmos 3 (Super / Nano)

Organization: NVIDIA
Type: World Model
Parameters: Not disclosed
Benchmark: [VENDOR-ATTRIBUTED] Artificial Analysis: #1 open-source T2I/I2V | RoboArena: #1 policy model
Availability: Open weights, Hugging Face, NIM microservices, GitHub (OpenMDW-1.1 license)

Architecture Correction: Prior Brief vs. Official Confirmation

Prior reporting (2026-06-02): Reported as Mixture of Experts (MoE) architecture
→
Official confirmation (2026-06-03): Confirmed as Mixture-of-Transformers (MoT), a distinct architecture with different scaling and compute trade-offs

The earlier brief had it wrong. The model announced at Computex and GTC Taipei, and reported here as an open-weights Nvidia MoE release, uses a Mixture-of-Transformers (MoT) architecture, not MoE. That’s not a branding distinction. The two approaches have different scaling behaviors, different computational trade-offs, and different practical implications for developers evaluating the model for physical AI applications. Developers who read the prior reporting should update their notes.
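To make the distinction concrete, here is a minimal sketch. It assumes "Mixture-of-Transformers" follows the published approach of that name, in which each modality gets its own set of transformer weights while attention still runs over the whole mixed sequence. Nvidia's announcement as quoted in this brief does not describe Cosmos 3's internals, so this is an illustration of the general idea, not the model's implementation. The names `Token`, `moe_route`, and `mot_route`, and the toy gate weights, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    modality: str          # e.g. "text", "video", "action"
    features: List[float]  # toy stand-in for a hidden state


def moe_route(token: Token, gate_weights: List[List[float]]) -> int:
    """MoE-style top-1 routing: a gate scores every expert for each token.

    The gate weights stand in for learned parameters. The token's modality
    is never consulted, so two text tokens can land on different experts.
    """
    scores = [
        sum(w * x for w, x in zip(row, token.features))
        for row in gate_weights
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def mot_route(token: Token, modality_to_params: Dict[str, int]) -> int:
    """MoT-style selection (assumed): the parameter set is fixed by modality.

    There is no gate to learn or balance; the modality tag alone decides
    which weights process the token.
    """
    return modality_to_params[token.modality]


# Toy inputs: two text tokens with different features, one video token.
tokens = [
    Token("text", [0.9, 0.2]),
    Token("text", [0.2, 0.9]),
    Token("video", [0.9, 0.2]),
]
gate = [[1.0, 0.0], [0.0, 1.0]]        # hypothetical 2-expert gate
param_sets = {"text": 0, "video": 1}   # hypothetical modality mapping

moe_choices = [moe_route(t, gate) for t in tokens]
mot_choices = [mot_route(t, param_sets) for t in tokens]
print("MoE expert per token:   ", moe_choices)
print("MoT param set per token:", mot_choices)
```

Under these assumptions, the MoE gate splits the two text tokens across experts based on their features, while the MoT mapping sends both to the same text parameters. That is one reason the compute profile differs: with modality-based selection, which weights are active follows from the modality mix of the input rather than from a gate's per-token decisions.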

Nvidia’s Cosmos Lab page, confirmed live and current, states it directly: “Cosmos 3 connects understanding, generation, simulation, and action through a shared omnimodal world model that moves fluidly across text, images, video, audio, and actions.” Five modalities. One shared architecture. That’s the core technical claim, and it’s verified from the source page.

Two model variants ship: Cosmos 3 Super, designed for high-capacity world simulation, and Cosmos 3 Nano, designed for lightweight policy execution on edge and robotic hardware. Both are available as open weights on Hugging Face, via NVIDIA NIM microservices, and through GitHub. The license is OpenMDW-1.1, administered by the Linux Foundation.

Verification

Partial: Nvidia Cosmos Lab page (SVR-verified); leaderboard claims vendor-attributed only. Leaderboard rankings (Artificial Analysis, RoboArena) are not independently SVR-verified. Epoch AI evaluation pending. OpenMDW-1.1 license terms should be verified at Hugging Face before enterprise deployment.

Leaderboard claims require the standard vendor attribution. According to Nvidia, Cosmos 3 ranks first among open-source models on the Artificial Analysis leaderboard for text-to-image and image-to-video generation, and first on RoboArena for policy model performance. Those rankings haven’t been independently verified through the SVR pipeline; they’re Nvidia’s characterization of third-party leaderboard data. Epoch AI evaluation is pending.

The Cosmos Coalition is the organizational wrapper: Agile Robots, Black Forest Labs, Generalist, LTX, Runway, and Skild AI are named as founding partners in Nvidia’s announcement. According to Nvidia, its Alpamayo reasoning models have surpassed 400,000 downloads.

Unanswered Questions

What are the specific commercial use restrictions and attribution requirements under OpenMDW-1.1 vs. Apache 2.0 or Llama-class licenses?
What are the inference compute requirements for Cosmos 3 Super vs. Nano at production scale, and what does that mean for edge robotics deployment?
Do the Artificial Analysis and RoboArena rankings reflect evaluations on the final released weights, or earlier checkpoint versions?

The catch for enterprise physical AI teams: OpenMDW-1.1 is a newer license from the Linux Foundation, and its specific commercial use terms, attribution requirements, and modification rights need to be confirmed directly at the Hugging Face repository before production deployment. Don’t assume it’s Apache 2.0 equivalent. Verify the terms against your organization’s open-source policy before committing to a pipeline that depends on it.
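One way to enforce that check is a small policy gate that runs before any model is pulled into a pipeline. The sketch below is illustrative only: the `license:<id>` tag shape, the lowercase identifier `openmdw-1.1`, the two policy sets, and the function names are assumptions, not values taken from the Cosmos 3 repository or from any organization's real policy.

```python
from typing import Iterable, Optional

# Hypothetical policy: licenses already cleared for production use.
APPROVED_LICENSES = {"apache-2.0", "mit"}
# Hypothetical policy: licenses that must go through legal review first.
NEEDS_REVIEW = {"openmdw-1.1"}


def license_from_tags(tags: Iterable[str]) -> Optional[str]:
    """Return the license id from tags shaped like 'license:<id>', else None."""
    for tag in tags:
        if tag.lower().startswith("license:"):
            return tag.split(":", 1)[1].strip().lower()
    return None


def deployment_decision(tags: Iterable[str]) -> str:
    """Map a model's metadata tags to an allow / hold / block decision."""
    license_id = license_from_tags(tags)
    if license_id is None:
        return "block: no license declared"
    if license_id in APPROVED_LICENSES:
        return "allow"
    if license_id in NEEDS_REVIEW:
        return "hold: legal review required"
    return "block: license not in policy"


# Example with made-up metadata tags.
print(deployment_decision(["robotics", "license:openmdw-1.1"]))
```

The gate deliberately treats an unfamiliar license as a hold or a block rather than mapping it onto a familiar one, which matches the advice above not to assume Apache 2.0 equivalence.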

Cosmos 3 is the most capable open-weights physical AI model currently available for download, if Nvidia’s leaderboard characterizations hold up to independent review. For robotics teams locked into proprietary simulation platforms, the Nano variant in particular is worth a close look. The open-weights and active-community signals here are stronger than for most prior Nvidia model releases.