
AI Models News: Zhipu AI Open-Sources 744B GLM-5.1 With Self-Reported Coding Benchmark Claims, Verification Pending

3 min read · Source: THUDM GitHub Repository · Verification: Partial
On approximately April 13, 2026, Zhipu AI released GLM-5.1, a Mixture-of-Experts model with 40B active parameters per forward pass, available under an MIT license via the THUDM GitHub repository. The company claims GLM-5.1 ranks first on SWE-Bench Pro, outperforming GPT-5.4 and Claude Opus 4.6; those benchmark figures are self-reported, and independent evaluation by Epoch AI is underway.

A 744B open-source model dropped this week. The benchmark claims are aggressive. The verification isn’t in yet.

On approximately April 13, 2026, Zhipu AI released GLM-5.1 through the THUDM GitHub organization, home of the research lab behind the GLM model family. The architecture is a Mixture-of-Experts design, with 40B active parameters engaged per forward pass against a reportedly much larger total parameter pool. That 40B active figure is confirmed via the THUDM repository's technical documentation. The total parameter count, cited widely as 744B, has not been separately confirmed against the repository's specifications; treat that figure as reported until verified.

The model is released under an MIT license, per the project repository. Developers can access it now through the THUDM GitHub organization. If you’re evaluating whether to run it, the open weights and permissive license are the most straightforward parts of the story. The benchmark claims require more scrutiny.

What Zhipu AI says about performance, and why the attribution matters. Zhipu AI's self-reported benchmarks place GLM-5.1 first on SWE-Bench Pro, ahead of both GPT-5.4 and Claude Opus 4.6 on that coding evaluation. That's a significant claim if it holds. It's also sourced, at least in part, to a third-party aggregator rather than Zhipu's own technical report, which means the claim hasn't been fully traced back to the vendor's primary documentation, let alone confirmed by an independent evaluator. Epoch AI has an evaluation underway. Until those results are published, every performance comparison in this brief should be read as Zhipu AI's claim, not a verified ranking.

This isn’t a reason to dismiss the release. A 744B MoE model under MIT license is a real development regardless of where the benchmarks ultimately land. But the gap between self-reported and independently verified performance on SWE-Bench Pro is worth holding open, particularly because benchmark claims for Chinese open-source models at this capability tier tend to attract significant scrutiny from the ML research community.

The MoE architecture in context. Mixture-of-Experts designs activate a subset of parameters per inference pass, which is why a nominally 744B model runs on 40B active parameters at any given moment. This makes large MoE models more computationally tractable than their total parameter count implies. It's the same architectural approach used in several recent frontier releases, including some from Western labs. The distinction matters for developers evaluating whether running GLM-5.1 is practically feasible on their infrastructure: the active parameter count, not the total, is the relevant figure for most deployment decisions.
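GLM-5.1's actual routing mechanism is not described in this brief, so the following is only an illustrative sketch of generic top-k MoE gating (all names, shapes, and the expert count are hypothetical). It shows why per-token compute scales with the number of active experts rather than the total: only k of the n expert matrices are ever multiplied for a given token.

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, k=2):
    """Route one token vector x through the top-k of n experts.

    Illustrative only: a real MoE layer batches tokens and adds
    load-balancing losses, but the core idea is the same -- only
    k expert matmuls run per token, so active (not total)
    parameters drive inference cost.
    """
    logits = x @ gate_weights              # gating scores, shape (n_experts,)
    top = np.argsort(logits)[-k:]          # indices of the k highest-scoring experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                   # softmax over the selected experts only
    out = np.zeros_like(x)
    for idx, p in zip(top, probs):
        out += p * (x @ expert_weights[idx])  # only k matmuls actually execute
    return out, top

# Toy dimensions (hypothetical, not GLM-5.1's real sizes).
rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
experts = rng.normal(size=(n_experts, d, d))
gates = rng.normal(size=(d, n_experts))
y, active = moe_forward(x, experts, gates, k=2)
print(y.shape, len(active))  # (8,) 2
```

In this toy setup, 16 experts exist but only 2 run per token, mirroring (at miniature scale) how a model with a 744B nominal parameter pool can engage roughly 40B per forward pass.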

Why this matters for the open-source AI narrative. GLM-5.1 arrives at a moment when the question of whether open-source AI remains competitive with closed frontier models is genuinely contested. Some labs have pulled back from open-weight releases; others have moved toward proprietary architectures even from historically open-source-friendly organizations. A 744B model under MIT license from a Chinese research lab, claiming top-tier coding benchmark performance, is a data point in that conversation, one that cuts against the narrative that open-source AI is falling behind. Whether the data point holds up is what the Epoch evaluation will determine.

What to watch. Epoch AI’s independent evaluation is the key development to track. If results confirm Zhipu’s SWE-Bench Pro claim, this becomes a significant story about the competitive position of open-source Chinese AI models. If results diverge from the self-reported benchmark, that outcome is itself newsworthy, and will test how the hub’s verification-first framing serves readers compared to outlets that ran the claim as fact. The hub will update this brief when Epoch AI’s evaluation is published.

TJS synthesis. Developers who need a large, permissive open-source model for coding tasks have a new option this week. The access story is clean: MIT license, GitHub, available now. The performance story requires patience. Zhipu AI's SWE-Bench Pro claim is ambitious, single-source, and not yet independently verified. That combination doesn't make GLM-5.1 less interesting; it makes independent evaluation more consequential. Watch for Epoch AI's results before treating the benchmark ranking as a settled fact.

Editorial note: This brief reflects Zhipu AI’s self-reported benchmarks. Independent evaluation by Epoch AI is in progress. TJS will update this brief when results are available.
