The open vs. closed AI debate has had a measurement problem. Not anymore.
Epoch AI published an independent quantitative analysis on May 29 measuring the capability gap between open-weight and closed-weight AI models using the Effective Capability Index (ECI), a methodology designed to compare models across multiple performance dimensions simultaneously, rather than on individual benchmark scores that vendors can cherry-pick.
The finding: open-weight models trail closed-weight SOTA by an average of 8 ECI points (90% confidence interval: 7 to 11 points). Translated to time, that gap corresponds to approximately 4 months of closed-lab capability development. The analysis was authored by Jack Edwards and Luke Emberson and is published under Creative Commons Attribution 4.0, so it’s citable and reproducible.
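To make the points-to-time translation concrete, here is a minimal back-of-the-envelope sketch. It assumes the closed frontier advances at a constant rate, inferred only from the pairing of an 8-point gap with roughly 4 months reported above. That linear rate and the helper name `gap_in_months` are illustrative assumptions, not Epoch's published method.

```python
# Illustrative arithmetic only. ASSUMPTION: the closed frontier gains ECI
# points at a constant rate, inferred from the article's pairing of an
# 8-point gap with roughly 4 months. Epoch's own conversion may differ.
REPORTED_GAP_POINTS = 8.0
REPORTED_GAP_MONTHS = 4.0
IMPLIED_POINTS_PER_MONTH = REPORTED_GAP_POINTS / REPORTED_GAP_MONTHS


def gap_in_months(eci_gap: float,
                  points_per_month: float = IMPLIED_POINTS_PER_MONTH) -> float:
    """Convert an ECI point gap to months under the constant-rate assumption."""
    if points_per_month <= 0:
        raise ValueError("points_per_month must be positive")
    return eci_gap / points_per_month


if __name__ == "__main__":
    # Point estimate plus the two ends of the reported 90% interval.
    for label, gap in [("low", 7.0), ("central", 8.0), ("high", 11.0)]:
        print(f"{label}: {gap:.0f} ECI points ~ {gap_in_months(gap):.1f} months")
```

Under that assumption, the 7-to-11-point interval would map to roughly 3.5 to 5.5 months. Treat that range as a rough reading of the uncertainty, not as a figure Epoch reports.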
The number that matters most for enterprise decision-making isn’t the 8-point gap. It’s what Epoch’s methodology reveals about how that gap gets measured. Single benchmarks (MMLU, HumanEval, GPQA, and their variants) can be optimized for. A frontier lab can train toward a specific benchmark without improving general capability. ECI aggregates across dimensions to produce a gap estimate that’s harder to game. The 90% confidence interval (7 to 11 points) is wide enough to be honest: it acknowledges uncertainty rather than offering false precision.
The analysis also carries a finding that Epoch frames as inferential but that deserves direct attention: the measured gap is likely understated, because closed labs keep their most advanced models private before release. The most capable closed-weight model publicly available may not be the most capable model in development. Open-weight models are compared against the published frontier. The gap to the unpublished frontier may be wider.
This is the real implication for enterprise teams evaluating open vs. closed model deployment. The 4-month figure describes the gap against what’s publicly known. The actual lag against what’s being developed, and what’s coming, may be longer.
The investment signal runs in both directions. For investors in open-source model companies (Mistral, Together AI, and others building on open-weight foundations), this quantification is a structural challenge. The capital concentration that allows closed labs to maintain compute-to-capability advantages, currently on vivid display in Anthropic’s $65B Series H, compounds the gap over time. Open-weight models catch up by being trained on better data and more compute. They’re structurally disadvantaged on the latter.
For enterprise developers, the practical implication is a deployment trade-off: how much capability difference is worth how much governance flexibility? Open-weight models offer data privacy, fine-tuning control, and deployment independence. Closed-weight models offer a capability lead that’s now independently measured at 4 months. Neither answer is universal. The measurement makes the trade-off explicit.
Analysis
The 4-month figure measures the gap against the published frontier. Epoch's inferential finding, that closed labs withhold their most capable models from public comparison, suggests the true lag may be longer. Enterprise developers should factor this uncertainty into deployment horizon planning.
Don’t expect the gap to close on its own. Epoch’s data shows it’s a function of the compute investment differential, and that differential is widening, not narrowing. The 4-month lag is a current measurement, not a ceiling.
Watch for Epoch’s next ECI update. If the gap widens in the following analysis period, that’s the indicator that closed-lab capital concentration is compounding faster than open-weight development can absorb. That’s the signal that changes the open vs. closed enterprise deployment calculus at scale.