DeepSeek released its V4 series on April 24, 2026. The lineup includes three variants: V4 Pro, V4 Flash, and V4 Pro Max. According to DeepSeek’s own positioning, the Pro Max variant is designed for high-complexity reasoning tasks and autonomous workflow execution. V4 Flash targets latency-sensitive inference applications. The release was covered by AP News and The Information, among other outlets.
Every benchmark figure in this brief reflects DeepSeek’s internal evaluation only.
DeepSeek describes V4 Pro Max as outperforming Gemini 3.0-Pro on reasoning benchmarks, according to the company’s own evaluation. DeepSeek has also referenced GPT-5.4 in competitive positioning statements. Neither comparison has been independently verified. Epoch AI’s evaluation of the V4 series is pending as of publication. Until independent evaluation is available, enterprise teams should treat all capability claims as self-reported. That’s not a dismissal of the release; it’s the correct baseline for any major model launch where vendor benchmarks precede third-party review.
DeepSeek describes V4 as featuring agentic capabilities for autonomous workflow execution. This is the company’s stated positioning. Whether V4 Pro Max’s agentic performance holds up on independent task-completion benchmarks, of the kind the hub has covered in the context of Anthropic’s agent research, remains to be seen. The hub’s prior coverage of model evaluation methodology explains why the gap between vendor benchmarks and third-party evaluation matters in adoption decisions.
The market context is relevant. DeepSeek has positioned V4 as cost-competitive with Western frontier models, though no independent cost analysis is available as of publication. But the cost-efficiency framing arrives on the same day Meta Platforms reported a workforce reduction partly attributed in reporting to AI infrastructure spending, making the cost curve argument anything but abstract. If V4’s cost-efficiency claims hold under independent scrutiny, the compute investment thesis underpinning major technology employers’ spending decisions becomes more contested.
Availability details (whether V4 ships as open weights, API-only, or through other deployment channels) have not been independently confirmed as of publication.
For enterprise AI teams, the right question isn’t whether to adopt DeepSeek V4 today. It’s what evidence threshold you require before adopting any frontier model from a non-Western lab, and whether your current evaluation framework accounts for the gap between self-reported benchmarks and independent assessment. The Epoch AI evaluation, when it arrives, will be the relevant data point. Until then, V4’s benchmarks are a claim, not a result.