What Is DeepSeek? The Chinese AI That Shocked the World
On January 27, 2025, DeepSeek's R1 model knocked ChatGPT from the top spot on the US iOS App Store. That same day, Nvidia's stock fell roughly 17%, erasing nearly $600 billion in market value, the largest single-company decline in US stock market history. s1
A Chinese AI company barely eighteen months old, staffed by roughly 160 people, had just rattled the entire Western AI industry. This breakdown explains what DeepSeek is, what its models can do, why it caused such disruption, and what the controversies mean for anyone considering using it.
What Is DeepSeek?
DeepSeek is a Chinese AI research company founded on July 17, 2023, in Hangzhou, Zhejiang, China. Its founder and CEO, Liang Wenfeng, also co-founded High-Flyer Capital — a quantitative hedge fund that owns DeepSeek outright. s1
The company operates with roughly 160 employees as of 2025, a fraction of the headcount at OpenAI, Google DeepMind, or Anthropic. Its stated philosophy is maximizing intelligence per unit of compute, a constraint born not of choice but of US chip export restrictions that limit its access to cutting-edge Nvidia GPUs.
That constraint became DeepSeek's competitive advantage. For a breakdown of how other frontier AI providers approach compute, see the AI Tools Hub.
Why DeepSeek Matters
The prevailing assumption in Western AI labs through 2024 was that frontier AI required massive compute: hundreds of millions of dollars per training run. DeepSeek V3, released December 2024, was trained for a total of $5.576M (pre-training $5.328M, context extension $0.24M, fine-tuning $0.01M), using 2.788 million H800 GPU hours at approximately $2 per hour. GPT-4's training has been estimated at ~$100M, roughly 18 times DeepSeek's figure. s1
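The headline number is easy to verify from the report's own figures: total GPU-hours times the assumed rental rate reproduces the total exactly.

```python
# Sanity check on the reported V3 training cost: total H800 GPU-hours
# times the assumed $2/hour rental rate should reproduce the $5.576M figure.
gpu_hours = 2_788_000          # total H800 GPU-hours reported for V3
rate_usd_per_hour = 2.00       # approximate rental rate assumed in the report
print(f"${gpu_hours * rate_usd_per_hour:,.0f}")   # -> $5,576,000
```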
The January 2025 Disruption
DeepSeek R1 launched January 20, 2025. The accompanying free consumer app spread rapidly. By January 26, venture capitalist Marc Andreessen publicly called it "AI's Sputnik moment." By January 27, DeepSeek had surpassed ChatGPT as the most-downloaded app on the US iOS App Store. s1
The market reaction was swift. Nvidia fell roughly 17% in a single session, a nearly $600 billion market cap loss that set a record for the largest single-company stock decline in US history. The concern: if frontier AI could be trained at a fraction of the cost on less powerful chips, the premium commanded by Nvidia's H100 and H200 GPUs might not be sustainable. s1
Why R1 Was the Catalyst
R1 is a reasoning model: it generates an explicit chain of thought, visible to users, and was trained with Group Relative Policy Optimization (GRPO) reinforcement learning. Its benchmark scores matched or exceeded OpenAI's o1: 97.3% on MATH-500 and a 2,029 Elo rating on Codeforces, outperforming 96.3% of human competitors. s2
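GRPO's core trick is scoring each sampled answer against the rest of its own group rather than against a learned critic. A minimal sketch of that advantage computation, with illustrative names and a toy reward, not DeepSeek's actual training code:

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray) -> np.ndarray:
    """Group-relative advantage: each completion is scored against the
    mean and spread of its own sampling group, so no separate value
    network (critic) is needed."""
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Toy example: four sampled completions for one math prompt,
# rewarded 1.0 if the final answer is correct, 0.0 otherwise.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))
# Correct completions get positive advantage and are reinforced;
# below-average ones get negative advantage and are discouraged.
```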
The R1 research paper was also the first major LLM paper to undergo formal peer review and be published in Nature — a meaningful marker of scientific rigor in a field where most AI research appears as unreviewed preprints. s2
Model Lineup: V3, R1, and V4
DeepSeek has released three major model families. One anticipated model — R2 — has not been released as of April 2026. Any source citing R2 as a released product is incorrect.
DeepSeek V3 (December 2024)
V3 uses a Mixture-of-Experts (MoE) architecture: 671 billion total parameters, with only 37 billion active per token at inference. This sparsity is why V3 runs far cheaper than dense models of comparable capability. Training cost: $5.576M total. License: MIT. Training data: closed. s1
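To make the total-versus-active distinction concrete, here is a toy top-k MoE layer. The expert count, dimensions, and k are illustrative and far smaller than V3's real configuration:

```python
import torch

def moe_forward(x, gate, experts, k=2):
    """Toy Mixture-of-Experts layer. Each token is routed to its top-k
    experts, so only a small fraction of total parameters runs per token."""
    scores = torch.softmax(gate(x), dim=-1)      # (n_tokens, n_experts)
    weights, indices = scores.topk(k, dim=-1)    # keep top-k experts per token
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):                  # plain loop, kept simple for clarity
        for w, e in zip(weights[t], indices[t]):
            out[t] += w * experts[int(e)](x[t])  # run only the chosen experts
    return out

d_model, n_experts = 64, 16
gate = torch.nn.Linear(d_model, n_experts)
experts = torch.nn.ModuleList(torch.nn.Linear(d_model, d_model) for _ in range(n_experts))
tokens = torch.randn(5, d_model)
print(moe_forward(tokens, gate, experts).shape)  # torch.Size([5, 64])
```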
DeepSeek R1 (January 2025)
R1 is a reasoning model built on V3-base. Where most LLMs produce answers directly, R1 generates a visible chain-of-thought before its final response. The R1-specific fine-tuning cost was $294K on top of the V3 base. Peer-reviewed in Nature. License: MIT. s2
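That visible reasoning surfaces as its own field in API responses. A minimal sketch against DeepSeek's OpenAI-compatible endpoint; the base URL, model name, and `reasoning_content` field match DeepSeek's public API docs at the time of writing, but treat them as assumptions to verify:

```python
from openai import OpenAI

# DeepSeek serves an OpenAI-compatible API; the reasoner model returns
# its chain-of-thought separately from the final answer.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-reasoner",  # R1-series reasoning model
    messages=[{"role": "user", "content": "Is 9,973 prime?"}],
)
message = resp.choices[0].message
print(message.reasoning_content)  # the visible chain-of-thought
print(message.content)            # the final answer
```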
DeepSeek V4 (April 24, 2026 — Preview)
V4 is a preview release — not general availability. Two variants:
- V4-Pro: 1.6 trillion total parameters, 49B active per token, 1M token context window
- V4-Flash: 284B total parameters, 13B active per token, 1M token context window
Both use a hybrid attention architecture that delivers the 1M-token context window at a fraction of the compute cost of standard full attention. License: MIT. s1
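The sparsity ratios follow directly from the specs above; keeping the active fraction in the low single digits is what holds inference cost down at this scale:

```python
# Active-parameter fraction per token for the two V4 variants,
# computed from the parameter counts listed above.
for name, total_b, active_b in [("V4-Pro", 1600, 49), ("V4-Flash", 284, 13)]:
    print(f"{name}: {active_b}B of {total_b}B = {active_b / total_b:.1%} active per token")
# V4-Pro: ~3.1% active; V4-Flash: ~4.6% active.
```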
Open-Weight, Not Open-Source
DeepSeek models are frequently described as "open-source." That description is imprecise by the current industry standard.
DeepSeek releases model weights under the MIT License, which permits full commercial use and modification. However, DeepSeek does not release its training data or detailed information about it. The OSI Open Source AI Definition 1.0 (published October 2024) requires that level of training-data transparency as a condition of the "open-source AI" label. DeepSeek's models do not meet this standard. s1
The correct term is open-weight. This distinction matters: an open-weight model can be self-hosted and fine-tuned on your own data. But you cannot reproduce DeepSeek's training from scratch, audit the full data pipeline, or verify what was included in the training corpus.
For most use cases, open-weight is sufficient. For regulated or high-trust environments, the absence of training data transparency is a meaningful gap.
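What self-hosting looks like in practice: a minimal sketch pulling one of the smaller MIT-licensed distilled R1 checkpoints with Hugging Face transformers. The repo ID is an assumption; check the Hub for the exact name and the hardware each checkpoint requires:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Open-weight means the checkpoint itself can be downloaded and run
# locally, with no API dependency and no data leaving your machine.
repo = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Explain why the square root of 2 is irrational."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```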
Pricing
DeepSeek's pricing sits dramatically below Western AI API providers. All figures verified April 2026.
| Model / Provider | Input (per 1M tokens) | Output (per 1M tokens) | vs V4-Flash |
|---|---|---|---|
| DeepSeek V4-Flash | $0.14 | $0.28 | Baseline |
| DeepSeek V4-Pro | ~$0.30-$0.50 | ~$0.50-$0.87 | 2-3x |
| Gemini 3.1 Pro | $2.00 | $12.00 | 14-43x |
| Claude Opus 4.7 | $5.00 | $25.00 | 35-90x |
| GPT-5.5 | $5.00 | $30.00 | 35-100x |
Chat interface: Free — no subscription required. s1
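Per-token gaps compound quickly at scale. A quick cost comparison for a hypothetical workload of 10M input and 2M output tokens per month, using the table's figures:

```python
# Monthly API cost for an assumed workload, priced from the table above.
# Prices are USD per 1M tokens: (input, output).
prices = {
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Gemini 3.1 Pro":    (2.00, 12.00),
    "Claude Opus 4.7":   (5.00, 25.00),
    "GPT-5.5":           (5.00, 30.00),
}
input_m, output_m = 10, 2  # millions of tokens per month (assumed workload)
for name, (p_in, p_out) in prices.items():
    print(f"{name}: ${input_m * p_in + output_m * p_out:,.2f}/month")
# V4-Flash comes to $1.96/month vs $110.00 for GPT-5.5 on this mix,
# a roughly 56x difference.
```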
Controversies
DeepSeek's rise has been accompanied by serious documented concerns. Any organization considering deployment should understand these clearly before proceeding.
CCP Censorship Built In
DeepSeek's models include censorship aligned with Chinese Communist Party content restrictions. Topics suppressed include the 1989 Tiananmen Square massacre, Falun Gong, Taiwan sovereignty, Tibet, Uyghur treatment, and Hong Kong. The model's internal chain-of-thought — visible in R1 — shows explicit "do not discuss" instructions for these subjects. This censorship is architectural, not configurable via system prompt. s1
Security Vulnerabilities Under Specific Conditions
CrowdStrike security researchers found that when DeepSeek R1 receives politically sensitive trigger words (Tibet, Falun Gong, and similar terms), the likelihood of generating code with severe security vulnerabilities increases by up to 50%. This is not a general vulnerability — it activates specifically when those trigger words appear in prompts. Development teams processing user-generated text that may include such terms should evaluate this risk. s3
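One starting point for that evaluation is a pre-flight screen on prompts bound for code generation. A deliberately naive sketch; the term list and handling policy are illustrative, and keyword matching alone is not a complete mitigation:

```python
# Naive pre-flight screen for prompts headed to a code-generation model.
# The term list is illustrative and far from exhaustive; CrowdStrike's
# finding covers a broader set of politically sensitive triggers.
TRIGGER_TERMS = {"tibet", "falun gong", "uyghur", "tiananmen"}

def needs_review(prompt: str) -> bool:
    """Flag prompts containing documented trigger terms so they can be
    routed to a different model or queued for extra code review."""
    lowered = prompt.lower()
    return any(term in lowered for term in TRIGGER_TERMS)

print(needs_review("Build a booking site for tours in Tibet"))   # True
print(needs_review("Build a booking site for mountain tours"))   # False
```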
GDPR Ban in Italy
Italy's data protection authority, the Garante, banned DeepSeek's service over GDPR violations. Specific concerns: lack of transparency about data practices, and evidence that user data, including chat history, device details, and typing patterns, is transmitted to servers in China via China Mobile, a company designated by the US government as a Chinese Military Company. The ban remains in effect as of April 2026. s5
Anthropic Distillation Report (February 23, 2026)
Anthropic published research documenting what it described as an industrial-scale campaign to extract training data from Claude. According to Anthropic's report, DeepSeek generated more than 150,000 exchanges with Claude through approximately 24,000 fraudulent accounts — systematically capturing chain-of-thought responses as training data. s4
Export Control Investigation
DeepSeek is under active investigation by the US Department of Commerce for allegedly acquiring restricted Nvidia Blackwell chips through Singapore intermediaries. Three individuals, including a Chinese national, have been charged in Singapore with the illegal export of Nvidia chips to DeepSeek. No final ruling has been issued as of April 2026. s1
Who Should Use DeepSeek?
Strong Fit
- API-cost-sensitive developers — V4-Flash's $0.14/$0.28 pricing undercuts GPT-5.5 by 35-100x for comparable output quality at scale
- AI researchers and academics — open-weight access, MIT License, and a peer-reviewed R1 paper make DeepSeek unusually accessible for study
- Local deployment users — self-hosted models with no API dependency or data routing to external servers
- Non-censorship-sensitive use cases — coding, data analysis, writing tasks that do not touch suppressed topics
Poor Fit
- GDPR-regulated organizations — Italy's ban and data routing through China Mobile make deployment a compliance risk without explicit legal clearance
- Applications requiring uncensored historical or geopolitical content — CCP censorship is architectural and cannot be overridden
- Security-sensitive applications processing political or government-adjacent text (CrowdStrike trigger finding)
- Organizations with strict AI supply chain policies — ongoing export control investigation and distillation controversy introduce reputational and legal risk
- Teams requiring full data lineage — training data not released; provenance cannot be audited
For a detailed comparison of DeepSeek against Western AI alternatives, visit the DeepSeek hub.