What Is DeepSeek? The Chinese AI That Shocked the World

Prices and benchmarks verified April 2026

  • $5.576M: V3 official training cost, vs ~$100M estimated for GPT-4
  • $0.14 / $0.28: V4-Flash per million tokens (input/output), 35-100x cheaper than GPT-5.5
  • #1 App: US iOS App Store on January 27, 2025, surpassing ChatGPT within days
  • MIT: open-weight license on all major models, commercial use permitted

On January 27, 2025, DeepSeek's R1 model knocked ChatGPT from the top spot on the US iOS App Store. That same day, Nvidia's stock dropped 18% — a $600 billion market cap loss that stands as the largest single-company decline in US stock market history. s1

A Chinese AI company barely eighteen months old, staffed by roughly 160 people, had just rattled the entire Western AI industry. This breakdown explains what DeepSeek is, what its models can do, why it caused such disruption, and what the controversies mean for anyone considering using it.


What Is DeepSeek?

DeepSeek is a Chinese AI research company founded on July 17, 2023, in Hangzhou, Zhejiang, China. Its founder and CEO, Liang Wenfeng, also co-founded High-Flyer Capital — a quantitative hedge fund that owns DeepSeek outright. s1

The company operates with roughly 160 employees as of 2025, a fraction of the headcount at OpenAI, Google DeepMind, or Anthropic. Its guiding principle is maximizing intelligence per unit of compute, a constraint born less of philosophy than of US chip export restrictions that limit its access to cutting-edge Nvidia GPUs.

That constraint became DeepSeek's competitive advantage. For a breakdown of how other frontier AI providers approach compute, see the AI Tools Hub.

Why DeepSeek Matters

The prevailing assumption in Western AI labs through 2024 was that frontier AI required massive compute — hundreds of millions of dollars per training run. DeepSeek V3, released December 2024, was trained for a total of $5.576M (pre-training $5.328M, context extension $0.24M, fine-tuning $0.01M), using 2.788 million H800 GPU hours at approximately $2 per hour. GPT-4's training has been estimated at ~$100M; V3 cost roughly 95% less. s1

Editorial note: The $5.576M figure covers only the official V3 training run as disclosed by DeepSeek. It excludes prior research, ablation experiments, and infrastructure costs. It is a vendor-disclosed figure.
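The disclosed figure is straightforward to sanity-check: it is the GPU-hour count multiplied by a market rental rate. A minimal check in Python using only the numbers from DeepSeek's disclosure (the $0.24M extension figure is rounded, so the components sum to approximately the total):

```python
# Vendor-disclosed V3 training cost (s1): GPU-hours times rental rate
gpu_hours = 2_788_000                  # H800 GPU-hours across all phases
total = gpu_hours * 2.00               # ~$2 per GPU-hour -> $5,576,000

breakdown_musd = {
    "pre-training": 5.328,             # $M, as disclosed
    "context extension": 0.24,         # rounded in the disclosure
    "fine-tuning": 0.01,
}
assert total == 5_576_000
assert abs(sum(breakdown_musd.values()) - 5.576) < 0.005
```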

The January 2025 Disruption

DeepSeek R1 launched January 20, 2025. The accompanying free consumer app spread rapidly. By January 26, venture capitalist Marc Andreessen publicly called it "AI's Sputnik moment." By January 27, DeepSeek had surpassed ChatGPT as the most-downloaded app on the US iOS App Store. s1

The market reaction was swift. Nvidia fell 18% in a single session — a $600 billion market cap loss that set a record for the largest single-company stock decline in US history. The concern: if frontier AI could be trained at a fraction of the cost on less powerful chips, the premium commanded by Nvidia's H100 and H200 GPUs might no longer be justified. s1

Why R1 Was the Catalyst

R1 is a reasoning model whose chain-of-thought is visible to users, trained with Group Relative Policy Optimization (GRPO) reinforcement learning. Its benchmark scores matched or exceeded OpenAI's o1: 97.3% on MATH-500 and a 2,029 Elo rating on Codeforces, outperforming 96.3% of human competitors. s2
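GRPO's core idea is to score each sampled answer against the other answers drawn for the same prompt, removing the need for a separate value network. A minimal sketch of that group-relative advantage, assuming the normalized form described in the R1 paper; the full objective adds a PPO-style clipped ratio and a KL penalty, omitted here:

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantage: each completion's reward is normalized
    against the mean and std of its own sampling group, so 'better than
    the group' is positive and 'worse than the group' is negative."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Four completions sampled for one prompt, scored 1 (correct) or 0 (wrong)
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # [ 1., -1., -1.,  1.]
```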

The R1 research paper was also the first major LLM paper to undergo formal peer review and be published in Nature — a meaningful marker of scientific rigor in a field where most AI research appears as unreviewed preprints. s2

DeepSeek: Key Milestones
JULY 17, 2023
DeepSeek Founded
Liang Wenfeng establishes DeepSeek in Hangzhou, China, as a subsidiary of High-Flyer Capital hedge fund. 160 employees.
DECEMBER 2024
DeepSeek V3 Released
671B parameter MoE model trained for $5.576M — roughly 95% less than GPT-4's estimated training cost. MIT License.
JANUARY 20, 2025
R1 Launch + "Sputnik Moment"
R1 reasoning model released. Free app goes live. Marc Andreessen calls it "AI's Sputnik moment" Jan 26. DeepSeek hits #1 US App Store Jan 27. Nvidia loses $600B in market cap.
MAY 2025
R1-0528 Update (Not R2)
Incremental R1 update ships. R2 delayed due to Huawei Ascend hardware instability and slow data labeling. R2 remains unreleased as of April 2026.
APRIL 24, 2026
V4 Preview Released
V4-Pro (1.6T parameters, 1M context) and V4-Flash (284B parameters) launch as preview. MIT License. Not a full GA release.

Model Lineup: V3, R1, and V4

DeepSeek has released three major model families. One anticipated model — R2 — has not been released as of April 2026. Any source citing R2 as a released product is incorrect.

DeepSeek V3 (December 2024)

V3 uses a Mixture-of-Experts (MoE) architecture: 671 billion total parameters, with only 37 billion active per token at inference. This is why V3 runs far cheaper than equivalently-performing dense models. Training cost: $5.576M total. License: MIT. Training data: closed. s1
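The economics follow directly from sparse activation: a router selects a few experts per token, so inference cost scales with the active parameters (37B), not the total (671B). A toy illustration of top-k expert routing; the layer sizes and expert count here are arbitrary, and real MoE layers add load-balancing machinery not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, router_w, k=2):
    """Top-k Mixture-of-Experts routing: score all experts, run only the
    k highest-scoring ones, and mix their outputs with softmax gates.
    Compute per token scales with k, not with the total expert count."""
    scores = x @ router_w                          # one logit per expert
    top = np.argsort(scores)[-k:]                  # indices of selected experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                           # softmax over selected experts
    return sum(g * experts[i](x) for g, i in zip(gates, top))

# Toy scale: 16 experts, 2 active per token (V3: 671B total, 37B active)
experts = [lambda x, W=rng.normal(size=(64, 64)) / 8: x @ W for _ in range(16)]
x = rng.normal(size=64)
y = moe_forward(x, experts, router_w=rng.normal(size=(64, 16)))
```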

DeepSeek R1 (January 2025)

R1 is a reasoning model built on V3-base. Where most LLMs produce answers directly, R1 generates a visible chain-of-thought before its final response. The R1-specific fine-tuning cost was $294K on top of the V3 base. Peer-reviewed in Nature. License: MIT. s2
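Because the reasoning is emitted in-band, downstream code typically separates it from the final answer. A small parsing sketch, assuming the <think>...</think> delimiter convention used by the open-weight R1 release; hosted APIs may instead return the reasoning as a separate response field, so treat the tag format as an assumption to verify:

```python
import re

def split_reasoning(raw: str):
    """Split an R1-style completion into (chain_of_thought, final_answer).
    Returns (None, text) when no reasoning block is present."""
    m = re.match(r"\s*<think>(.*?)</think>(.*)", raw, flags=re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return None, raw.strip()

cot, answer = split_reasoning("<think>Check both cases first.</think>Yes: both hold.")
```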

DeepSeek V4 (April 24, 2026 — Preview)

V4 is a preview release — not general availability. Two variants:

  • V4-Pro: 1.6 trillion total parameters, 49B active per token, 1M token context window
  • V4-Flash: 284B total parameters, 13B active per token, 1M token context window

Both use a hybrid attention architecture that achieves the 1M context window at a fraction of normal compute cost. License: MIT. s1

R2 status: DeepSeek R2 has not been released as of April 2026. Delays stem from hardware instability with Huawei Ascend chips and data labeling slowdowns. The R1-0528 update (May 2025) was an incremental improvement to R1, not a new generation.

Open-Weight, Not Open-Source

DeepSeek models are frequently described as "open-source." That description is imprecise by the current industry standard.

DeepSeek releases model weights under the MIT License, which permits full commercial use and modification. However, DeepSeek does not release its training data. The OSI Open Source AI Definition 1.0 (published October 2024) requires training data access as a condition of the "open-source AI" label. DeepSeek's models do not meet this standard. s1

The correct term is open-weight. This distinction matters: an open-weight model can be self-hosted and fine-tuned on your own data. But you cannot reproduce DeepSeek's training from scratch, audit the full data pipeline, or verify what was included in the training corpus.

For most use cases, open-weight is sufficient. For regulated or high-trust environments, the absence of training data transparency is a meaningful gap.


Pricing

Note: Before architecting around DeepSeek's API pricing, review the Controversies section — the API routes data through Chinese servers, which has compliance implications for EU/regulated organizations.

DeepSeek's pricing sits dramatically below Western AI API providers. All figures verified April 2026.

Model / Provider  | Input (per 1M tokens) | Output (per 1M tokens) | vs V4-Flash
DeepSeek V4-Flash | $0.14                 | $0.28                  | Baseline
DeepSeek V4-Pro   | ~$0.30-$0.50          | ~$0.50-$0.87           | 2-3x
Gemini 3.1 Pro    | $2.00                 | $12.00                 | 14-43x
Claude Opus 4.7   | $5.00                 | $25.00                 | 35-90x
GPT-5.5           | $5.00                 | $30.00                 | 35-100x

Chat interface: Free — no subscription required. s1
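For capacity planning, the per-token spread matters more than the headline multiples. A quick cost comparison using the list prices in the table above (workload sizes are illustrative; re-verify prices before relying on them):

```python
# ($ per 1M input tokens, $ per 1M output tokens), April 2026 list prices
PRICES = {
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Gemini 3.1 Pro":    (2.00, 12.00),
    "Claude Opus 4.7":   (5.00, 25.00),
    "GPT-5.5":           (5.00, 30.00),
}

def workload_cost(input_mtok: float, output_mtok: float, model: str) -> float:
    pin, pout = PRICES[model]
    return input_mtok * pin + output_mtok * pout

# Example month: 500M input tokens, 100M output tokens
for model in PRICES:
    print(f"{model:>18}: ${workload_cost(500, 100, model):>9,.2f}")
# V4-Flash lands at $98 vs $5,500 for GPT-5.5 on this mix (about 56x)
```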


Benchmark Performance

Note: Benchmark comparisons reflect scores at the time of each model's release. The competitive landscape for AI benchmarks shifts frequently — check LMSYS Chatbot Arena and Papers With Code for current standings.
DeepSeek V4-Pro vs Frontier Models
Scores verified April 2026 — source: s1

  • SWE-bench Verified (software engineering): DeepSeek V4-Pro 80.6%, Claude Opus 4.6 80.8%, GPT-5.4 79.2%
  • Terminal Bench 2.0: GPT-5.4 75.1%, DeepSeek V4-Pro 67.9%, Claude Opus 4.6 65.4%
  • Additional DeepSeek benchmarks: R1 on MATH-500 97.3%; V4-Pro Codeforces rating 3,106 (Elite)

Controversies

DeepSeek's rise has been accompanied by serious documented concerns. Any organization considering deployment should understand these clearly before proceeding.

CCP Censorship Built In

DeepSeek's models include censorship aligned with Chinese Communist Party content restrictions. Topics suppressed include the 1989 Tiananmen Square massacre, Falun Gong, Taiwan sovereignty, Tibet, Uyghur treatment, and Hong Kong. The model's internal chain-of-thought — visible in R1 — shows explicit "do not discuss" instructions for these subjects. This censorship is architectural, not configurable via system prompt. s1

Security Vulnerabilities Under Specific Conditions

CrowdStrike security researchers found that when DeepSeek R1 receives politically sensitive trigger words (Tibet, Falun Gong, and similar terms), the likelihood of generating code with severe security vulnerabilities increases by up to 50%. This is not a general vulnerability — it activates specifically when those trigger words appear in prompts. Development teams processing user-generated text that may include such terms should evaluate this risk. s3

GDPR Ban in Italy

Italy's data protection authority, the Garante, banned DeepSeek R1 over GDPR violations. Specific concerns: lack of transparency about data practices, and evidence that user data — including chat history, device details, and typing patterns — is transmitted to servers in China via China Mobile, a company designated by the US government as a Chinese Military Company. The ban remains in effect as of April 2026. s5

Cloud API vs. self-hosted distinction: The Italian ban targeted DeepSeek's cloud service — not the model weights themselves. Organizations in the EU that self-host DeepSeek weights on their own infrastructure avoid the data-to-China-server concern, but must still conduct their own GDPR Data Protection Impact Assessment (DPIA) for AI systems processing personal data. Self-hosting eliminates the cloud data transfer issue but does not automatically make a deployment GDPR-compliant.

Anthropic Distillation Report (February 23, 2026)

Anthropic published research documenting what it described as an industrial-scale campaign to extract training data from Claude. According to Anthropic's report, DeepSeek generated more than 150,000 exchanges with Claude through approximately 24,000 fraudulent accounts — systematically capturing chain-of-thought responses as training data. s4

Editorial note: The distillation figures (150K+ exchanges) are from Anthropic's own report — a primary source with an obvious interest in the outcome. OpenAI has made similar accusations. The underlying practice — training on competitor model outputs without authorization — violates terms of service and is documented across multiple sources.

Export Control Investigation

DeepSeek is under active investigation by the US Department of Commerce for allegedly acquiring restricted Nvidia Blackwell chips through Singapore intermediaries. Three individuals — including a Chinese national — have been charged in Singapore for illegal export of Nvidia chips to DeepSeek. No final ruling has been issued as of April 2026. s1

Key Limitations at a Glance
COMPLIANCE RISK
CCP Censorship — Architectural
Tiananmen, Taiwan, Tibet, Falun Gong, Hong Kong, and Uyghur topics are suppressed at the model level. Chain-of-thought logs show explicit "do not discuss" instructions. Cannot be removed via system prompt.
GDPR BAN
Italy Garante GDPR Ruling
R1 banned in Italy. User data (chat history, device details, typing patterns) transmitted to China via China Mobile — a US-designated Chinese Military Company. GDPR-regulated orgs require legal review before use.
SECURITY RESEARCH
CrowdStrike 50% Vulnerability Trigger
When R1 receives politically sensitive trigger words, likelihood of generating severely vulnerable code increases up to 50% (CrowdStrike). Not a general vulnerability — trigger-specific.
INVESTIGATION
Export Control & Distillation
US DOC investigation ongoing for acquiring restricted Nvidia Blackwell chips via Singapore. Three individuals charged in Singapore. Anthropic also documented 150K+ fraudulent Claude interactions used for training data.

Who Should Use DeepSeek?

How to Access DeepSeek
📂
Local Deployment
Download open-weight models (V3, R1) directly. Run on your own infrastructure. MIT License permits commercial use. V4-Flash (284B) is tractable for self-hosted setups with sufficient GPU resources.
🔗
DeepSeek API
V4-Flash at $0.14/$0.28 per million tokens. V4-Pro at ~$0.30-$0.87. Compatible with OpenAI-style API format (see the client sketch after this list). Lowest-cost frontier API option as of April 2026.
💬
Chat Interface
chat.deepseek.com — free, no subscription. Supports R1 chain-of-thought mode and standard chat. Data collection concerns apply (see Controversies).
👥
Third-Party Integrations
DeepSeek models available via multiple third-party providers, cloud platforms, and open-source tooling. Self-hosting via Ollama, LM Studio, and vLLM widely documented.
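Because the API follows the OpenAI-compatible chat-completions format, the standard OpenAI Python client can be pointed at DeepSeek's endpoint. A minimal sketch; the model identifier below is an assumption (check DeepSeek's current model list for the V4-era names), and the data-routing caveats in the Controversies section apply to any traffic sent this way:

```python
from openai import OpenAI

# OpenAI-compatible endpoint: only the base_url and key change.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",          # placeholder
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-chat",                    # assumed id; verify against current docs
    messages=[{"role": "user", "content": "One-sentence summary of MoE routing?"}],
)
print(resp.choices[0].message.content)
```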

Strong Fit

  • API-cost-sensitive developers — V4-Flash's $0.14/$0.28 pricing is 35-100x below GPT-5.5 for comparable output quality at scale
  • AI researchers and academics — open-weight access, MIT License, and a peer-reviewed R1 paper make DeepSeek unusually accessible for study
  • Local deployment users — self-hosted models with no API dependency or data routing to external servers
  • Non-censorship-sensitive use cases — coding, data analysis, writing tasks that do not touch suppressed topics

Poor Fit

  • GDPR-regulated organizations — Italy's ban and data routing through China Mobile make deployment a compliance risk without explicit legal clearance
  • Applications requiring uncensored historical or geopolitical content — CCP censorship is architectural and cannot be overridden
  • Security-sensitive applications processing political or government-adjacent text (CrowdStrike trigger finding)
  • Organizations with strict AI supply chain policies — ongoing export control investigation and distillation controversy introduce reputational and legal risk
  • Teams requiring full data lineage — training data not released; provenance cannot be audited

For a detailed comparison of DeepSeek against Western AI alternatives, visit the DeepSeek hub.


Frequently Asked Questions

DeepSeek FAQ

Is DeepSeek open-source?
DeepSeek's models are open-weight — weights are freely downloadable under the MIT License for commercial use. However, training data is not released. The OSI Open Source AI Definition 1.0 (October 2024) requires training data disclosure. DeepSeek does not qualify as fully open-source by this standard. The correct term is open-weight.

Is DeepSeek safe to use?
Context-dependent. DeepSeek has documented CCP censorship for politically sensitive topics, a CrowdStrike-identified security vulnerability that activates under specific trigger conditions, and GDPR data routing concerns. For general coding or writing in non-regulated contexts, practical risks are lower. For GDPR-regulated organizations or security-sensitive deployments, formal legal and security review is required before use.

Has DeepSeek R2 been released?
No. As of April 2026, DeepSeek R2 has not been released. Delays are attributed to hardware instability with Huawei Ascend chips and data labeling challenges. An incremental R1-0528 update shipped in May 2025 but is not a new model generation. Any source describing R2 as a released product is incorrect.

How much does DeepSeek cost compared to competitors?
DeepSeek V4-Flash is priced at $0.14 input / $0.28 output per million tokens. GPT-5.5 is $5/$30, Claude Opus 4.7 is $5/$25, and Gemini 3.1 Pro is $2/$12. V4-Flash is 35-100x cheaper than GPT-5.5 and 14-43x cheaper than Gemini 3.1 Pro. These are April 2026 figures.

Why did DeepSeek's rise crash Nvidia's stock?
The January 2025 R1 release demonstrated that a frontier-capable AI model could be trained and deployed at dramatically lower cost than Western labs' models — on less powerful hardware. If that approach scales, demand for Nvidia's premium AI chips could be lower than markets had assumed. Nvidia dropped 18% in a single session, losing approximately $600 billion in market cap — a US stock market record for single-company decline.

Can I run DeepSeek locally?
Yes. DeepSeek's open-weight models, including V3 and R1, can be downloaded and run on local infrastructure. V4-Pro at 1.6 trillion parameters requires substantial hardware. V4-Flash at 284B parameters is more tractable for self-hosted deployment. The MIT License permits commercial local use with no API dependency or data routing to DeepSeek's servers.

Video Resources
DeepSeek R1 Explained: How China Built a Frontier AI for $6M
Placeholder — verify URL before publish
DeepSeek V3 vs GPT-4: Full Benchmark Comparison
Placeholder — verify URL before publish
DeepSeek Controversies: Censorship, GDPR, and Export Controls
Placeholder — verify URL before publish
Before You Use AI
🔒 Your Privacy
DeepSeek collects chat history, device details, and typing patterns. Data is routed to servers in China. The free chat interface does not offer a GDPR-compliant opt-out path. Enterprise and local-deployment users can avoid cloud data routing. Verify your jurisdiction's requirements before use.
🧠 Mental Health & AI Dependency
If you're in crisis, please contact trained professionals — not AI. 988 Suicide & Crisis Lifeline: call or text 988. SAMHSA: 1-800-662-4357. Crisis Text Line: text HOME to 741741. AI tools are not substitutes for professional mental health support. See NIST AI Risk Framework.
⚖ Your Rights & Our Transparency
GDPR/CCPA: You have the right to access, correct, and delete your data. TechJack Solutions maintains editorial independence — no vendor has reviewed or approved this article. We may use affiliate links; these do not influence editorial judgment. This article may be relevant to EU AI Act compliance discussions.