DeepSeek V4 Launches: Near-Frontier AI at a Fraction of the Cost

DeepSeek releases V4 Pro and Flash, open-source AI models with 1M context windows that rival GPT-5.5 and Claude at a fraction of the price.

Quick answer

DeepSeek released V4 Pro and V4 Flash on April 24, 2026 — two open-source AI models that approach the performance of GPT-5.5 and Claude Opus on coding and reasoning benchmarks while costing up to 99% less. Both models support a 1 million token context window and are available under the MIT license.

Chinese AI startup DeepSeek has released V4 Pro and V4 Flash, two open-source models that come close to matching the best AI systems from OpenAI and Anthropic at a fraction of the cost. The release, a little over a year after DeepSeek first shook Silicon Valley with its original breakthrough, signals that the gap between open-source and proprietary AI continues to narrow.

What DeepSeek Released

V4 arrives in two flavours. V4 Pro is the flagship: a 1.6 trillion parameter mixture-of-experts model with 49 billion active parameters at any given time. V4 Flash is the lightweight option at 284 billion total parameters and 13 billion active, designed for speed and cost efficiency.
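To put the mixture-of-experts figures in proportion, the short Python sketch below works out what share of each model’s parameters is active at any given time. It uses only the counts quoted above; the function and variable names are illustrative and not part of any DeepSeek tooling.

```python
# Parameter counts in billions, taken from the figures quoted in this article.
MODELS = {
    "V4 Pro": {"total_b": 1600, "active_b": 49},
    "V4 Flash": {"total_b": 284, "active_b": 13},
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the percentage of parameters that are active at any given time."""
    return 100.0 * active_b / total_b


for name, counts in MODELS.items():
    share = active_share(counts["total_b"], counts["active_b"])
    print(f"{name}: {share:.2f}% of parameters active")
```

By this arithmetic, roughly 3% of V4 Pro’s parameters and under 5% of V4 Flash’s are in use for any one token, which is why total size and running cost can diverge so sharply.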

Both models share several headline features. They support a 1 million token context window, letting you feed entire codebases or book-length documents into a single prompt. They use a new Hybrid Attention Architecture that DeepSeek says dramatically improves the model’s ability to track information across long conversations. And both are released under the MIT open-source license, meaning anyone can download, modify, and deploy them.
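As a rough way to judge whether a codebase or document set would fit in a 1 million token window, the sketch below estimates token counts from character counts. The ratio of about four characters per token is a common rule of thumb and an assumption here, not a property of DeepSeek’s tokenizer; the helper names and the output reserve are also hypothetical.

```python
import math

CONTEXT_LIMIT = 1_000_000  # token window quoted in the article


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the characters-per-token ratio is a heuristic assumption."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(documents: list[str], reserve_for_output: int = 8_000) -> bool:
    """Check whether the documents plus an output budget fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_output <= CONTEXT_LIMIT


if __name__ == "__main__":
    sample = ["def main():\n    pass\n" * 1000, "README text " * 500]
    print(fits_in_context(sample))
```

For a real decision, count tokens with the model’s own tokenizer, since code and non-English text can deviate substantially from this heuristic.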

The models also support dual modes — a Thinking mode for step-by-step reasoning on complex problems, and a standard Non-Thinking mode for fast, direct responses.

How It Compares to GPT-5.5 and Claude

On SWE-bench Verified, a widely used coding benchmark, V4 Pro scores 80.6%, within 0.2 points of Claude Opus 4.6. On Codeforces competitive programming ratings, V4 Pro hits 3,206, edging past GPT-5.4’s 3,168. It also leads Claude on LiveCodeBench (93.5% vs 88.8%) and Terminal-Bench 2.0 (67.9% vs 65.4%). Note that the named comparison points are Claude Opus 4.6 and GPT-5.4, not the newer GPT-5.5 and Claude Opus 4.7 mentioned elsewhere in this article.

Where V4 Pro still trails is on advanced knowledge and mathematical reasoning. On Humanity’s Last Exam, it scores 37.7% compared to Claude’s 40.0% and GPT-5.4’s 39.8%. On the HMMT 2026 math competition, Claude (96.2%) and GPT-5.4 (97.7%) both pull ahead of V4 Pro’s 95.2%.
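The head-to-head figures quoted in this section can be collected in one place to show the size of each lead or deficit. The sketch below uses only scores stated in this article; SWE-bench Verified is left out because only the size of the gap to Claude is given, not its direction. The `margins` helper is illustrative.

```python
# Scores as quoted in this article; higher is better on every benchmark listed.
SCORES = {
    "Codeforces rating": {"V4 Pro": 3206, "GPT-5.4": 3168},
    "LiveCodeBench (%)": {"V4 Pro": 93.5, "Claude": 88.8},
    "Terminal-Bench 2.0 (%)": {"V4 Pro": 67.9, "Claude": 65.4},
    "Humanity's Last Exam (%)": {"V4 Pro": 37.7, "Claude": 40.0, "GPT-5.4": 39.8},
    "HMMT 2026 (%)": {"V4 Pro": 95.2, "Claude": 96.2, "GPT-5.4": 97.7},
}


def margins(scores: dict[str, float], baseline: str = "V4 Pro") -> dict[str, float]:
    """Return baseline minus each rival's score (positive means the baseline leads)."""
    base = scores[baseline]
    return {
        rival: round(base - value, 1)
        for rival, value in scores.items()
        if rival != baseline
    }


for benchmark, scores in SCORES.items():
    print(benchmark, margins(scores))
```

Every deficit on the knowledge and math tests is 2.5 points or less, while the coding leads range from 2.5 to 4.7 points.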

DeepSeek itself acknowledges it trails frontier models by roughly three to six months — but at these prices, many users won’t mind.

The Price Advantage

This is where DeepSeek V4 gets genuinely disruptive. V4 Flash costs just $0.28 per million output tokens, more than 99% cheaper than Claude Opus 4.7. V4 Pro comes in at $3.48 per million output tokens, roughly one-sixth the price of comparable frontier models.
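To make the per-token prices concrete, the sketch below turns them into a bill for a given volume of generated text. It covers output tokens only, since input pricing is not quoted here, and the 50 million token workload is a hypothetical figure chosen for illustration.

```python
# Output prices in US dollars per million tokens, as quoted in this article.
PRICE_PER_MILLION_OUTPUT = {"V4 Flash": 0.28, "V4 Pro": 3.48}


def output_cost(model: str, output_tokens: int) -> float:
    """Return the output-token cost in dollars for the given model."""
    return PRICE_PER_MILLION_OUTPUT[model] * output_tokens / 1_000_000


monthly_tokens = 50_000_000  # hypothetical workload
for model in PRICE_PER_MILLION_OUTPUT:
    print(f"{model}: ${output_cost(model, monthly_tokens):,.2f}")
```

At that volume the output bill would be about $14 on V4 Flash and about $174 on V4 Pro, so Pro costs roughly twelve times as much as Flash per generated token.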

For developers building AI-powered applications, this kind of cost reduction can be the difference between a viable product and an unaffordable one. It also means individual users and researchers can experiment with near-frontier intelligence without burning through credits.

The Bigger Picture

This release lands in a crowded month. OpenAI launched GPT-5.5 just days ago, Anthropic released Claude Opus 4.7 last week, and Google’s Gemma 4 and Meta’s Llama 4 both dropped earlier in April. The AI model market is moving faster than ever, and DeepSeek’s open-source approach keeps pushing proprietary labs to justify their premium pricing.

There are also geopolitical dimensions worth noting. Reports indicate that V4 runs on Huawei’s chips, and Tencent and Alibaba are reportedly in talks to invest in DeepSeek at a high valuation. The model’s success on domestically produced hardware challenges assumptions about how effectively US export controls are limiting Chinese AI capabilities.

What This Means for You

If you use AI tools for coding, writing, or research, DeepSeek V4 gives you another strong option — especially if cost is a factor. The models are available now through the DeepSeek API and on Hugging Face for local deployment.

For everyday users, V4 Pro is a capable alternative to ChatGPT or Claude for most tasks. For developers, the combination of near-frontier performance, MIT licensing, and rock-bottom API pricing makes it worth serious evaluation.

This is still a preview release, so expect rough edges. But the trajectory is clear: open-source AI is no longer a generation behind the frontier — it’s knocking on the door.

To understand how the previous DeepSeek release changed the landscape, see our coverage of the DeepSeek R2 launch.

Frequently asked questions

What is DeepSeek V4?

DeepSeek V4 is a family of two open-source AI models, V4 Pro (1.6 trillion parameters) and V4 Flash (284 billion parameters), released by Chinese AI startup DeepSeek on April 24, 2026. Both support a 1 million token context window and are released under the MIT license.

How does DeepSeek V4 compare to ChatGPT and Claude?

V4 Pro scores within 0.2 points of Claude Opus 4.6 on the SWE-bench Verified coding benchmark and beats GPT-5.4 on Codeforces ratings. It trails slightly on advanced math and knowledge tests like Humanity’s Last Exam, but costs roughly one-sixth the price of comparable frontier models.

Is DeepSeek V4 free to use?

The model weights are free to download and run locally under the MIT open-source license. The DeepSeek API is also available at very low prices: V4 Flash costs $0.28 per million output tokens, while V4 Pro costs $3.48 per million output tokens.

Can I run DeepSeek V4 on my own computer?

The weights are available on Hugging Face, but V4 Pro’s 1.6 trillion parameters require significant hardware. V4 Flash at 284 billion parameters is more accessible, though still well beyond a typical consumer machine. Most users will find it easiest to access through the DeepSeek API or third-party platforms.

What is DeepSeek V4’s context window?

Both V4 Pro and V4 Flash support a 1 million token context window, which means you can send entire codebases or very long documents as a single prompt. This is comparable to the largest context windows offered by Google’s Gemini models.
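On the local-deployment question, a back-of-the-envelope estimate of memory for the weights alone shows why V4 Pro is out of reach for most machines. The sketch below multiplies the quoted parameter counts by bytes per parameter; the storage formats are common choices assumed for illustration, not the precision DeepSeek actually ships, and the estimate ignores the KV cache and activations.

```python
# Total parameter counts in billions, as quoted in this article.
TOTAL_PARAMS_B = {"V4 Pro": 1600, "V4 Flash": 284}

# Bytes per parameter for some common storage formats (assumed, not from the article).
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(params_b: float, bytes_per_param: float) -> float:
    """Weights-only footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params_b * bytes_per_param


for model, params_b in TOTAL_PARAMS_B.items():
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{model} @ {fmt}: {weight_memory_gb(params_b, size):,.0f} GB")
```

Under these assumptions V4 Pro needs about 3,200 GB at 16-bit and 800 GB at 4-bit, while V4 Flash needs about 568 GB and 142 GB respectively. In a mixture-of-experts model all experts generally have to be stored even though only a fraction is active per token, so the total count, not the active count, drives this figure.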
