This morning (April 24), DeepSeek quietly dropped the V4 preview: it went live on the official website, the app, and the API at the same time, alongside an open-source release, all without any prior notice.
What they delivered is pretty aggressive:
- 1M context free as standard – From today, 1 million tokens of context becomes the default configuration for all official DeepSeek services. No "long context add-on", no extra charge.
- Two versions: V4-Pro with 1.6T total parameters, 49B activated — the flagship; V4-Flash with 284B parameters, 13B activated — focused on speed and low cost.
- A new sparse attention mechanism, DSA (DeepSeek Sparse Attention), which compresses along the token dimension and significantly cuts compute and VRAM costs for long contexts.
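To put the two parameter counts in perspective, here is a quick back-of-the-envelope sketch of how much of each model is active per token. It uses only the figures quoted in the list above; the helper name active_fraction and the MODELS table are made up for illustration.

```python
# Parameter counts as quoted in the announcement, in billions.
# (1.6T total is written as 1600B so both models share one unit.)
MODELS = {
    "V4-Pro": {"total_b": 1600.0, "active_b": 49.0},
    "V4-Flash": {"total_b": 284.0, "active_b": 13.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters activated per token (hypothetical helper)."""
    return active_b / total_b


for name, params in MODELS.items():
    pct = 100 * active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {pct:.1f}% of parameters active per token")
```

That works out to roughly 3.1% for V4-Pro and 4.6% for V4-Flash, so only a small slice of each model's weights is used for any given token.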
How's the performance? Quoting DeepSeek's official announcement: V4-Pro "outperforms Sonnet 4.5" in Agentic Coding benchmarks, with "delivery quality close to Opus 4.6 non-thinking mode", and in world knowledge evaluation it is "only slightly behind Gemini-Pro-3.1". If this holds up, it would be the first time an open-source model has gone toe-to-toe with closed-source flagships on the most valuable battleground: coding agents.
We'll have to wait for independent benchmarks from the community to verify. But judging from the technical report, this is not a minor tweak to V3 — it's a structural overhaul.
One more critical detail: the old API model names deepseek-chat and deepseek-reasoner will be deprecated in three months (July 24, 2026). For now they point to the non-thinking and thinking modes of V4-Flash, respectively. Developers still calling the old names should switch the model name in their requests to deepseek-v4-pro or deepseek-v4-flash before that date.
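For anyone with the old names scattered through a codebase, the rename can be handled in one place. Below is a minimal sketch, assuming the mapping described above (both legacy names currently resolve to V4-Flash). The helper names are hypothetical, and the announcement as quoted here does not say how thinking mode is selected under the new names, so the sketch only records which legacy name implied it.

```python
from datetime import date

# Legacy name -> (V4 name it currently points to, did it imply thinking mode?)
# Based on the announcement as summarized in this post; switching to
# "deepseek-v4-pro" instead is a choice left to the caller.
LEGACY_MODELS = {
    "deepseek-chat": ("deepseek-v4-flash", False),
    "deepseek-reasoner": ("deepseek-v4-flash", True),
}

# Deprecation date stated in the announcement.
DEPRECATION_DATE = date(2026, 7, 24)


def migrate_model_name(name: str) -> str:
    """Return the V4 name for a legacy name; pass any other name through."""
    target = LEGACY_MODELS.get(name)
    return target[0] if target else name


def implied_thinking(name: str) -> bool:
    """True if the legacy name pointed at thinking mode (hypothetical helper)."""
    target = LEGACY_MODELS.get(name)
    return bool(target and target[1])


def days_until_deprecation(today: date) -> int:
    """Days left before the legacy names are deprecated."""
    return (DEPRECATION_DATE - today).days


if __name__ == "__main__":
    for old in ("deepseek-chat", "deepseek-reasoner", "deepseek-v4-pro"):
        print(old, "->", migrate_model_name(old), "| thinking:", implied_thinking(old))
    print("days left from release day:", days_until_deprecation(date(2026, 4, 24)))
```

Routing every request through a helper like this keeps the switch to a one-line change if you later decide to move some traffic to deepseek-v4-pro.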
Some personal thoughts: around this time last year, people were still arguing over whether DeepSeek's results were just inflated by data contamination. Less than a year later, the V4 preview makes 1M context free as standard and claims the top open-source spot for Agentic Coding. They don't rely on PR stunts. They just ship the product and shut the critics up.
The model is open‑sourced, and the technical report has been released as well. Now it's up to the community to run reproduction benchmarks.
But tonight will likely be another sleepless night for product managers at the big tech firms.