Gate News, April 25 — DeepSeek released preview versions of V4-Pro and V4-Flash on April 24; both are open-weight models with one-million-token context windows. V4-Pro has 1.6 trillion total parameters but, using a Mixture-of-Experts architecture, activates only 49 billion per inference pass. V4-Flash has 284 billion total parameters with 13 billion active.
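The active-parameter figures above come from Mixture-of-Experts routing, in which a small gating network sends each token to only a few experts, so most of the model's weights sit idle on any given forward pass. A toy sketch of generic top-k gating in NumPy (illustrative only; the sizes and router are made up, not DeepSeek's actual architecture):

```python
# Toy Mixture-of-Experts top-k routing sketch (generic, NOT DeepSeek's
# actual design): each token reaches only k of n_experts experts, so only
# a small fraction of total parameters is active per forward pass.
import numpy as np

rng = np.random.default_rng(0)
n_experts, d, k = 8, 16, 2                       # small illustrative sizes

experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # expert weights
gate_w = rng.standard_normal((d, n_experts))                       # router weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ gate_w
    topk = np.argsort(logits)[-k:]               # indices of the k best experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                     # softmax over selected experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

out = moe_forward(rng.standard_normal(d))
print(out.shape, f"active experts: {k}/{n_experts}")
```

With k=2 of 8 experts active, only a quarter of the expert weights participate per token; V4-Pro's quoted 49B-of-1.6T is the same idea at a ratio of roughly 3%.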
Pricing significantly undercuts competitors: V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens, approximately 98% less than OpenAI's GPT-5.5 Pro ($30 input, $180 output) and roughly one-twentieth the cost of Claude Opus 4.7. V4-Flash is priced at $0.14 input and $0.28 output per million tokens. Both models are open-sourced under the MIT license and can be run locally for free.
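As a quick sanity check on those price gaps, a short script can compute the dollar cost of one representative request at each tier. The prices are the ones quoted above; the request size (100k input tokens, 5k output tokens) is an illustrative assumption, not from any provider's documentation:

```python
# Per-request cost at the per-million-token list prices quoted in the
# article. The token counts used below are an illustrative example only.

PRICES = {                       # (input $/1M tokens, output $/1M tokens)
    "DeepSeek V4-Pro":   (1.74, 3.48),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "GPT-5.5 Pro":       (30.00, 180.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

for model in PRICES:
    cost = request_cost(model, input_tokens=100_000, output_tokens=5_000)
    print(f"{model:18s} ${cost:.4f}")
```

At this request shape the example works out to about $0.19 for V4-Pro versus $3.90 for GPT-5.5 Pro; the exact percentage saving depends on the input/output mix, since the gap is larger on output tokens than on input tokens.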
DeepSeek achieved its efficiency gains through two new attention mechanisms, Compressed Sparse Attention and Heavily Compressed Attention, which cut attention compute to 27% of predecessor V3.2's cost for V4-Pro and to 10% for V4-Flash. The company trained V4 partly on Huawei Ascend chips, circumventing U.S. export restrictions on advanced Nvidia processors. DeepSeek stated that pricing will drop further once 950 new supernodes come online later in 2026.
On performance benchmarks, V4-Pro-Max ranks first on Codeforces competitive programming (a rating of 3,206, roughly 23rd place among human contestants) and scores 90.2% on Apex Shortlist math problems versus Claude Opus 4.6's 85.9%. However, it trails on broad knowledge-and-reasoning benchmarks: MMLU-Pro (87.5% vs. Gemini-3.1-Pro's 91.0%) and Humanity's Last Exam (37.7% vs. 44.4%). On long-context tasks, V4-Pro leads open-source models but loses to Claude Opus 4.6 on MRCR retrieval tests.
V4-Pro introduces "interleaved thinking," which lets agent workflows retain reasoning context across multiple tool calls instead of flushing it between steps. Both models support coding integrations with Claude Code and OpenCode. In DeepSeek's survey of 85 developers, 52% said V4-Pro was ready to serve as their default coding agent, and a further 39% leaned toward adoption. The old deepseek-chat and deepseek-reasoner endpoints will retire on July 24, 2026.
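For teams facing the endpoint retirement, the migration can be as simple as remapping model identifiers before dispatching requests. A minimal sketch, assuming an OpenAI-style chat-completion payload; the replacement IDs "deepseek-v4-pro" and "deepseek-v4-flash" are hypothetical placeholders, not confirmed model names:

```python
# Sketch of migrating off the retiring deepseek-chat / deepseek-reasoner
# endpoints. The replacement model IDs below are ASSUMED for illustration;
# consult DeepSeek's docs for the real names before the July 24, 2026 cutoff.

RETIRED = {
    "deepseek-chat":     "deepseek-v4-flash",   # hypothetical mapping
    "deepseek-reasoner": "deepseek-v4-pro",     # hypothetical mapping
}

def migrate_model(payload: dict) -> dict:
    """Return a copy of a chat-completion payload with retired model IDs remapped."""
    model = payload.get("model", "")
    if model in RETIRED:
        payload = {**payload, "model": RETIRED[model]}
    return payload

req = {"model": "deepseek-reasoner",
       "messages": [{"role": "user", "content": "hi"}]}
print(migrate_model(req)["model"])
```

Centralizing the mapping in one shim keeps the cutover to a one-line change per retired endpoint, and payloads that already use current model IDs pass through untouched.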
Related Articles
Anthropic’s Claude Dreams: Agents autonomously organize memories between tasks, eliminating duplicates and contradictions
Anthropic announced Dreams at its Code with Claude event. The feature lets Claude Managed Agents automatically organize memories across multiple conversations, eliminating duplicates and contradictions and updating outdated entries, and it outputs an auditable, consolidated memory database. Input is limited to 100 sessions and 4,096 characters; the process runs asynchronously, completing in minutes to tens of minutes, with support for streaming observation. The research preview requires an application, currently supports only claude-opus-4-7 and claude-sonnet-4-6, and has no official launch date yet.
ChainNewsAbmedia · 3h ago
Anthropic partners with SpaceX for computing power: secures the entire Colossus 1 facility—220k GPUs—and Claude lifts its usage limits
Anthropic announced a compute partnership with SpaceX covering the Colossus 1 data center, which will deploy more than 220k Nvidia GPUs and over 300MW of capacity. The full rollout is expected within a month and will boost compute capacity and responsiveness for Claude and Claude Code. Anthropic will also relax the rolling five-hour usage limits for Pro/Max/Team/Enterprise plans, remove peak-hour limits, and increase Opus API throughput. Infrastructure expansion is also underway across Asia and Europe; longer term, the companies have floated ideas such as "orbital AI computing," though no deal has been finalized.
ChainNewsAbmedia · 3h ago
Coinbase Engineer: AI Agents Could Disrupt Web Advertising Model
Erik Reppel, a Coinbase engineer, said that artificial intelligence agents could fundamentally undermine the internet's advertising-dependent business model. According to Reppel, the web economy relies heavily on advertising revenue generated from human users, but AI agents bypass that system.
CryptoFrontier · 3h ago
Anthropic Doubles Claude Code Rate Limits After Securing 300MW Capacity from SpaceX Deal
According to Odaily, Anthropic has signed an agreement with SpaceX to access the full computing capacity of the Colossus 1 data center, securing over 300 megawatts of new capacity and more than 220,000 NVIDIA GPUs within the month. Effective immediately, Claude Code's five-hour rate limits for Pro,
GateNews · 4h ago
OpenAI Unveils the MRC Supercomputer Network Protocol! Teaming Up with NVIDIA, AMD, and Microsoft to Build the Stargate Infrastructure
OpenAI has unveiled MRC, an AI supercomputer network protocol developed with AMD, Microsoft, NVIDIA, and others and open-sourced via OCP. MRC splits data and routes it along multiple paths at once, steers around failures at microsecond timescales, reduces congestion, and keeps GPUs synchronized, addressing data-transfer bottlenecks in large training clusters. Facilities including the Stargate site in Abilene, Texas have already deployed 800Gb/s interfaces and put them into production training.
ChainNewsAbmedia · 4h ago