DeepSeek: DeepSeek V4 Pro
deepseek/deepseek-v4-pro
About
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. Each MoE layer uses 384 routed experts plus 1 shared expert, with 6 experts active per token. It combines a hybrid attention mechanism (Compressed Sparse Attention and Heavily Compressed Attention) for efficient long-context processing and supports multiple reasoning modes to balance speed and depth. Reported benchmarks include 80.6% on SWE-bench Verified, 67.9% on Terminal-Bench 2.0, and 93.5% on LiveCodeBench. It is suited to full-codebase analysis, multi-step automation, and large-scale information synthesis.
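To put the sparsity figures above in proportion, the following sketch computes the share of parameters and of routed experts that are active for each token. It uses only the numbers quoted in the description. Treating the 6 active experts as drawn from the 384 routed experts (with the shared expert counted separately) is an assumption, since the description does not say how the shared expert is counted.

```python
# Figures quoted in the model description.
TOTAL_PARAMS = 1.6e12    # 1.6T total parameters
ACTIVE_PARAMS = 49e9     # 49B activated parameters per token
ROUTED_EXPERTS = 384     # routed experts per MoE layer
ACTIVE_EXPERTS = 6       # experts active per token

# Share of all parameters that take part in processing a single token.
active_param_share = ACTIVE_PARAMS / TOTAL_PARAMS

# Share of routed experts selected per token.
# Assumption: the 6 active experts are all routed experts.
active_expert_share = ACTIVE_EXPERTS / ROUTED_EXPERTS

print(f"Active parameter share: {active_param_share:.2%}")
print(f"Active routed-expert share: {active_expert_share:.2%}")
```

Under these assumptions, roughly 3% of the parameters are used per token, which is why the per-token compute is much closer to that of a 49B model than a 1.6T one.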
Capabilities
- Context Length: 1.0M tokens
- Max Output: 384K tokens
- Reasoning: Yes
- In: text
- Out: text
Benchmarks
[Charts: Reasoning & Knowledge; Coding & Agentic. Source: Artificial Analysis]
Pricing
| Type | Price / 1M tokens |
|---|---|
| Input | $1.74 |
| Output | $3.48 |
| Cache Read | $0.145 |
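As a rough illustration of how the rates in the table translate into per-request cost, here is a minimal sketch. The function name `estimate_cost` and its parameters are hypothetical. It also assumes that cached input tokens are billed at the Cache Read rate instead of the Input rate, which the table does not state explicitly.

```python
# Prices in USD per 1M tokens, copied from the pricing table.
PRICE_PER_MILLION = {"input": 1.74, "output": 3.48, "cache_read": 0.145}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens served from cache.
    Assumption: those tokens are billed at the cache-read rate only.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


# A 100K-token prompt with a 2K-token reply, without and with caching.
uncached = estimate_cost(100_000, 2_000)
mostly_cached = estimate_cost(100_000, 2_000, cached_tokens=80_000)
print(f"No cache: ${uncached:.4f}")
print(f"80K tokens cached: ${mostly_cached:.4f}")
```

Under these assumptions, the uncached request costs about $0.18 and the mostly cached one about $0.05, so cache hits dominate the savings for long, repeated prompts.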
OpenAI-compatible · Model ID deepseek/deepseek-v4-pro
```bash
curl https://api.elliotgate.com/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
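The same request can be sketched in Python using only the standard library. The URL, model ID, and headers are taken from the curl example. The helper names `build_chat_request` and `send` are hypothetical, and the response is assumed to follow the usual OpenAI-style chat-completions JSON shape, which this page does not spell out.

```python
import json
import urllib.request

API_URL = "https://api.elliotgate.com/v1/chat/completions"
MODEL_ID = "deepseek/deepseek-v4-pro"


def build_chat_request(prompt, api_key):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_chat_request("Hello!", "sk-omg-your-api-key"))` with a real key would perform the request. If the response follows the OpenAI convention, the reply text would be found under `choices[0]["message"]["content"]`.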
DeepSeek: DeepSeek V4 Pro comparisons
Decide which model wins on the dimensions that matter for your workload — context, benchmarks, pricing, or serving latency.
- DeepSeek: DeepSeek V4 Pro vs GLM 5.1
  Two Chinese-vendor flagship-tier models with essentially tied benchmark numbers.
- DeepSeek: DeepSeek V4 Pro vs GPT-5.2
- DeepSeek: DeepSeek V4 Pro vs GPT-5.2 Chat
- DeepSeek: DeepSeek V4 Pro vs Qwen3.6 Max Preview
- DeepSeek: DeepSeek V4 Pro vs Qwen3.6 Plus
- DeepSeek: DeepSeek V4 Pro vs Grok 4.3
  Two capable reasoning models with overlapping context windows but diverging modalities and speed profiles.
- DeepSeek: DeepSeek V4 Pro vs GLM 5
  DeepSeek V4 Pro and GLM 5 are both text-only reasoning models built for agentic coding and long-horizon analysis, but their hardware profiles could not be more different.
- DeepSeek: DeepSeek V4 Pro vs MiniMax M2.7