deepseek · Apr 24, 2026

DeepSeek: DeepSeek V4 Flash

deepseek/deepseek-v4-flash
Context: 1.0M
Max Output: 384K
Input / 1M tokens: $0.0983
Output / 1M tokens: $0.1966

About

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It shares the V4 architecture with DeepSeek V4 Pro, including a hybrid attention mechanism (Compressed Sparse Attention and Heavily Compressed Attention) for efficient long-context processing and configurable reasoning modes. The design targets fast inference and high-throughput workloads while maintaining reasoning and coding performance, making it suitable for coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency matter.

Capabilities

Context Length: 1.0M
Max Output: 384K
Reasoning: Yes
Input: text
Output: text
Intelligence Index: 40.3 (#24 of 133)
Coding Index: 38.7 (#39 of 118)

Reasoning & Knowledge

GPQA Diamond: 89.4%
HLE: 32.1%

Coding & Agentic

SciCode: 44.9%
Terminal-Bench Hard: 35.6%

Source: Artificial Analysis

Type         Price / 1M tokens
Input        $0.0983
Output       $0.1966
Cache Read   $0.0197
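
The listed per-million-token prices turn into a per-request cost by simple arithmetic. Below is a minimal Python sketch, assuming that cached input tokens are billed at the Cache Read rate in place of the Input rate (the page does not spell out the billing rule). The names `PRICE_PER_MILLION` and `estimate_cost` are illustrative and not part of any API.

```python
# Prices in USD per 1M tokens, copied from the pricing list on this page.
PRICE_PER_MILLION = {
    "input": 0.0983,
    "output": 0.1966,
    "cache_read": 0.0197,
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache, billed at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000
```

For example, `estimate_cost(200_000, 4_000)` comes to roughly $0.0204 under these assumptions: about $0.0197 for the input and under $0.001 for the output.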

OpenAI-compatible · Model ID deepseek/deepseek-v4-flash

curl https://api.elliotgate.com/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
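
The same request can be made from Python using only the standard library. This is a sketch, not an official client: the helper names `build_request` and `send` are hypothetical, and the response parsing assumes the usual OpenAI-style shape (`choices[0].message.content`), which this page implies by calling the endpoint OpenAI-compatible but does not show.

```python
import json
import urllib.request

# Endpoint and model ID as given in the curl example on this page.
API_URL = "https://api.elliotgate.com/v1/chat/completions"
MODEL_ID = "deepseek/deepseek-v4-flash"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text.

    Assumption: the response follows the OpenAI chat-completions shape.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_request("sk-omg-your-api-key", "Hello!"))` with a real key in place of the placeholder performs the network call. Keeping `build_request` separate makes the payload easy to inspect without sending anything.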