1.6T
Total parameters (V4 Pro)
49B
Active per token (V4 Pro MoE)
1M
Context window (tokens)
| Model | Parameters | Best For |
|---|---|---|
| R1-Distill-Qwen-1.5B | 1.5B | Mobile devices, edge computing, quick prototypes |
| R1-Distill-Qwen-7B | 7B | Consumer GPUs, local development, cost-sensitive deployments |
| R1-Distill-Llama-8B | 8B | Consumer GPUs, Llama ecosystem compatibility |
| R1-Distill-Qwen-14B | 14B | Mid-range GPUs, balanced quality and speed |
| R1-Distill-Qwen-32B | 32B | High-quality local inference, professional workstations |
| R1-Distill-Llama-70B | 70B | Server deployment, near-frontier quality at lower cost |
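
The hardware suggestions in the table follow largely from how much memory the weights occupy. The sketch below is a rough, weight-only estimate and not a sizing guide: it uses the nominal parameter counts from the table, assumes 2, 1, and 0.5 bytes per parameter for FP16, INT8, and 4-bit storage, and ignores the KV cache, activations, and runtime overhead. The names `DISTILL_PARAMS_B`, `BYTES_PER_PARAM`, and `weight_memory_gb` are hypothetical helpers introduced for illustration.

```python
# Rough weight-only memory estimate for the distilled models.
# Assumption: nominal parameter counts from the table (in billions).
DISTILL_PARAMS_B = {
    "R1-Distill-Qwen-1.5B": 1.5,
    "R1-Distill-Qwen-7B": 7,
    "R1-Distill-Llama-8B": 8,
    "R1-Distill-Qwen-14B": 14,
    "R1-Distill-Qwen-32B": 32,
    "R1-Distill-Llama-70B": 70,
}

# Assumption: bytes per parameter for common storage formats.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Excludes KV cache, activations, and framework overhead.
    """
    return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, size in DISTILL_PARAMS_B.items():
        row = ", ".join(
            f"{p}: {weight_memory_gb(size, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{name}: {row}")
```

Under these assumptions the 7B model needs about 14 GB for FP16 weights, while the 70B model needs about 35 GB even at 4 bits per weight, which is consistent with placing the former on consumer GPUs and the latter on servers.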
DeepSeek model history
| Model | Released | Notes |
|---|---|---|
| DeepSeek LLM (V1) | Nov 2023 | 67B parameters; initial competitive Chinese LLM |
| DeepSeek V2 | May 2024 | Introduced MoE + MLA; 236B total / 21B active |
| DeepSeek V2.5 | Sep 2024 | Merged the chat and coding models into a unified model |
| DeepSeek V3 | Dec 2024 | 671B total / 37B active MoE; frontier performance at a fraction of the cost |
| DeepSeek R1 | Jan 2025 | Reasoning model; the "DeepSeek moment" |
| R1-0528 | May 2025 | Updated reasoning model with improved accuracy |
| DeepSeek V3.1 | Aug 2025 | Hybrid reasoning architecture; a single model supports thinking and non-thinking modes |
| DeepSeek V3.2 | Dec 2025 | Refined V3 across all tasks |
| DeepSeek V4 | Apr 2026 | V4 Pro (1.6T total / 49B active) and V4 Flash (284B total / 13B active); 1M-token context; open-sourced |
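
The total/active pairs quoted for the MoE models can be compared directly by computing the share of parameters used per token. This is only a worked arithmetic sketch using the figures listed in this history (with 1.6T written as 1600B); `MOE_MODELS` and `active_fraction` are hypothetical names, and the ratio says nothing on its own about quality or speed.

```python
# Total and active parameter counts (in billions), as quoted in the history.
MOE_MODELS = {
    "DeepSeek V2": (236, 21),
    "DeepSeek V3": (671, 37),
    "DeepSeek V4 Pro": (1600, 49),   # 1.6T total
    "DeepSeek V4 Flash": (284, 13),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of all parameters that are active for each token."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (total_b, active_b) in MOE_MODELS.items():
        pct = 100 * active_fraction(total_b, active_b)
        print(f"{name}: {active_b}B of {total_b}B active ({pct:.1f}%)")
```

By these figures the active share falls from roughly 9% in V2 to roughly 3% in V4 Pro, so total parameter count has grown much faster than per-token compute.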