| Model | Hardware Needed | Notes |
|---|---|---|
| Full V3.2 (671B) | 8x NVIDIA H100 80GB | Enterprise grade; multi-GPU with NVLink |
| Quantized V3.2 (4-bit) | 4x NVIDIA A100 80GB | Reduced precision with some quality trade-off |
| R1-Distill-70B | 1 to 2x A100 or 2x RTX 4090 | Feasible for professional workstations |
| R1-Distill-32B | 1x RTX 4090 24GB (quantized) | Good balance of quality and accessibility |
| R1-Distill-14B | 1x RTX 3090 or 4070 Ti | Consumer GPU; workable for personal use |
| R1-Distill-7B/8B | 1x RTX 3060 12GB or Apple M2+ | Laptop-friendly with quantization |
| R1-Distill-1.5B | CPU only; any modern computer | Runs anywhere; prototyping and edge devices |
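The memory figures in the table follow from a simple rule of thumb. As a rough sketch (the ~20% overhead factor is an assumption for KV cache, activations, and runtime buffers, not a published requirement):

```python
# Back-of-envelope memory estimate for running a model locally:
# weights = parameter count x bits per parameter / 8, plus assumed overhead.

def estimate_memory_gb(params_billion: float, bits_per_param: int,
                       overhead: float = 0.20) -> float:
    """Return an approximate memory footprint in GB."""
    weight_gb = params_billion * bits_per_param / 8  # 1B params at 8 bits ~ 1 GB
    return round(weight_gb * (1 + overhead), 1)

# Examples mirroring the table above:
print(estimate_memory_gb(671, 4))  # quantized V3.2 at 4-bit: ~402.6 GB (hence 4-8 80GB GPUs)
print(estimate_memory_gb(7, 4))    # R1-Distill-7B at 4-bit: ~4.2 GB, in line with the ~5GB figure below
print(estimate_memory_gb(1.5, 8))  # R1-Distill-1.5B: ~1.8 GB, small enough for CPU-only use
```

Real footprints vary with context length, batch size, and quantization scheme, so treat the output as a lower bound when sizing hardware.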
## Self-hosting frameworks

- 🚀 **vLLM**: Production API serving with high throughput. The go-to choice for deploying models at scale.
- 🔧 **llama.cpp**: CPU and mixed CPU/GPU inference. Maximum hardware flexibility and quantization support.
- 🎯 **Ollama**: One-command install and run. The simplest path to running DeepSeek locally on your machine.
## MIT license: no restrictions

Unlike Meta's Llama license (which restricts usage above certain user counts and revenue thresholds), DeepSeek's MIT license imposes no limitations. You can use the models commercially, fine-tune them, distill them, and redistribute derivatives without paying royalties. This makes DeepSeek the most permissive frontier AI model family available.
## Start small

For most people wanting to try local AI, start with Ollama and the R1-Distill-7B model (requires ~5GB RAM). It runs on almost any modern laptop and gives you a feel for DeepSeek's reasoning capabilities before investing in larger hardware setups.
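Once the model is running under Ollama, you can query it from any language over its local HTTP API. A minimal sketch, assuming Ollama's default endpoint at `localhost:11434` and its `/api/generate` route; the model tag `deepseek-r1:7b` is illustrative, so check `ollama list` for the exact name on your machine:

```python
import json
from urllib import request

def build_generate_request(model: str, prompt: str) -> dict:
    # stream=False asks Ollama for a single JSON reply instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def generate(payload: dict,
             url: str = "http://localhost:11434/api/generate") -> str:
    req = request.Request(url,
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_generate_request("deepseek-r1:7b",
                                 "Explain quantization in one sentence.")
# generate(payload)  # uncomment once `ollama run deepseek-r1:7b` is up
```

Using only the standard library keeps the sketch dependency-free; in practice you might prefer an OpenAI-compatible client, since both Ollama and vLLM expose that interface as well.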