Model Sizes and Hardware Requirements | Google Gemma

Model	Hardware	Modalities	Best For
Gemma 4 E2B	Phone, RPi, Jetson Nano	Text + Image + Video + Audio	On-device, mobile apps, IoT, offline use
Gemma 4 E4B	Phone, edge devices	Text + Image + Video + Audio	Higher-quality edge, on-device agents
Gemma 4 26B (MoE, 3.8B active)	Consumer GPU or workstation	Text + Image + Video	Fast inference, latency-sensitive workflows
Gemma 4 31B (Dense)	80GB GPU (bf16) or consumer GPU (quantized)	Text + Image + Video	Maximum quality, fine-tuning, agentic workflows

Recommended hardware by model

📱 E2B / E4B: Smartphones, Raspberry Pi, NVIDIA Jetson Orin Nano, tablets, and IoT devices. Runs completely offline, so there is no network round-trip latency.

💻 26B (MoE): Consumer GPU or workstation. Only 3.8B parameters are active during inference, so it runs fast on modest hardware. A good fit for latency-sensitive use.

⚡ 31B (Dense): Single 80GB GPU (H100) for bfloat16, or consumer and workstation hardware (RTX 4090, A6000, Apple M3 Max) with quantization. Maximum quality.
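
The 80GB figure for bfloat16 and the need for quantization on smaller cards follow from simple arithmetic on weight size. The sketch below is a rough estimate only: it counts weights and ignores the KV cache, activations, and quantization overhead, so real usage is higher. The function names are illustrative, and the 24 GB figure is the RTX 4090's memory capacity rather than a number from this page.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def weights_fit(params_billion: float, bits_per_param: int, device_gb: float) -> bool:
    """True if the weights alone fit in device memory (no headroom included)."""
    return weight_memory_gb(params_billion, bits_per_param) <= device_gb


if __name__ == "__main__":
    # 31B dense model: bfloat16 uses 16 bits per parameter.
    print("31B bf16 :", weight_memory_gb(31, 16), "GB")
    # The same model with 4-bit quantization (assumed bit width).
    print("31B 4-bit:", weight_memory_gb(31, 4), "GB")
    # Fits an 80 GB H100 in bf16, but a 24 GB card needs quantization.
    print("bf16 on 80 GB :", weights_fit(31, 16, 80))
    print("bf16 on 24 GB :", weights_fit(31, 16, 24))
    print("4-bit on 24 GB:", weights_fit(31, 4, 24))
```

Treat a positive result as a necessary condition, not a guarantee: leave headroom for context length and batch size.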

Start with the 26B MoE

For most developers exploring Gemma 4 for the first time, the 26B MoE model is an excellent starting point. With only 3.8B active parameters, it delivers fast inference on consumer hardware while scoring 1441 on Arena AI. Scale up to the 31B Dense for maximum quality, or down to E2B/E4B for mobile and edge deployment.
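
To see why active parameters matter for speed, a common rule of thumb puts forward-pass compute at about 2 FLOPs per active parameter per generated token. The sketch below applies that approximation to the figures on this page. It is a back-of-the-envelope comparison, not a benchmark, and the function name is illustrative. Note that an MoE model typically still needs all 26B parameters resident in memory; the savings are in compute per token, not in weight storage.

```python
def forward_flops_per_token(active_params_billion: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2 * active_params_billion * 1e9


# Figures taken from this page: 3.8B active (26B MoE) vs 31B (dense).
moe_flops = forward_flops_per_token(3.8)
dense_flops = forward_flops_per_token(31)

# How many times more compute per token the dense model needs.
compute_ratio = dense_flops / moe_flops

if __name__ == "__main__":
    print(f"26B MoE  : {moe_flops / 1e9:.1f} GFLOPs per token")
    print(f"31B Dense: {dense_flops / 1e9:.1f} GFLOPs per token")
    print(f"Dense / MoE compute ratio: {compute_ratio:.1f}x")
```

Under this approximation the dense model does roughly eight times more work per token, which is consistent with the recommendation to start with the MoE when latency matters.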