| Model | Hardware | Multimodal | Best For |
|---|---|---|---|
| Gemma 4 E2B | Phone, RPi, Jetson Nano | Text + Image + Video + Audio | On-device, mobile apps, IoT, offline use |
| Gemma 4 E4B | Phone, edge devices | Text + Image + Video + Audio | Higher-quality edge, on-device agents |
| Gemma 4 26B (MoE, 3.8B active) | Consumer GPU or workstation | Text + Image + Video | Fast inference, latency-sensitive workflows |
| Gemma 4 31B (Dense) | 80GB GPU (bf16) or consumer GPU (quantized) | Text + Image + Video | Maximum quality, fine-tuning, agentic workflows |
Recommended hardware by model
E2B / E4B: Smartphones, Raspberry Pi, NVIDIA Jetson Orin Nano, tablets, and IoT devices. Runs completely offline with near-zero latency.
26B (MoE): Consumer GPU or workstation. Only 3.8B parameters are active during inference, so it runs fast on modest hardware. Well suited to latency-sensitive use.
31B (Dense): A single 80GB GPU (such as an H100) for bfloat16, or a consumer GPU (RTX 4090, A6000, Apple M3 Max) with quantization. Maximum quality.
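As a rough check on the 31B figures above, weight memory can be estimated as parameter count times bits per parameter. The sketch below is back-of-the-envelope arithmetic, not a measurement: `weight_memory_gb` is a hypothetical helper, the 8-bit and 4-bit widths are assumed quantization levels, and it counts weights only, ignoring the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes).

    params_billion * 1e9 parameters, each bits_per_param / 8 bytes,
    divided by 1e9 bytes per GB, simplifies to the expression below.
    """
    return params_billion * bits_per_param / 8


# bf16 uses 16 bits per parameter; 8 and 4 are assumed quantization widths.
for bits in (16, 8, 4):
    print(f"31B at {bits}-bit: ~{weight_memory_gb(31, bits):.1f} GB")
```

At 16 bits the weights come to about 62 GB, which is why bfloat16 calls for an 80GB card; at 4 bits they come to about 15.5 GB, within the 24 GB of an RTX 4090. The same arithmetic applies to the 26B MoE using its full 26B parameter count, because MoE models typically keep all experts resident in memory: the 3.8B active figure reduces compute per token rather than weight storage.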
Start with the 26B MoE
For most developers exploring Gemma 4 for the first time, the 26B MoE model is an excellent starting point. With only 3.8B active parameters, it delivers fast inference on consumer hardware while scoring 1441 on Arena AI. Scale up to the 31B Dense for maximum quality, or down to E2B/E4B for mobile and edge deployment.