NativeAIHub

Model Sizes and Hardware Requirements

All sizes free to download

| Model | Hardware | Multimodal | Best For |
| --- | --- | --- | --- |
| Gemma 4 E2B | Phone, RPi, Jetson Nano | Text + Image + Video + Audio | On device, mobile apps, IoT, offline use |
| Gemma 4 E4B | Phone, edge devices | Text + Image + Video + Audio | Higher quality edge, on device agents |
| Gemma 4 26B (MoE, 4B active) | Consumer GPU or workstation | Text + Image + Video | Fast inference, latency sensitive workflows |
| Gemma 4 31B (Dense) | 80GB GPU (bf16) or consumer GPU (quantized) | Text + Image + Video | Maximum quality, fine tuning, agentic workflows |

Recommended hardware by model

E2B / E4B: Smartphones, Raspberry Pi, NVIDIA Jetson Orin Nano, tablets, and IoT devices. Runs completely offline, with no network round trip.

26B (MoE): Consumer GPU or workstation. Only 3.8B parameters are active during inference, so it runs fast on modest hardware. Well suited to latency-sensitive use.

31B (Dense): Single 80GB GPU (H100) for bfloat16, or a consumer GPU (RTX 4090, A6000, Apple M3 Max) with quantization. Offers the highest quality of the four sizes.
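The 80GB figure for the 31B model in bfloat16 can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below is a back-of-envelope estimate, not an official sizing tool. The function names `weight_memory_gb` and `fits` are hypothetical, and the estimate covers weights only, so KV cache, activations, and runtime overhead would come on top.

```python
# Rough weight-memory estimate: parameters (in billions) x bytes per parameter.
# Assumption: weights only. KV cache, activations, and framework overhead are
# NOT included, so real requirements are higher than these numbers.

BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def fits(params_billion: float, dtype: str, vram_gb: float) -> bool:
    """True if the weights alone are smaller than the given memory budget."""
    return weight_memory_gb(params_billion, dtype) < vram_gb


if __name__ == "__main__":
    for dtype in ("bf16", "int8", "int4"):
        size = weight_memory_gb(31, dtype)
        print(f"31B {dtype}: ~{size:.1f} GB of weights")
    # 24 GB is the VRAM of an RTX 4090; 80 GB matches the H100 mentioned above.
    print("31B bf16 on 80 GB:", fits(31, "bf16", 80))
    print("31B int4 on 24 GB:", fits(31, "int4", 24))
```

Under these assumptions the 31B weights take about 62 GB in bfloat16, which is why a single 80GB GPU is enough, and about 15.5 GB at 4 bits, which leaves headroom on a 24GB consumer card. For a mixture-of-experts model, all experts typically still have to be held in memory, so the 26B total, not the 3.8B active count, is the number to plug in.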

Start with the 26B MoE

For most developers exploring Gemma 4 for the first time, the 26B MoE model is an excellent starting point. With only 3.8B active parameters, it delivers fast inference on consumer hardware while scoring 1441 on Arena AI. Scale up to the 31B Dense for maximum quality, or down to E2B/E4B for mobile and edge deployment.
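To see why 3.8B active parameters translate into fast inference, a common rule of thumb puts forward-pass compute at about 2 FLOPs per active parameter per token. The sketch below applies that approximation to the parameter counts quoted above. It ignores attention cost, memory bandwidth, and expert-routing overhead, and the function name is hypothetical, so treat the result as an order-of-magnitude comparison rather than a benchmark.

```python
# Back-of-envelope compute comparison between the 26B MoE and the 31B Dense.
# Assumption: forward-pass FLOPs per token ~ 2 x active parameters. This is a
# widely used approximation, not a measured figure for these models.


def forward_flops_per_token(active_params_billion: float) -> float:
    """Approximate forward-pass FLOPs needed to process one token."""
    return 2.0 * active_params_billion * 1e9


moe_flops = forward_flops_per_token(3.8)    # 26B MoE: 3.8B active (from the text)
dense_flops = forward_flops_per_token(31)   # 31B Dense: every parameter is active

active_fraction = 3.8 / 26                  # share of MoE weights used per token
dense_to_moe_ratio = dense_flops / moe_flops

if __name__ == "__main__":
    print(f"MoE active fraction: {active_fraction:.1%}")
    print(f"Dense / MoE compute per token: {dense_to_moe_ratio:.1f}x")
```

By this estimate the MoE touches under a sixth of its weights per token and needs roughly an eighth of the per-token compute of the 31B Dense, which is consistent with the recommendation to start there for latency-sensitive work.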