| Model | Tiny | Base | Small | Medium | Large-v3 | Turbo |
|---|---|---|---|---|---|---|
| Parameters | 39M | 74M | 244M | 769M | 1.55B | 809M |
| VRAM (approx) | ~1 GB | ~1 GB | ~2 GB | ~5 GB | ~10 GB | ~6 GB |
| Relative speed | ~32x | ~16x | ~6x | ~2x | 1x (baseline) | ~3x |
| English WER | ~8% | ~6% | ~4.5% | ~3.5% | ~2.5% | ~2.7% |
| Translation (X→English) | Yes | Yes | Yes | Yes | Yes | No |
| Best for | Edge, IoT | CPU inference | Good balance | High accuracy | Maximum accuracy | Production default |
The turbo model is the default recommendation
For most production use cases, large-v3-turbo offers the best balance of speed and accuracy: it runs roughly 3x faster than large-v3 with less than a one-point difference in English word error rate. Its main limitation is that it cannot perform translation (speech in another language to English text); if you need translation, use large-v3 instead.
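The trade-off above can be expressed as a small selection rule: among the models that fit your VRAM budget and meet the translation requirement, pick the one with the lowest English WER. A minimal sketch, using the figures from the comparison table (the `pick_model` helper and its signature are hypothetical, not part of any Whisper API):

```python
# Figures taken from the comparison table above:
# (name, approx VRAM in GB, approx English WER %, supports X->English translation)
MODELS = [
    ("tiny",           1,  8.0, True),
    ("base",           1,  6.0, True),
    ("small",          2,  4.5, True),
    ("medium",         5,  3.5, True),
    ("large-v3",      10,  2.5, True),
    ("large-v3-turbo", 6,  2.7, False),
]

def pick_model(vram_gb, need_translation=False):
    """Return the lowest-WER model that fits the VRAM budget and,
    if required, supports translation; None if nothing fits."""
    candidates = [
        (name, vram, wer)
        for name, vram, wer, translates in MODELS
        if vram <= vram_gb and (translates or not need_translation)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m[2])[0]

print(pick_model(8))                         # -> large-v3-turbo
print(pick_model(8, need_translation=True))  # -> medium
print(pick_model(12, need_translation=True)) # -> large-v3
```

With an 8 GB budget the rule lands on turbo, exactly as the recommendation above suggests; requiring translation excludes turbo and falls back to the best translating model that still fits.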