Model Sizes and Performance

Open source (all sizes free)
| | Tiny | Base | Small | Medium | Large-v3 | Turbo |
|---|---|---|---|---|---|---|
| Parameters | 39M | 74M | 244M | 769M | 1.55B | 809M |
| VRAM (approx) | ~1 GB | ~1 GB | ~2 GB | ~5 GB | ~10 GB | ~6 GB |
| Relative speed | ~32x | ~16x | ~6x | ~2x | 1x (baseline) | ~3x |
| English WER | ~8% | ~6% | ~4.5% | ~3.5% | ~2.5% | ~2.7% |
| Translation | Yes | Yes | Yes | Yes | Yes | No |
| Best for | Edge, IoT | CPU inference | Good balance | High accuracy | Maximum accuracy | Production default |
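To make the trade-offs concrete, the table can be encoded as a small lookup and queried by VRAM budget. This is an illustrative sketch, not part of any Whisper API: the `WHISPER_MODELS` dict and `models_fitting` helper are hypothetical names, and the values are the approximate figures from the table above.

```python
# Approximate figures from the comparison table above (illustrative sketch).
WHISPER_MODELS = {
    "tiny":     {"params": "39M",   "vram_gb": 1,  "rel_speed": 32, "wer_pct": 8.0, "translate": True},
    "base":     {"params": "74M",   "vram_gb": 1,  "rel_speed": 16, "wer_pct": 6.0, "translate": True},
    "small":    {"params": "244M",  "vram_gb": 2,  "rel_speed": 6,  "wer_pct": 4.5, "translate": True},
    "medium":   {"params": "769M",  "vram_gb": 5,  "rel_speed": 2,  "wer_pct": 3.5, "translate": True},
    "large-v3": {"params": "1.55B", "vram_gb": 10, "rel_speed": 1,  "wer_pct": 2.5, "translate": True},
    "turbo":    {"params": "809M",  "vram_gb": 6,  "rel_speed": 3,  "wer_pct": 2.7, "translate": False},
}

def models_fitting(vram_budget_gb: float, need_translation: bool = False) -> list[str]:
    """Models that fit the VRAM budget (and support translation, if required),
    ordered from lowest to highest English word error rate."""
    fits = [(spec["wer_pct"], name) for name, spec in WHISPER_MODELS.items()
            if spec["vram_gb"] <= vram_budget_gb
            and (spec["translate"] or not need_translation)]
    return [name for _, name in sorted(fits)]
```

For example, on a 6 GB GPU the most accurate option is turbo, but requiring translation pushes the choice down to medium, since large-v3 does not fit.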

The turbo model is the default recommendation

For most production use cases, large-v3-turbo offers the best balance of speed and accuracy: it runs roughly 3x faster than large-v3 while its English word error rate is only slightly higher (~2.7% vs ~2.5%). The main limitation is that turbo cannot perform translation (speech in language X to English text). If you need translation, use large-v3 instead.
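That recommendation reduces to a one-line decision rule. The sketch below uses the model names from the table; `recommend_model` is a hypothetical helper name, not part of the Whisper library.

```python
def recommend_model(need_translation: bool) -> str:
    """Default model choice per the guidance above: large-v3-turbo
    for speed, unless X-to-English translation is required, in which
    case fall back to large-v3 (turbo cannot translate)."""
    return "large-v3" if need_translation else "large-v3-turbo"
```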