[Chart: flagship model cost comparison, output per 1M tokens (USD)]

- Cheapest input per 1M tokens: $0.075 (Flash Lite)
- Batch API discount: 50%
- Context caching savings: up to 90%
| | Gemini API | Claude API | OpenAI API |
|---|---|---|---|
| Flagship model | 2.5 Pro: $1.25/$10 | Opus 4.6: $5/$25 | GPT o3: $2/$8 |
| Mid tier | 2.5 Flash: $0.30/$2.50 | Sonnet 4.5: $3/$15 | GPT 4.1: $2/$8 |
| Budget | Flash Lite: $0.075/$0.30 | Haiku 4.5: $1/$5 | 4.1 mini: $0.40/$1.60 |
| Free tier | Yes (generous) | | |
| Context caching | 90% savings | 90% savings | 50% savings (limited) |
| Batch discount | 50% off | 50% off | 50% off |
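To make the table concrete, here is a minimal sketch of a per-request cost estimate. The rates are hardcoded from the table above, and the model keys and `cost_usd` helper are illustrative, not part of any provider's SDK:

```python
# Per-1M-token rates (input, output) taken from the comparison table above.
RATES = {
    "gemini-2.5-pro": (1.25, 10.00),
    "claude-opus": (5.00, 25.00),
    "gpt-o3": (2.00, 8.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list (non-cached, non-batch) rates."""
    in_rate, out_rate = RATES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 50K-token prompt with a 2K-token response.
for model in RATES:
    print(f"{model}: ${cost_usd(model, 50_000, 2_000):.4f}")
```

Applying the 50% batch discount or 90% caching savings from the table would scale the input-side term accordingly.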
Keep Pro prompts under 200K tokens
Gemini 2.5 Pro and 3 Pro charge higher rates for prompts exceeding 200K tokens; for 2.5 Pro, input doubles from $1.25 to $2.50 per 1M tokens and output rises from $10 to $15. If your prompts regularly exceed this threshold, use context caching or restructure your prompts to stay below it.
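The pricing cliff above can be sketched as a tiered cost function. This uses the 2.5 Pro figures from the callout; the function name is hypothetical, and note that once a prompt crosses 200K tokens, the higher rate applies to all tokens in the request:

```python
# Gemini 2.5 Pro tiered pricing: prompts of <=200K tokens bill at $1.25/$10
# per 1M tokens; larger prompts bill at $2.50/$15 for the whole request.
def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    if input_tokens <= 200_000:
        in_rate, out_rate = 1.25, 10.00
    else:
        in_rate, out_rate = 2.50, 15.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A prompt just under the threshold vs. just over it:
small = gemini_25_pro_cost(199_000, 5_000)  # billed at the base tier
large = gemini_25_pro_cost(201_000, 5_000)  # billed at the elevated tier
```

Adding 2K input tokens here nearly doubles the request cost, which is why trimming or caching context around the 200K boundary pays off disproportionately.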