Pricing Strategy and Cost Optimization


[Figure: Flagship model cost comparison, output price per 1M tokens (USD)]

Cheapest input per 1M tokens (Flash Lite): $0.075

Batch API discount: 50%

Context caching savings: up to 90%

Prices are in USD per 1M tokens, shown as input/output.

| | Gemini API | Claude API | OpenAI API |
|---|---|---|---|
| Flagship model | 2.5 Pro: $1.25/$10 | Opus 4.6: $5/$25 | GPT o3: $2/$8 |
| Mid tier | 2.5 Flash: $0.30/$2.50 | Sonnet 4.5: $3/$15 | GPT 4.1: $2/$8 |
| Budget | Flash Lite: $0.075/$0.30 | Haiku 4.5: $1/$5 | 4.1 mini: $0.40/$1.60 |
| Free tier | Yes (generous) | | |
| Context caching | 90% savings | 90% savings | 50% savings (limited) |
| Batch discount | 50% off | 50% off | 50% off |
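To make the comparison concrete, here is a minimal sketch of a cost estimator built from the list prices above. The dictionary keys are illustrative labels rather than official model IDs, and the function assumes the 50% batch discount applies uniformly to both input and output tokens; check each provider's pricing page before relying on the numbers.

```python
# USD per 1M tokens as (input, output), copied from the comparison table.
# Keys are illustrative labels, not official API model identifiers.
PRICES = {
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-flash-lite": (0.075, 0.30),
    "claude-opus-4.6": (5.00, 25.00),
    "claude-sonnet-4.5": (3.00, 15.00),
    "claude-haiku-4.5": (1.00, 5.00),
    "openai-o3": (2.00, 8.00),
    "openai-gpt-4.1": (2.00, 8.00),
    "openai-gpt-4.1-mini": (0.40, 1.60),
}

BATCH_DISCOUNT = 0.50  # "50% off" for all three providers in the table


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Return the estimated cost in USD for one workload.

    Assumes the batch discount applies equally to input and output tokens.
    """
    input_rate, output_rate = PRICES[model]
    cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    # Compare the three budget models on 2M input and 500K output tokens.
    for name in ("gemini-flash-lite", "claude-haiku-4.5", "openai-gpt-4.1-mini"):
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.3f}")
```

Running the same workload with `batch=True` halves each figure, which is often the simplest saving when results are not needed in real time.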

Keep Pro prompts under 200K tokens

Gemini 2.5 Pro and 3 Pro models charge higher rates for prompts exceeding 200K tokens. For 2.5 Pro, input doubles to $2.50 and output rises to $15 per 1M tokens, instead of $1.25/$10. If your prompts regularly exceed this threshold, consider using context caching or restructuring your prompts to stay below it.
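The effect of the threshold can be sketched with a small calculation using the 2.5 Pro figures quoted above. The function name is hypothetical, and the sketch assumes that once a prompt exceeds 200K tokens the higher rate applies to every input and output token in that request, not only to the tokens past the threshold.

```python
LONG_PROMPT_THRESHOLD = 200_000  # tokens; prompts above this use the higher tier

# USD per 1M tokens as (input, output) for Gemini 2.5 Pro, from the text above.
STANDARD_RATES = (1.25, 10.00)
LONG_PROMPT_RATES = (2.50, 15.00)


def pro_request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request.

    Assumption: the higher tier is charged on the whole request as soon as
    the prompt exceeds the threshold.
    """
    if prompt_tokens > LONG_PROMPT_THRESHOLD:
        input_rate, output_rate = LONG_PROMPT_RATES
    else:
        input_rate, output_rate = STANDARD_RATES
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # One token over the threshold roughly doubles the cost of the request.
    print(f"{pro_request_cost(200_000, 1_000):.4f}")
    print(f"{pro_request_cost(200_001, 1_000):.4f}")
```

Under that assumption, trimming a prompt from just over 200K tokens to just under it cuts the request cost by about half, which is why staying below the threshold matters more than the raw token count suggests.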