[Chart: flagship model cost comparison, output per 1M tokens (USD)]

- Cheapest input per 1M tokens: $0.075 (Flash Lite)
- Batch API discount: 50%
- Context caching savings: up to 90%
| | Gemini API | Claude API | OpenAI API |
|---|---|---|---|
| Flagship model | 2.5 Pro: $1.25/$10 | Opus 4.6: $5/$25 | GPT o3: $2/$8 |
| Mid tier | 2.5 Flash: $0.30/$2.50 | Sonnet 4.5: $3/$15 | GPT 4.1: $2/$8 |
| Budget | Flash Lite: $0.075/$0.30 | Haiku 4.5: $1/$5 | 4.1 mini: $0.40/$1.60 |
| Free tier | Yes (generous) | | |
| Context caching | 90% savings | 90% savings | 50% savings (limited) |
| Batch discount | 50% off | 50% off | 50% off |
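To make the table concrete, here is a minimal sketch of a per-request cost estimate. The rates are hardcoded from the table above, and the model keys and `cost_usd` helper are illustrative, not part of any provider's SDK:

```python
# Per-1M-token rates (input, output) taken from the comparison table above.
RATES = {
    "gemini-2.5-pro": (1.25, 10.00),
    "claude-opus": (5.00, 25.00),
    "gpt-o3": (2.00, 8.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list (non-cached, non-batch) rates."""
    in_rate, out_rate = RATES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 50K-token prompt with a 2K-token response.
for model in RATES:
    print(f"{model}: ${cost_usd(model, 50_000, 2_000):.4f}")
```

Applying the 50% batch discount or 90% caching savings from the table would scale the input-side term accordingly.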
Keep Pro prompts under 200K tokens
Gemini 2.5 Pro and 3 Pro charge higher rates for prompts exceeding 200K tokens; for 2.5 Pro, input doubles from $1.25 to $2.50 per 1M tokens and output rises from $10 to $15. If your prompts regularly exceed this threshold, use context caching or restructure your prompts to stay below it.
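The pricing cliff above can be sketched as a tiered cost function. This uses the 2.5 Pro figures from the callout; the function name is hypothetical, and note that once a prompt crosses 200K tokens, the higher rate applies to all tokens in the request:

```python
# Gemini 2.5 Pro tiered pricing: prompts of <=200K tokens bill at $1.25/$10
# per 1M tokens; larger prompts bill at $2.50/$15 for the whole request.
def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    if input_tokens <= 200_000:
        in_rate, out_rate = 1.25, 10.00
    else:
        in_rate, out_rate = 2.50, 15.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A prompt just under the threshold vs. just over it:
small = gemini_25_pro_cost(199_000, 5_000)  # billed at the base tier
large = gemini_25_pro_cost(201_000, 5_000)  # billed at the elevated tier
```

Adding 2K input tokens here nearly doubles the request cost, which is why trimming or caching context around the 200K boundary pays off disproportionately.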