Prompt caching savings by model family
🚀 GPT-5: 90% savings
⚡ GPT-4.1: 75% savings
💰 GPT-4o / o-series: 50% savings
[Chart: GPT-5 input cost per 1M tokens by pricing tier]
Max prompt cache savings (GPT-5): 90%
Batch API discount: 50%
Savings with smart model routing: varies with the share of tasks routed to cheaper models
Combine strategies for maximum savings
Combine the Batch API (50% off), prompt caching (up to 90% off cached inputs), and model routing (sending each task to the cheapest viable model). A workload that costs $1,000/month on GPT-5 Standard with no caching could drop below $200/month once it uses Batch processing, caches its repeated prompt prefixes, and routes simple subtasks to GPT-5 Nano.
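The arithmetic behind an estimate like this can be sketched in a few lines of Python. This is a rough illustration, not a billing calculator: the 50% Batch and 90% cache discounts come from the figures above, but the function name, the share of input that hits the cache (`cached_share`), the share of traffic routed to a cheaper model (`routed_share`), and that model's relative price (`cheap_model_price_ratio`) are all hypothetical inputs. It also assumes that all spend is on input tokens and that the three discounts multiply independently, which may not match how a real invoice is computed.

```python
def estimate_monthly_cost(
    baseline: float,
    cached_share: float,
    routed_share: float,
    cheap_model_price_ratio: float,
    cache_discount: float = 0.90,  # "up to 90% off cached inputs" (from the text)
    batch_discount: float = 0.50,  # "Batch API (50% off)" (from the text)
) -> float:
    """Estimate monthly spend after routing, caching, and batch discounts.

    Simplifying assumptions (not from the source): all spend is input
    tokens, and the three discounts stack multiplicatively.
    """
    fractions = {
        "cached_share": cached_share,
        "routed_share": routed_share,
        "cheap_model_price_ratio": cheap_model_price_ratio,
        "cache_discount": cache_discount,
        "batch_discount": batch_discount,
    }
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Routing: part of the traffic moves to a cheaper model.
    routing_factor = (1 - routed_share) + routed_share * cheap_model_price_ratio
    # Caching: cached input tokens are billed at (1 - cache_discount).
    caching_factor = (1 - cached_share) + cached_share * (1 - cache_discount)
    # Batch: a flat discount applied to everything that remains.
    batch_factor = 1 - batch_discount

    return baseline * routing_factor * caching_factor * batch_factor


if __name__ == "__main__":
    # Hypothetical workload: half the input is cached, 40% of tasks are
    # routed to a model assumed to cost one tenth as much.
    cost = estimate_monthly_cost(
        baseline=1000.0,
        cached_share=0.5,
        routed_share=0.4,
        cheap_model_price_ratio=0.1,
    )
    print(f"${cost:,.2f}")  # prints "$176.00"
```

With these made-up shares, the $1,000 baseline falls to $176, consistent with the "under $200" claim. Lower cache hit rates or less routable traffic would leave the total higher, so the real inputs should be measured per workload.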