Cost Optimization Strategies

Prompt caching savings by model family

GPT-5: 90% savings

GPT-4.1: 75% savings

GPT-4o / o-series: 50% savings
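The discounts above apply to cached input tokens only, so the saving on a real bill depends on how much of the input actually hits the cache. The following is a minimal sketch of that arithmetic; the function name `effective_input_savings` and the `cached_fraction` parameter are illustrative assumptions, not part of any API.

```python
# Cached-input discounts as listed above, keyed by model family.
CACHE_DISCOUNT = {
    "GPT-5": 0.90,
    "GPT-4.1": 0.75,
    "GPT-4o / o-series": 0.50,
}


def effective_input_savings(family: str, cached_fraction: float) -> float:
    """Return the fraction of input spend saved for a given cache hit share.

    `cached_fraction` is a hypothetical input: the share of input tokens
    (0.0 to 1.0) that are served from the prompt cache.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return CACHE_DISCOUNT[family] * cached_fraction


if __name__ == "__main__":
    # Compare families at the same assumed cache hit share of 50%.
    for family in CACHE_DISCOUNT:
        saved = effective_input_savings(family, 0.5)
        print(f"{family}: {saved:.1%} of input spend saved")
```

The headline percentage is reached only when every input token is cached; at a lower hit share the saving shrinks proportionally.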

[Chart: GPT-5 input cost per 1M tokens by pricing tier]

Max prompt cache savings (GPT-5): 90%

Batch API discount: 50%

Savings with smart model routing: depends on the workload

Combine strategies for maximum savings

Combine the Batch API (50% off) with prompt caching (up to 90% off cached inputs) and model routing (sending each task to the cheapest viable model). For example, a workload that costs $1,000/month on GPT-5 Standard with no caching could cost under $200/month once it uses Batch processing, prompt caching, and routing of simple subtasks to GPT-5 Nano.
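One way to sanity-check an estimate like this is a back-of-the-envelope model. The sketch below is illustrative only: it assumes the three discounts stack multiplicatively, that caching applies only to the input share of the bill, and that routed subtasks cost a fixed fraction of the GPT-5 price. All parameter names and the example values (`input_share`, `cached_fraction`, `routed_fraction`, `routed_price_ratio`) are hypothetical placeholders, not published figures.

```python
def estimate_monthly_cost(
    baseline: float,
    input_share: float,
    cached_fraction: float,
    routed_fraction: float,
    routed_price_ratio: float,
    cache_discount: float = 0.90,  # max GPT-5 cached-input discount from the text
    batch_discount: float = 0.50,  # Batch API discount from the text
) -> float:
    """Rough monthly cost after routing, caching, and batching.

    Assumption: the discounts are independent and multiply together.
    Check current pricing rules before relying on this.
    """
    # Routing: a share of the work moves to a cheaper model.
    routed = baseline * routed_fraction * routed_price_ratio
    kept = baseline * (1.0 - routed_fraction)
    cost = routed + kept

    # Caching: only the cached part of the input spend is discounted.
    cost *= 1.0 - input_share * cached_fraction * cache_discount

    # Batch API: flat discount on what remains.
    cost *= 1.0 - batch_discount
    return cost


if __name__ == "__main__":
    # Hypothetical workload: 70% of spend is input, 60% of input is cached,
    # 40% of work is routed to a model priced at 5% of the baseline model.
    cost = estimate_monthly_cost(
        baseline=1000.0,
        input_share=0.7,
        cached_fraction=0.6,
        routed_fraction=0.4,
        routed_price_ratio=0.05,
    )
    print(f"Estimated monthly cost: ${cost:.2f}")
```

With these placeholder inputs the estimate lands just under $200/month, in line with the figure quoted above; different cache hit shares or routing mixes will move it substantially.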