Cost Optimization Strategies | OpenAI API

Prompt caching savings by model family

GPT-5: 90% savings

GPT-4.1: 75% savings

GPT-4o / o-series: 50% savings
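
To make the per-family discounts concrete, here is a minimal sketch of how a cache discount changes effective input spend. The discount values come from the figures above; the function name, the `cached_fraction` parameter, and the example hit rate of 80% are illustrative assumptions, not part of any API.

```python
# Discounts on cached input tokens, as listed in this section.
CACHE_DISCOUNT = {
    "gpt-5": 0.90,
    "gpt-4.1": 0.75,
    "gpt-4o / o-series": 0.50,
}


def effective_input_cost(base_cost: float, cached_fraction: float, discount: float) -> float:
    """Input cost after caching.

    base_cost: input spend with no caching (any currency unit).
    cached_fraction: share of input tokens served from cache (0..1, assumed).
    discount: fractional discount applied to cached tokens (0..1).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    uncached = base_cost * (1.0 - cached_fraction)
    cached = base_cost * cached_fraction * (1.0 - discount)
    return uncached + cached


if __name__ == "__main__":
    # Hypothetical workload: $100 of input spend, 80% of tokens hit the cache.
    for family, discount in CACHE_DISCOUNT.items():
        cost = effective_input_cost(100.0, 0.8, discount)
        print(f"{family}: ${cost:.2f}")
```

The headline percentage applies only to cached input tokens, so the realized saving depends on how much of each prompt is a repeated prefix.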

Chart: GPT-5 input cost per 1M tokens by pricing tier

Max prompt cache savings (GPT-5): 90%

Batch API discount: 50%

Savings with smart model routing

Combine strategies for maximum savings

Combine the Batch API (50% off) with prompt caching (up to 90% off cached input tokens) and model routing (sending each task to the cheapest viable model). For example, a workload that costs $1,000/month on GPT-5 Standard with no caching could cost under $200/month once it uses Batch processing and prompt caching and routes simple subtasks to GPT-5 Nano.
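
The combined estimate can be sketched as simple arithmetic. Only the 50% Batch discount and the 90% cache discount come from this page; the input share of the bill, the cache hit rate, the share of work routed to a cheaper model, that model's relative price, and the assumption that all three discounts stack multiplicatively are hypothetical values chosen for illustration.

```python
def estimate_monthly_cost(
    baseline: float,
    input_share: float,
    cached_fraction: float,
    cache_discount: float,
    batch_discount: float,
    routed_share: float,
    routed_price_ratio: float,
) -> float:
    """Rough monthly cost after caching, routing, and Batch discounts.

    All parameters except cache_discount and batch_discount are
    workload-specific assumptions the caller must supply.
    """
    input_cost = baseline * input_share
    output_cost = baseline - input_cost

    # Prompt caching: only cached input tokens are discounted.
    input_cost *= (1.0 - cached_fraction) + cached_fraction * (1.0 - cache_discount)

    # Routing: a share of the work runs on a cheaper model
    # at routed_price_ratio of the original price.
    subtotal = input_cost + output_cost
    subtotal *= (1.0 - routed_share) + routed_share * routed_price_ratio

    # Batch API: flat discount on what remains (assumed to stack).
    return subtotal * (1.0 - batch_discount)


if __name__ == "__main__":
    cost = estimate_monthly_cost(
        baseline=1000.0,          # $1,000/month example from the text
        input_share=0.7,          # hypothetical: 70% of spend is input tokens
        cached_fraction=0.8,      # hypothetical cache hit rate
        cache_discount=0.90,      # GPT-5 figure from this page
        batch_discount=0.50,      # Batch API figure from this page
        routed_share=0.5,         # hypothetical: half the work is simple
        routed_price_ratio=0.05,  # hypothetical relative price of the small model
    )
    print(f"Estimated monthly cost: ${cost:.2f}")
```

With these made-up inputs the estimate lands at roughly $130, consistent with the "under $200" figure, but the result is sensitive to the hit rate and to how much work can really be routed or batched.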