Cost Optimization: Caching, Batching, and Model Selection

All models, all tiers · 2 min read

Cost per million input tokens by optimization method:

90% — max savings with prompt caching (cache reads bill at 0.1x the base input rate)
50% — Batch API discount on all tokens
100K — max requests per batch

                       Haiku 4.5                 Sonnet 4.6                  Opus 4.6
Input / MTok           $1                        $3                          $5
Output / MTok          $5                        $15                         $25
Batch input / MTok     $0.50                     $1.50                       $2.50
Batch output / MTok    $2.50                     $7.50                       $12.50
Cache read / MTok      $0.10                     $0.30                       $0.50
Best for               Routing, classification   Most production workloads   Complex reasoning, agents
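To compare the tiers on a concrete workload, a back-of-envelope estimate helps. This is a sketch using the per-MTok prices from the table above; the traffic figures (200M input, 20M output tokens per month) and the `monthly_cost` helper are illustrative, not part of any SDK.

```python
# $/MTok prices from the pricing table above.
PRICES = {
    "Haiku 4.5":  {"input": 1.0, "output": 5.0},
    "Sonnet 4.6": {"input": 3.0, "output": 15.0},
    "Opus 4.6":   {"input": 5.0, "output": 25.0},
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Undiscounted monthly spend for a given volume (in millions of tokens)."""
    p = PRICES[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Hypothetical workload: 200M input tokens, 20M output tokens per month.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 200, 20):,.2f}")
```

At this volume the spread is wide (Haiku $300 vs. Opus $1,500 per month), which is why routing cheap traffic to a smaller model matters before any discounts are applied.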

Stack your discounts

Prompt caching and Batch API discounts stack. For a large evaluation run with cached system prompts, the cached portion of the input bills at the cache read rate (0.1x the base input price), and the Batch API's 50% discount then applies on top of everything. On an Opus 4.6 workload, this brings the effective input cost for cached tokens from $5/MTok down to $0.25/MTok in batch mode.
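The stacking arithmetic above can be sketched as a small helper. The function name and the `cached_fraction` parameter are my own for illustration; the 0.1x cache read rate and 50% batch discount come from the text.

```python
def effective_input_cost(base_per_mtok: float,
                         cached_fraction: float,
                         use_batch: bool = True) -> float:
    """Blended $/MTok for input tokens under stacked discounts.

    cached_fraction: share of input tokens served from the prompt cache,
    billed at 0.1x the base rate. The Batch API then halves the total.
    """
    cache_rate = 0.1 * base_per_mtok
    blended = cached_fraction * cache_rate + (1 - cached_fraction) * base_per_mtok
    return blended * (0.5 if use_batch else 1.0)

# Opus 4.6 at $5/MTok base: a fully cached prompt in batch mode
print(effective_input_cost(5.0, cached_fraction=1.0))  # 0.25
```

In practice `cached_fraction` is below 1.0 (the user turn is never cached), so the realized rate lands somewhere between the batch rate and the fully cached floor.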