LLM token cost calculator

Estimate inference spend across providers by input/output volume.

Cheapest option

$0.400

DeepInfra · Llama 3.1 8B, for 10M input + 2M output tokens

Saves $2.80 (88%) versus the most expensive of the 12 options compared ($3.20).
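The savings figure can be reproduced from the cheapest and most expensive totals. The following is a minimal sketch, not the calculator's actual code: the function name `savings` is hypothetical, and rounding the percentage half-up to a whole number is an assumption that happens to match the 88% shown here.

```python
from decimal import Decimal, ROUND_HALF_UP

def savings(cheapest, most_expensive):
    """Return (dollars saved, whole-number percent saved) versus the priciest option."""
    saved = most_expensive - cheapest
    percent = 100 * saved / most_expensive
    # Assumption: the percentage is rounded half-up to a whole number.
    return saved, percent.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Totals taken from the comparison on this page: $0.400 cheapest, $3.20 most expensive.
saved, percent = savings(Decimal("0.40"), Decimal("3.20"))
print(f"Saves ${saved} ({percent}%)")
```

Decimal arithmetic is used so that 87.5% rounds predictably instead of depending on binary floating-point error.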

All endpoints ranked by total cost

Provider	Model	In $/M	Out $/M	Total
DeepInfra (cheapest)	Llama 3.1 8B	$0.030	$0.050	$0.400
Groq	Llama 3.1 8B Instant	$0.050	$0.080	$0.660
Azure AI Foundry	Azure OpenAI GPT-4.1 nano	$0.100	$0.400	$1.80
OpenAI	GPT-4.1 nano	$0.100	$0.400	$1.80
OpenRouter	GPT-4.1 nano (routed)	$0.100	$0.400	$1.80
DeepInfra	Qwen 2.5 72B	$0.130	$0.400	$2.10
Together AI	Llama 3.1 8B Turbo	$0.180	$0.180	$2.16
Fireworks AI	Llama 3.1 8B	$0.200	$0.200	$2.40
DeepInfra	Llama 3.3 70B	$0.230	$0.400	$3.10
Fireworks AI	Mistral Small 3	$0.200	$0.600	$3.20
Mistral AI	Mistral Small 3	$0.200	$0.600	$3.20
OpenRouter	Mistral Small 3 (routed)	$0.200	$0.600	$3.20

Each total is (input tokens in millions × input $/M) + (output tokens in millions × output $/M) at list prices. Totals exclude batch discounts, caching, and free tiers. Prices are the latest on file for each endpoint.
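As an illustration of the formula, the sketch below recomputes and ranks totals for a few rows of the table at 10M input and 2M output tokens. The names `PRICES`, `total_cost`, and `rank` are hypothetical, and only four of the twelve endpoints are included to keep the example short.

```python
from decimal import Decimal

# (provider, model, input $/M, output $/M), copied from four rows of the table.
PRICES = [
    ("Mistral AI", "Mistral Small 3", "0.200", "0.600"),
    ("OpenAI", "GPT-4.1 nano", "0.100", "0.400"),
    ("DeepInfra", "Llama 3.1 8B", "0.030", "0.050"),
    ("Groq", "Llama 3.1 8B Instant", "0.050", "0.080"),
]

def total_cost(input_m, output_m, in_price, out_price):
    """List-price spend in dollars; token volumes are given in millions."""
    input_m, output_m = Decimal(str(input_m)), Decimal(str(output_m))
    return input_m * Decimal(in_price) + output_m * Decimal(out_price)

def rank(input_m, output_m, prices=PRICES):
    """Return (total, provider, model) tuples sorted from cheapest to most expensive."""
    rows = [
        (total_cost(input_m, output_m, in_price, out_price), provider, model)
        for provider, model, in_price, out_price in prices
    ]
    return sorted(rows)

for total, provider, model in rank(10, 2):
    print(f"{provider:<12} {model:<22} ${total:.2f}")
```

Changing the arguments to `rank` shows how the ordering shifts with the input/output mix; output-heavy workloads weigh the Out $/M column more heavily.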