LLM token cost calculator
Estimate inference spend across providers by input/output volume.
Cheapest option
$0.400
DeepInfra · Llama 3.1 8B - for 10M input + 2M output tokens
Save $2.80 (88%) versus the most expensive of 12 options compared.
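The savings figure follows from the cheapest and most expensive totals in the ranking. A minimal sketch of that arithmetic, assuming the $0.400 and $3.20 totals listed on this page; the `savings` helper is a hypothetical name and not part of the calculator itself.

```python
def savings(cheapest: float, most_expensive: float) -> tuple[float, float]:
    """Return (dollars saved, percent saved relative to the most expensive total)."""
    saved = most_expensive - cheapest
    return saved, 100 * saved / most_expensive

# Totals taken from the ranking: cheapest $0.400, most expensive $3.20.
saved, pct = savings(0.40, 3.20)
print(f"Save ${saved:.2f} ({pct:.0f}%)")
```

The percentage is measured against the most expensive option, so 2.80 / 3.20 = 87.5%, which is displayed rounded to 88%.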
All endpoints ranked by total cost
| Provider | Model | Input $/M | Output $/M | Total |
|---|---|---|---|---|
| DeepInfra (cheapest) | Llama 3.1 8B | $0.030 | $0.050 | $0.400 |
| Groq | Llama 3.1 8B Instant | $0.050 | $0.080 | $0.660 |
| Azure AI Foundry | Azure OpenAI GPT-4.1 nano | $0.100 | $0.400 | $1.80 |
| OpenAI | GPT-4.1 nano | $0.100 | $0.400 | $1.80 |
| OpenRouter | GPT-4.1 nano (routed) | $0.100 | $0.400 | $1.80 |
| DeepInfra | Qwen 2.5 72B | $0.130 | $0.400 | $2.10 |
| Together AI | Llama 3.1 8B Turbo | $0.180 | $0.180 | $2.16 |
| Fireworks AI | Llama 3.1 8B | $0.200 | $0.200 | $2.40 |
| DeepInfra | Llama 3.3 70B | $0.230 | $0.400 | $3.10 |
| Fireworks AI | Mistral Small 3 | $0.200 | $0.600 | $3.20 |
| Mistral AI | Mistral Small 3 | $0.200 | $0.600 | $3.20 |
| OpenRouter | Mistral Small 3 (routed) | $0.200 | $0.600 | $3.20 |
Totals are computed as (input tokens in millions × input $/M) + (output tokens in millions × output $/M) at list prices. They exclude batch discounts, caching, and free tiers. Prices are the latest we have on file for each endpoint.
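The total formula and the ranking can be sketched as follows. This is an illustrative sketch only, not the calculator's actual implementation: the `Endpoint` class and the `total_cost` and `rank` helpers are hypothetical names, and the three rows are copied from the table as a sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    provider: str
    model: str
    in_per_m: float   # list price, $ per 1M input tokens
    out_per_m: float  # list price, $ per 1M output tokens


def total_cost(ep: Endpoint, input_m: float, output_m: float) -> float:
    """(input M x input $/M) + (output M x output $/M), at list prices."""
    return input_m * ep.in_per_m + output_m * ep.out_per_m


def rank(endpoints: list[Endpoint], input_m: float, output_m: float) -> list[Endpoint]:
    """Sort endpoints from cheapest to most expensive for the given volume."""
    return sorted(endpoints, key=lambda ep: total_cost(ep, input_m, output_m))


# Sample rows copied from the table; deliberately out of order.
ENDPOINTS = [
    Endpoint("OpenAI", "GPT-4.1 nano", 0.100, 0.400),
    Endpoint("DeepInfra", "Llama 3.1 8B", 0.030, 0.050),
    Endpoint("Mistral AI", "Mistral Small 3", 0.200, 0.600),
]

# 10M input + 2M output tokens, the volume used on this page.
for ep in rank(ENDPOINTS, input_m=10, output_m=2):
    print(f"{ep.provider:<12} {ep.model:<16} ${total_cost(ep, 10, 2):.2f}")
```

Because output tokens are often priced higher than input tokens, the ranking can change with the input/output mix, so `rank` recomputes totals for whatever volume is passed in.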