Fireworks AI specializes in high-throughput open-model inference powered by its custom FireAttention kernel, delivering token generation speeds that routinely beat other hosting platforms. With HIPAA compliance and a broad catalog spanning Llama, DeepSeek, Qwen, and Mistral models, it is built for latency-sensitive production applications at scale.
Mistral Small 3 inference pricing
- Developer: Mistral AI
- Quality rank: #48
- Elo: 1240
- Context: 128K
- Weights: Open
- Lowest output: $0.600/M tokens
3 results
| Provider | Plan | Output ($/1M tokens) | Input ($/1M tokens) | Blended ($/1M tokens) | Context | Regions | Status |
|---|---|---|---|---|---|---|---|
| Fireworks AI | Mistral Small 3 | $0.600 | $0.200 | $0.300 | 128K | Global | Verified |
| Mistral AI | Mistral Small 3 | $0.600 | $0.200 | $0.300 | 128K | Global | Verified |
| OpenRouter | Mistral Small 3 (routed) | $0.600 | $0.200 | $0.300 | 128K | Global | Verified |
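The blended figure in the listing ($0.300) is consistent with a weighted average of the input and output prices at a 3:1 input-to-output token ratio. The listing does not state how the blend is computed, so the ratio below is an assumption inferred from the numbers, and `blended_price` is a hypothetical helper name. A minimal sketch:

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_parts: float = 3.0,   # assumed weighting, not stated in the listing
    output_parts: float = 1.0,  # assumed weighting, not stated in the listing
) -> float:
    """Return the weighted-average price per 1M tokens."""
    total_parts = input_parts + output_parts
    weighted_sum = input_price * input_parts + output_price * output_parts
    return weighted_sum / total_parts


# Prices per 1M tokens as listed for Mistral Small 3.
blended = blended_price(input_price=0.200, output_price=0.600)
print(f"${blended:.3f} per 1M tokens")
```

If a workload is more output-heavy than 3:1, passing different `input_parts` and `output_parts` values gives a blended price closer to the $0.600 output rate.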
Providers serving this model
Mistral AI is a European frontier lab providing first-party API access to its own Mistral Large 2 and Mistral Small 3 models, both available as open-weight releases. Known for strong multilingual performance, efficient architectures, and EU-based infrastructure with ISO 27001 compliance, it is a leading European alternative to US-based model providers.
OpenRouter acts as a unified gateway that routes API requests across dozens of inference providers (OpenAI, Anthropic, Google, Together, Groq, and more) through a single API key. It automatically selects the best available provider for each model, with transparent pricing and the ability to fall back to another provider if one endpoint goes down.
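The fallback behavior described above can be illustrated with a generic pattern: try providers in order and move on when one fails. This is only a sketch of the idea, not OpenRouter's actual routing logic or API. The function `complete_with_fallback`, the exception class, and the stand-in provider functions are all hypothetical.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the list has failed."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only; they make no network calls.
def flaky_provider(prompt: str) -> str:
    raise ConnectionError("endpoint down")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "hello",
    [("primary", flaky_provider), ("backup", healthy_provider)],
)
print(used, reply)
```

In this sketch the order of the list encodes provider preference. A production gateway would typically also weigh price, latency, and availability when choosing where to send each request.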