DeepSeek R1 API pricing comparison | DeployCue

DeepSeek R1 inference pricing

Developer: DeepSeek
Quality rank: #9
Elo: 1370
Context: 128K
Weights: Open
Output price per 1M tokens: lowest $0.990, median $2.19, highest $8.00

3 results

| Provider | Plan | Output ($/1M tokens) | Input ($/1M tokens) | Blended ($/1M tokens) | Context | Status | Regions |
|---|---|---|---|---|---|---|---|
| Groq | DeepSeek R1 Distill 70B | $0.990 | $0.750 | $0.810 | 128K | Verified | Global |
| OpenRouter | DeepSeek R1 (routed) | $2.19 | $0.550 | $0.960 | 128K | Verified | Global |
| Fireworks AI | DeepSeek R1 | $8.00 | $3.00 | $4.25 | 128K | Verified | Global |
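
Because input and output tokens are priced separately, the cheapest provider depends on the shape of your workload. The following is a minimal sketch, using the prices listed above, of how a per-request cost might be estimated. The `Offer` class and the `request_cost` and `cheapest` helpers are hypothetical names for illustration and are not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    input_per_m: float   # USD per 1M input tokens (from the table above)
    output_per_m: float  # USD per 1M output tokens (from the table above)


OFFERS = [
    Offer("Groq (R1 Distill 70B)", 0.750, 0.990),
    Offer("OpenRouter (routed)", 0.550, 2.19),
    Offer("Fireworks AI", 3.00, 8.00),
]


def request_cost(offer: Offer, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-million prices."""
    total = input_tokens * offer.input_per_m + output_tokens * offer.output_per_m
    return total / 1_000_000


def cheapest(offers, input_tokens: int, output_tokens: int) -> Offer:
    """Offer with the lowest estimated cost for this token mix."""
    return min(offers, key=lambda o: request_cost(o, input_tokens, output_tokens))


if __name__ == "__main__":
    # A short prompt with a long answer versus a long prompt with a short answer.
    for tokens_in, tokens_out in [(3_000, 1_000), (100_000, 1_000)]:
        best = cheapest(OFFERS, tokens_in, tokens_out)
        print(tokens_in, tokens_out, best.provider)
```

With these listed prices, an input-heavy request can come out cheaper on the entry with the lower input price even though its output price is higher, so it is worth estimating with your own token counts.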

Providers serving this model

Fireworks AI specializes in high-throughput open-model inference powered by its custom FireAttention kernel, delivering token generation speeds that routinely beat other hosting platforms. With HIPAA compliance and a broad catalog spanning Llama, DeepSeek, Qwen, and Mistral models, it is built for latency-sensitive production applications at scale.


Groq runs inference on custom LPU (Language Processing Unit) silicon rather than GPUs, delivering unmatched tokens-per-second throughput that can make even 70B models feel instant. With ultra-low pricing on Llama and DeepSeek models and a free tier for experimentation, it is the speed leader in the inference market.

OpenRouter acts as a unified gateway that routes API requests across dozens of inference providers (OpenAI, Anthropic, Google, Together, Groq, and more) through a single API key. It automatically selects the best available provider for each model, with transparent pricing and the ability to fall back to another endpoint if one goes down.
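
The fallback behaviour described above can be illustrated with a small sketch. This shows only the general pattern of trying endpoints in order and is not OpenRouter's implementation or API. The provider callables are hypothetical stand-ins for real client calls.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the list raised an error."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order and return the first success.

    Returns (provider_name, completion_text).
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves on
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("endpoint unavailable")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    used, text = complete_with_fallback(
        "hello", [("primary", flaky_provider), ("backup", healthy_provider)]
    )
    print(used, text)
```

A production router would also weigh price, latency, and rate limits when ordering the list. This sketch only covers the failover step.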

Frequently asked questions

How much does DeepSeek R1 cost per million tokens?
The lowest input price we track for DeepSeek R1 is $0.550 per million tokens. Output tokens cost more; the table shows input, output, and blended pricing for every inference provider.
What is the cheapest DeepSeek R1 API provider?
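Sort the table by output or blended price to find the cheapest DeepSeek R1 endpoint. Prices for the same model vary widely between providers: in the table above, output prices range from $0.990 to $8.00 per million tokens. Note that the lowest-priced entry (Groq) is listed as the R1 Distill 70B plan, while the others are listed as DeepSeek R1.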
Which providers serve the DeepSeek R1 API?
Every provider with a published DeepSeek R1 endpoint appears above, with input and output token pricing, context window, and region availability.
What is DeepSeek R1's context window?
DeepSeek R1 supports a 128K context window. A larger context window lets you pass more tokens (documents, code, history) in a single request.
Is DeepSeek R1 open weight or closed source?
DeepSeek R1 is an open-weight model, so you can self-host it on rented GPU capacity. At sustained high volume this can undercut managed API pricing, but the saving depends on keeping the hardware well utilized.
What is blended LLM cost?
Blended cost is a weighted average of the input and output token prices, assuming a typical mix of 3 input tokens for every 1 output token, so you can rank providers by one number instead of comparing two prices separately.
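
As a worked example, the 3:1 weighting can be written as (3 × input + 1 × output) / 4. The sketch below applies that formula to the prices listed on this page and reproduces the blended figures shown in the table. The function name `blended_price` is a hypothetical helper, and the weights are parameters in case your own traffic has a different input-to-output mix.

```python
def blended_price(
    input_per_m: float,
    output_per_m: float,
    input_weight: float = 3.0,
    output_weight: float = 1.0,
) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    The default 3:1 weights are the ratio stated on this page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


if __name__ == "__main__":
    listed = {
        "Groq": (0.750, 0.990),
        "OpenRouter": (0.550, 2.19),
        "Fireworks AI": (3.00, 8.00),
    }
    # Rank providers by blended price, lowest first.
    ranked = sorted(listed, key=lambda name: blended_price(*listed[name]))
    for name in ranked:
        print(f"{name}: ${blended_price(*listed[name]):.2f}")
```

If your workload is output-heavy, as reasoning models often are, passing a different weight pair gives a ranking that better reflects your actual spend.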