DeepSeek R1 inference pricing

Developer: DeepSeek
Quality rank: #9
Elo: 1370
Context: 128K
Weights: Open
Lowest output: $0.990 per 1M tokens
Median output: $2.19 per 1M tokens
Highest output: $8.00 per 1M tokens

3 results

Provider	Plan	Output $/1M	Input $/1M	Blended $/1M	Context	Verified	Regions
Groq	DeepSeek R1 Distill 70B	$0.990	$0.750	$0.810	128K	Jun 20, 2026	Global
OpenRouter	DeepSeek R1 (routed)	$2.19	$0.550	$0.960	128K	Jun 20, 2026	Global
Fireworks AI	DeepSeek R1	$8.00	$3.00	$4.25	128K	Jun 20, 2026	Global

Providers serving this model

Fireworks AI

Fireworks AI specializes in high-throughput open-model inference powered by its custom FireAttention kernel, delivering token generation speeds that routinely beat other hosting platforms. With HIPAA compliance and a broad catalog spanning Llama, DeepSeek, Qwen, and Mistral models, it is built for latency-sensitive production applications at scale.

Groq

Groq runs inference on custom LPU (Language Processing Unit) silicon rather than GPUs, delivering unmatched tokens-per-second throughput that can make even 70B models feel instant. With ultra-low pricing on Llama and DeepSeek models and a free tier for experimentation, it is the speed leader in the inference market.

OpenRouter

OpenRouter acts as a unified gateway that routes API requests across dozens of inference providers - OpenAI, Anthropic, Google, Together, Groq, and more - through a single API key. It automatically selects an available provider for each model, with transparent pricing and the ability to fall back to another endpoint if one goes down.

Frequently asked questions

How much does DeepSeek R1 cost per million tokens?

The lowest input price we track for DeepSeek R1 is $0.550 per million tokens. Output tokens cost more; the table shows input, output, and blended pricing for every inference provider.
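To turn per-million-token prices into a per-request figure, multiply each token count by its price and divide by one million. The sketch below is illustrative only: the function name `request_cost` and the token counts are assumptions, and the prices are the OpenRouter row from the table.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Return the USD cost of one request given prices per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_per_m
    output_cost = output_tokens / 1_000_000 * output_per_m
    return input_cost + output_cost


# Hypothetical request: 12,000 input tokens and 3,000 output tokens,
# priced at the OpenRouter row ($0.550 input, $2.19 output per 1M tokens).
cost = request_cost(12_000, 3_000, input_per_m=0.550, output_per_m=2.19)
print(f"${cost:.5f} per request")
```

Because output tokens are priced higher than input tokens on every row, a request with long responses costs more than its total token count alone suggests.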

What is the cheapest DeepSeek R1 API provider?

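Sort the table by output or blended price to find the cheapest DeepSeek R1 endpoint. Prices vary widely between providers: in the table above, output pricing ranges from $0.990 to $8.00 per million tokens, roughly an 8x spread. Check the Plan column before comparing, because the lowest-priced row (Groq) serves DeepSeek R1 Distill 70B, a distilled variant rather than the full DeepSeek R1 model.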
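The same sort can be done offline. This is a minimal sketch, assuming the three rows from the table are copied by hand; the `Offer` class and `rank` helper are hypothetical names, not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    plan: str
    output_per_m: float   # USD per 1M output tokens
    input_per_m: float    # USD per 1M input tokens
    blended_per_m: float  # USD per 1M tokens, blended


# Rows transcribed from the pricing table (verified Jun 20, 2026).
OFFERS = [
    Offer("Groq", "DeepSeek R1 Distill 70B", 0.990, 0.750, 0.810),
    Offer("OpenRouter", "DeepSeek R1 (routed)", 2.19, 0.550, 0.960),
    Offer("Fireworks AI", "DeepSeek R1", 8.00, 3.00, 4.25),
]


def rank(offers, key="blended_per_m"):
    """Return offers sorted from cheapest to most expensive on one price field."""
    return sorted(offers, key=lambda offer: getattr(offer, key))


cheapest_blended = rank(OFFERS)[0]
cheapest_input = rank(OFFERS, key="input_per_m")[0]

# Ratio between the highest and lowest output price in the table.
output_prices = [offer.output_per_m for offer in OFFERS]
spread = max(output_prices) / min(output_prices)

print(cheapest_blended.provider, cheapest_input.provider, f"{spread:.1f}x")
```

Ranking by a different field can change the winner: Groq is cheapest on output and blended price, while OpenRouter has the lowest input price.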

Which providers serve the DeepSeek R1 API?

Every provider we track with a published DeepSeek R1 endpoint appears in the table above, with input and output token pricing, context window, and regions.

What is DeepSeek R1's context window?

DeepSeek R1 supports a 128K context window. A larger context window lets you pass more tokens (documents, code, history) in a single request.

Is DeepSeek R1 open weight or closed source?

DeepSeek R1 is an open-weight model, so you can self-host it on GPU capacity you rent or own. Self-hosting can undercut managed API pricing at sustained high volume, but only when the hardware stays well utilized.
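Whether self-hosting comes out cheaper depends on hardware cost, throughput, and utilization. The sketch below shows the arithmetic only. Every input in the example call (the hourly cost, tokens per second, and utilization) is a placeholder chosen for illustration, not a measured or quoted figure, and `self_host_cost_per_m` is a hypothetical helper.

```python
def self_host_cost_per_m(gpu_hourly_usd: float,
                         tokens_per_second: float,
                         utilization: float) -> float:
    """Return USD per 1M generated tokens for a self-hosted deployment.

    utilization is the fraction of each hour spent serving traffic (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Placeholder inputs, NOT real quotes or benchmarks.
half_busy = self_host_cost_per_m(gpu_hourly_usd=20.0,
                                 tokens_per_second=1000,
                                 utilization=0.5)
fully_busy = self_host_cost_per_m(gpu_hourly_usd=20.0,
                                  tokens_per_second=1000,
                                  utilization=1.0)

print(f"50% utilized: ${half_busy:.2f} per 1M tokens")
print(f"100% utilized: ${fully_busy:.2f} per 1M tokens")
```

Halving utilization doubles the effective price per token, which is why self-hosting tends to pay off only with steady, high-volume traffic. Compare the result against the output prices in the table using your own measured numbers.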

What is blended LLM cost?

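Blended cost averages input and output token prices at a 3:1 input-to-output ratio, that is (3 × input price + 1 × output price) / 4, so you can rank providers by one number instead of comparing two prices separately. For example, Fireworks AI's $3.00 input and $8.00 output prices blend to $4.25.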
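The blended figures in the table can be reproduced with a weighted average. This is a small sketch assuming a 3:1 input-to-output weighting, which matches all three published blended prices; the function name `blended_price` is illustrative.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: float = 3, output_weight: float = 1) -> float:
    """Weighted average of input and output prices (USD per 1M tokens)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


# (provider, input $/1M, output $/1M) copied from the pricing table.
rows = [
    ("Groq", 0.750, 0.990),
    ("OpenRouter", 0.550, 2.19),
    ("Fireworks AI", 3.00, 8.00),
]

for provider, input_per_m, output_per_m in rows:
    print(f"{provider}: ${blended_price(input_per_m, output_per_m):.3f}")
```

If your workload is output-heavy, pass different weights (for example `input_weight=1, output_weight=1`) so the ranking reflects your own traffic mix rather than the default ratio.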