NVIDIA B200 cloud price comparison | DeployCue

NVIDIA B200 cloud pricing

Vendor: NVIDIA
VRAM: 192 GB
Architecture: Blackwell
FP16: 2250 TFLOPS
Launched: 2025
Lowest: $2.27 /GPU-hr
Median: $3.37 /GPU-hr
Highest: $4.99 /GPU-hr

8 results

Provider | Instance | Plan | GPUs | VRAM | vCPUs | RAM | Price | Status | Regions
Hyperstack | B200 (per GPU) | On-demand | 1 | 180 GB | 28 | 283 GB | $3.50 /GPU-hr | Verified | 1 country
Hyperstack | B200 (per GPU) | Reserved | 1 | 180 GB | 28 | 283 GB | $2.27 /GPU-hr | Verified | 1 country
Together AI | B200 cluster (per GPU) | On-demand | 1 | 180 GB | 28 | 283 GB | $3.99 /GPU-hr | Verified | 1 country
Together AI | B200 cluster (per GPU) | Reserved | 1 | 180 GB | 28 | 283 GB | $2.59 /GPU-hr | Verified | 1 country
CoreWeave | HGX B200 (per GPU) | On-demand | 1 | 180 GB | 28 | 384 GB | $4.25 /GPU-hr | Verified | 1 country
CoreWeave | HGX B200 (per GPU) | Reserved | 1 | 180 GB | 28 | 384 GB | $2.76 /GPU-hr | Verified | 1 country
Lambda | B200 1x | On-demand | 1 | 180 GB | 28 | 283 GB | $4.99 /GPU-hr | Verified | 1 country
Lambda | B200 1x | Reserved | 1 | 180 GB | 28 | 283 GB | $3.24 /GPU-hr | Verified | 1 country
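
The lowest, median, and highest figures near the top of the page can be reproduced from the eight listed prices. The following is a minimal Python sketch, not the site's own calculation: the `rows` list and the `summarize` helper are hypothetical names introduced here, and the prices are copied by hand from the listing.

```python
from statistics import median

# Per-GPU hourly prices (USD) copied by hand from the eight listed rows.
rows = [
    ("Hyperstack", "On-demand", 3.50),
    ("Hyperstack", "Reserved", 2.27),
    ("Together AI", "On-demand", 3.99),
    ("Together AI", "Reserved", 2.59),
    ("CoreWeave", "On-demand", 4.25),
    ("CoreWeave", "Reserved", 2.76),
    ("Lambda", "On-demand", 4.99),
    ("Lambda", "Reserved", 3.24),
]


def summarize(rows, plan=None):
    """Return lowest/median/highest price, optionally for a single plan."""
    prices = [price for _, row_plan, price in rows if plan is None or row_plan == plan]
    return {
        "lowest": min(prices),
        "median": round(median(prices), 2),
        "highest": max(prices),
    }


overall = summarize(rows)                 # all eight rows
on_demand = summarize(rows, "On-demand")  # on-demand rows only

print("all plans:", overall)
print("on-demand:", on_demand)
```

Filtering by plan matters here: the overall minimum comes from a reserved row, so the on-demand minimum is a different, higher number.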

Providers offering this GPU

Lambda

Lambda Labs is purpose-built for ML teams, with simple, transparent per-hour rates on H100, H200, B200, and GB200 instances and no hidden fees. Known for responsive support and direct hardware access, it suits training runs that need predictable pricing without cloud-platform complexity.

CoreWeave

CoreWeave is a specialized GPU cloud operator with massive fleets of HGX H100, H200, GB200 NVL72, and B200 systems interconnected with high-speed InfiniBand networking. Purpose-built for large-scale AI training and inference at enterprise-grade reliability, it has become a preferred alternative to hyperscalers for GPU-intensive workloads.

Together AI provides fast hosted inference for open-weight models including Llama 3.1 (8B through 405B), DeepSeek V3, Qwen 2.5, and Mistral, all at prices well below closed-model APIs. Its optimized serving infrastructure and free tier for experimentation make it a common choice for teams that prefer open models without self-hosting overhead.

Hyperstack is a next-generation GPU cloud platform offering H100, A100, B200, L40S, and RTX-class accelerators at aggressive on-demand and reserved rates. With data centers in London and Oslo, Terraform support, and fast API-driven provisioning, it targets teams that want hyperscaler-grade GPU availability without the lock-in.

Frequently asked questions

How much does an NVIDIA B200 cost per hour in the cloud?
The lowest NVIDIA B200 price we track is $2.27 per GPU-hour (Hyperstack, reserved); the lowest on-demand price is $3.50 per GPU-hour (Hyperstack). Spot and reserved rates are usually lower than on-demand; sort the table above by price to see the current rate from every provider.
What is the cheapest NVIDIA B200 cloud provider?
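Sort the table by price (low to high) to see the cheapest NVIDIA B200 provider right now. Among the rows listed above, Hyperstack is cheapest for both on-demand ($3.50 per GPU-hour) and reserved ($2.27 per GPU-hour) capacity. Marketplace and spot providers often undercut hyperscalers for the same NVIDIA B200.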
Which cloud providers offer NVIDIA B200 GPUs?
Every provider with published NVIDIA B200 availability is listed above: currently Hyperstack, Together AI, CoreWeave, and Lambda. Each row shows per-hour pricing, the number of GPUs per instance, region coverage, and the pricing mode (on-demand, reserved, or spot where offered).
Is spot NVIDIA B200 cheaper than on-demand?
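Yes. Spot (preemptible) capacity is typically 40-70% cheaper than on-demand but can be reclaimed at short notice. No spot rows are currently listed for the NVIDIA B200, so only on-demand and reserved rates can be compared here. Use the pricing-mode filter to compare on-demand, spot, and reserved rows side by side.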
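
The same percentage comparison can be run on the rows that are listed. The sketch below applies it to each provider's reserved and on-demand prices; the `prices` mapping and the `discount_pct` helper are hypothetical names, the numbers are copied by hand from the table, and a spot rate could be substituted for the reserved rate where a provider publishes one.

```python
# (on-demand, reserved) prices in USD per GPU-hour, copied from the table.
prices = {
    "Hyperstack": (3.50, 2.27),
    "Together AI": (3.99, 2.59),
    "CoreWeave": (4.25, 2.76),
    "Lambda": (4.99, 3.24),
}


def discount_pct(on_demand, other):
    """Percentage saved by paying `other` instead of `on_demand`."""
    return (on_demand - other) / on_demand * 100


discounts = {
    provider: discount_pct(on_demand, reserved)
    for provider, (on_demand, reserved) in prices.items()
}

for provider, pct in discounts.items():
    print(f"{provider}: reserved is {pct:.1f}% below on-demand")
```

For the listed rows, every provider's reserved rate comes out roughly 35% below its on-demand rate.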
How much VRAM does the NVIDIA B200 have?
The NVIDIA B200 ships with 192 GB of VRAM (HBM3e); the instances listed above report 180 GB per GPU. Larger VRAM lets you fit bigger models and batch sizes without sharding.
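
A rough way to judge whether a model fits on one GPU without sharding is to compare the size of its weights with the available VRAM. The sketch below is a back-of-the-envelope estimate only: `weights_gb` and `fits_on_one_gpu` are hypothetical helpers, and the 20% headroom reserved for KV cache, activations, and runtime overhead is an assumption, not a measured figure.

```python
def weights_gb(params_billion, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 parameters * bytes_per_param / 1e9 bytes per GB
    return params_billion * bytes_per_param


def fits_on_one_gpu(params_billion, bytes_per_param, vram_gb=192.0, headroom=0.20):
    """True if the weights fit after reserving `headroom` of VRAM.

    The 20% default headroom is an assumed allowance for KV cache,
    activations, and runtime overhead; real needs vary by workload.
    """
    usable_gb = vram_gb * (1.0 - headroom)
    return weights_gb(params_billion, bytes_per_param) <= usable_gb


# 2 bytes per parameter corresponds to FP16/BF16 weights, 1 byte to FP8/INT8.
for name, params_b, nbytes in [("70B @ FP16", 70, 2), ("405B @ FP16", 405, 2), ("405B @ FP8", 405, 1)]:
    verdict = "fits" if fits_on_one_gpu(params_b, nbytes) else "needs sharding"
    print(f"{name}: {weights_gb(params_b, nbytes):.0f} GB of weights, {verdict}")
```

Under these assumptions a 70B-parameter model in FP16 (140 GB of weights) fits on a single B200, while a 405B-parameter model does not, even at one byte per parameter.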
Is the NVIDIA B200 good for AI training and inference?
The NVIDIA B200 is used for both LLM training and inference. Match its VRAM and throughput (shown above) to your model size, and use spot capacity for fault-tolerant training to cut costs.
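
To see how pricing mode affects the cost of a run, multiply GPU count, hours, and the per-GPU rate, then add an allowance for work repeated after spot preemptions. The following is an illustrative sketch only: the job size (8 GPUs for 72 hours), the 40% spot discount (the low end of the range quoted above), and the 15% rework fraction are all assumptions, and `run_cost` is a hypothetical helper.

```python
def run_cost(gpus, hours, price_per_gpu_hour, rework_fraction=0.0):
    """Total cost in USD; `rework_fraction` is extra time redone after preemptions."""
    return gpus * hours * price_per_gpu_hour * (1.0 + rework_fraction)


GPUS, HOURS = 8, 72            # assumed job size, for illustration only
ON_DEMAND_RATE = 3.50          # lowest listed on-demand rate, USD per GPU-hour
RESERVED_RATE = 2.27           # lowest listed reserved rate, USD per GPU-hour
SPOT_RATE = ON_DEMAND_RATE * (1 - 0.40)  # assumed: 40% below on-demand

on_demand_cost = run_cost(GPUS, HOURS, ON_DEMAND_RATE)
reserved_cost = run_cost(GPUS, HOURS, RESERVED_RATE)
# Assumed: 15% of the run is repeated because of preemptions.
spot_cost = run_cost(GPUS, HOURS, SPOT_RATE, rework_fraction=0.15)

print(f"on-demand: ${on_demand_cost:,.2f}")
print(f"reserved:  ${reserved_cost:,.2f}")
print(f"spot:      ${spot_cost:,.2f}")
```

With a discount d and a rework fraction r, spot stays cheaper than on-demand as long as (1 - d)(1 + r) < 1, which is why frequent checkpointing matters for preemptible training.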