NVIDIA H100 cloud pricing

Vendor: NVIDIA
VRAM: 80 GB
Architecture: Hopper
FP16: 989 TFLOPS
Launched: 2022

Lowest: $1.19 /GPU-hr
Median: $2.00 /GPU-hr
Highest: $3.50 /GPU-hr

27 results

Provider	Plan	Pricing mode	GPUs	VRAM	vCPUs	RAM	Price (USD per GPU-hr)	Verified	Regions
Hyperstack	H100 SXM (per GPU)	On-demand	1	80 GB	28	180 GB	$1.95	Jun 20, 2026	2 countries
Hyperstack	H100 SXM (per GPU)	Reserved	1	80 GB	28	180 GB	$1.27	Jun 20, 2026	2 countries
Vast.ai	H100 SXM (marketplace)	On-demand	1	80 GB	16	128 GB	$1.99	Jun 20, 2026	3 countries
Vast.ai	H100 SXM (marketplace)	Spot	1	80 GB	16	128 GB	$1.19	Jun 20, 2026	3 countries
Vast.ai	H100 SXM (marketplace)	Reserved	1	80 GB	16	128 GB	$1.29	Jun 20, 2026	3 countries
Nebius	H100 SXM (per GPU)	On-demand	1	80 GB	20	200 GB	$2.00	Jun 20, 2026	2 countries
Nebius	H100 SXM (per GPU)	Reserved	1	80 GB	20	200 GB	$1.30	Jun 20, 2026	2 countries
CoreWeave	HGX H100 (per GPU)	On-demand	1	80 GB	22	256 GB	$2.23	Jun 20, 2026	2 countries
CoreWeave	HGX H100 (per GPU)	Reserved	1	80 GB	22	256 GB	$1.45	Jun 20, 2026	2 countries
Paperspace	H100 machine	On-demand	1	80 GB	20	250 GB	$2.24	Jun 20, 2026	2 countries
Paperspace	H100 machine	Reserved	1	80 GB	20	250 GB	$1.46	Jun 20, 2026	2 countries
RunPod	H100 PCIe (Secure Cloud)	On-demand	1	80 GB	16	188 GB	$2.39	Jun 20, 2026	3 countries
Together AI	H100 SXM cluster (per GPU)	On-demand	1	80 GB	20	200 GB	$2.39	Jun 20, 2026	1 country
RunPod	H100 PCIe (Secure Cloud)	Reserved	1	80 GB	16	188 GB	$1.55	Jun 20, 2026	3 countries
Together AI	H100 SXM cluster (per GPU)	Reserved	1	80 GB	20	200 GB	$1.55	Jun 20, 2026	1 country
Crusoe	H100 SXM (per GPU)	On-demand	1	80 GB	24	240 GB	$2.45	Jun 20, 2026	1 country
Crusoe	H100 SXM (per GPU)	Reserved	1	80 GB	24	240 GB	$1.59	Jun 20, 2026	1 country
Oracle Cloud Infrastructure	BM.GPU.H100.8	On-demand	8	640 GB	112	2,048 GB	$2.90	Jun 20, 2026	2 countries
Oracle Cloud Infrastructure	BM.GPU.H100.8	Reserved	8	640 GB	112	2,048 GB	$1.89	Jun 20, 2026	2 countries
Lambda	H100 SXM 1x	On-demand	1	80 GB	26	225 GB	$2.99	Jun 20, 2026	1 country
Lambda	H100 SXM 1x	Reserved	1	80 GB	26	225 GB	$1.94	Jun 20, 2026	1 country
Google Cloud	A3 (8x H100)	On-demand	8	640 GB	208	1,872 GB	$3.25	Jun 20, 2026	2 countries
Google Cloud	A3 (8x H100)	Reserved	8	640 GB	208	1,872 GB	$2.11	Jun 20, 2026	2 countries
Amazon Web Services	P5 (8x H100)	On-demand	8	640 GB	192	2,048 GB	$3.40	Jun 20, 2026	1 country
Amazon Web Services	P5 (8x H100)	Reserved	8	640 GB	192	2,048 GB	$2.21	Jun 20, 2026	1 country
Microsoft Azure	ND H100 v5 (8x H100)	On-demand	8	640 GB	96	1,900 GB	$3.50	Jun 20, 2026	2 countries
Microsoft Azure	ND H100 v5 (8x H100)	Reserved	8	640 GB	96	1,900 GB	$2.27	Jun 20, 2026	2 countries
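
The Lowest, Median, and Highest figures at the top of the page can be reproduced from the table. The sketch below copies the 27 per-GPU hourly prices by hand, in row order, and summarizes them; the `summarize` helper is a hypothetical name used only for illustration, not part of any tool behind this page.

```python
from statistics import median

# Per-GPU hourly prices in USD, copied from the 27 table rows in order.
prices = [
    1.95, 1.27, 1.99, 1.19, 1.29, 2.00, 1.30, 2.23, 1.45,
    2.24, 1.46, 2.39, 2.39, 1.55, 1.55, 2.45, 1.59, 2.90,
    1.89, 2.99, 1.94, 3.25, 2.11, 3.40, 2.21, 3.50, 2.27,
]

def summarize(values):
    """Return the row count plus lowest, median, and highest price."""
    return {
        "count": len(values),
        "lowest": min(values),
        "median": median(values),
        "highest": max(values),
    }

summary = summarize(prices)
print(summary)
```

With an odd number of rows the median is simply the middle value of the sorted list, which here is the Nebius on-demand rate.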

Providers offering this GPU

Amazon Web Services

Amazon Web Services is the world's largest cloud provider with 200+ services across compute, storage, databases, ML, and networking. It dominates the enterprise market with the broadest global region footprint and the deepest service catalog, but pricing complexity and egress fees add up at scale.

Google Cloud

Google Cloud Platform combines world-class data analytics, AI infrastructure (TPUs, Vertex AI), and the original managed Kubernetes. Its global fiber backbone and Preemptible VMs offer compelling price-performance for data-heavy and containerized workloads.

Lambda

Lambda Labs is purpose-built for ML teams - simple, transparent per-hour rates on H100, H200, B200, and GB200 instances with zero hidden fees. Known for responsive support and direct hardware access, it is a top choice for training runs that need predictable pricing without cloud-platform complexity.

CoreWeave

CoreWeave is a specialized GPU cloud operator with massive fleets of HGX H100, H200, GB200 NVL72, and B200 systems interconnected with high-speed InfiniBand networking. Purpose-built for large-scale AI training and inference at enterprise-grade reliability, it has become a preferred alternative to hyperscalers for GPU-intensive workloads.

Microsoft Azure

Microsoft Azure is the enterprise cloud tightly woven into the Microsoft ecosystem - Active Directory, Windows Server, Visual Studio, and Microsoft 365. Deep AI partnerships with OpenAI and a massive compliance portfolio make it the default choice for Fortune 500 hybrid deployments.

Modal

Modal is a serverless compute platform purpose-built for AI workloads, offering sub-second cold starts, per-second GPU billing, and a Python-native developer experience. Scale-to-zero semantics on H100, A100, and L40S accelerators eliminate idle costs entirely, making it exceptionally cost-efficient for bursty inference, fine-tuning jobs, and scheduled pipelines.

Baseten

Baseten is a production model-serving platform with built-in autoscaling, per-minute GPU billing, and SOC 2/HIPAA compliance. Designed for teams deploying LLMs and diffusion models at scale, it handles cold starts, traffic spikes, and infrastructure tuning so engineers can focus on model quality rather than platform reliability.

RunPod

RunPod operates a dual-tier marketplace: community GPUs at ultra-low spot prices and a SOC 2-compliant Secure Cloud for production inference. Per-second billing, instant provisioning, and a broad catalog spanning H100, A100, RTX, and even MI300X accelerators make it flexible for projects of any scale.

Nebius

Nebius is a European AI cloud provider spun out of Yandex with large clusters of H100 and H200 GPUs in EU data centers. Competitive on-demand pricing, ISO 27001 compliance, and EU data residency make it a compelling choice for European AI startups and enterprises that need sovereignty over their training infrastructure.

Replicate

Replicate makes it trivially easy to run thousands of open-source AI models via a simple API, billing per second of GPU time with no cold-start penalties. It abstracts away all infrastructure concerns so developers can integrate image generation, video models, speech synthesis, and LLMs with a single line of code.

Together AI

Together AI provides blazing-fast hosted inference for open-weight models including Llama 3.1 (8B through 405B), DeepSeek V3, Qwen 2.5, and Mistral - all at prices far below closed-model APIs. Its optimized serving infrastructure and free tier for experimentation make it the go-to platform for teams that prefer open models without self-hosting overhead.

Crusoe

Crusoe is a climate-aligned GPU cloud that runs H100, H200, and MI300X workloads on stranded or flare-captured energy, drastically reducing the carbon footprint of AI compute. For teams that care about sustainability without sacrificing performance, it offers enterprise-grade infrastructure with genuine environmental accountability.

Frequently asked questions

How much does an NVIDIA H100 cost per hour in the cloud?

The lowest NVIDIA H100 price we track is $1.19 per GPU-hour, a spot rate on Vast.ai. The lowest on-demand price is $1.95 per GPU-hour, from Hyperstack. Reserved rates are usually lower than on-demand; sort the table above by price to see the current rate from every provider.

What is the cheapest NVIDIA H100 cloud provider?

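Sort the table by price (low to high) to see the cheapest NVIDIA H100 provider right now. Marketplace and spot providers often undercut hyperscalers for the same NVIDIA H100: in the table above, Vast.ai spot capacity is $1.19 per GPU-hour, while on-demand rates at Google Cloud, Amazon Web Services, and Microsoft Azure run from $3.25 to $3.50.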

Which cloud providers offer NVIDIA H100 GPUs?

Every provider with published NVIDIA H100 availability is listed above, with per-hour pricing, the number of GPUs per instance, region coverage, and on-demand, spot, and reserved rates.
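
Because prices are quoted per GPU-hour, the bill for a multi-GPU instance is the listed rate multiplied by the GPU count and the hours used. The sketch below shows that arithmetic under simplifying assumptions: billing is strictly linear, the 8-GPU instances are rented whole, and storage, egress, and minimum-commitment charges are ignored. `instance_cost` is a hypothetical helper, not a provider API.

```python
def instance_cost(price_per_gpu_hr, gpus, hours):
    """Total cost in USD for renting `gpus` GPUs for `hours` hours.

    Assumes simple linear billing with no extra fees.
    """
    if price_per_gpu_hr < 0 or hours < 0 or gpus < 1:
        raise ValueError("price and hours must be >= 0 and gpus >= 1")
    return price_per_gpu_hr * gpus * hours

# Oracle Cloud Infrastructure BM.GPU.H100.8, on-demand: $2.90 per GPU-hr, 8 GPUs.
oci_hourly = instance_cost(2.90, gpus=8, hours=1)
oci_daily = instance_cost(2.90, gpus=8, hours=24)

# Hyperstack H100 SXM, on-demand: $1.95 per GPU-hr, 1 GPU.
hyperstack_daily = instance_cost(1.95, gpus=1, hours=24)

print(f"OCI 8x H100: ${oci_hourly:.2f}/hr, ${oci_daily:.2f}/day")
print(f"Hyperstack 1x H100: ${hyperstack_daily:.2f}/day")
```

A low per-GPU rate on an 8-GPU instance can still mean a larger hourly bill than a single-GPU plan with a higher rate, so compare totals for the GPU count you actually need.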

Is spot NVIDIA H100 cheaper than on-demand?

Yes. Spot (preemptible) capacity is typically 40-70% cheaper than on-demand but can be reclaimed at short notice. Use the pricing-mode filter to compare on-demand, spot, and reserved rows side by side.
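
As a worked example, the discount relative to on-demand can be computed directly from two rows of the table. This sketch uses the Vast.ai rates listed above; `discount_pct` is a hypothetical helper name.

```python
def discount_pct(on_demand, discounted):
    """Percent saved relative to the on-demand rate."""
    if on_demand <= 0:
        raise ValueError("on_demand must be positive")
    return (on_demand - discounted) / on_demand * 100

# Vast.ai H100 SXM (marketplace), prices in USD per GPU-hr from the table.
vast_spot = discount_pct(1.99, 1.19)
vast_reserved = discount_pct(1.99, 1.29)

print(f"Spot vs on-demand: {vast_spot:.1f}% cheaper")
print(f"Reserved vs on-demand: {vast_reserved:.1f}% cheaper")
```

For these rows the spot discount comes out at roughly 40%, the low end of the typical range quoted above.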

How much VRAM does the NVIDIA H100 have?

The NVIDIA H100 ships with 80 GB of VRAM. Larger VRAM lets you fit bigger models and batch sizes without sharding.
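
A rough way to judge whether a model fits is to estimate the memory its weights need: parameter count times bytes per parameter. The sketch below is a rule of thumb only. It counts weights alone and ignores activations, KV cache, and optimizer state, so real requirements are higher; the function names and the 2-bytes-per-parameter default (FP16/BF16) are assumptions for illustration.

```python
import math

H100_VRAM_GB = 80  # from the spec summary on this page

def weight_memory_gb(params_billions, bytes_per_param=2):
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    1e9 parameters at N bytes each occupy N GB, so the result is
    simply params_billions * bytes_per_param.
    """
    return params_billions * bytes_per_param

def min_gpus_for_weights(params_billions, bytes_per_param=2, vram_gb=H100_VRAM_GB):
    """Smallest GPU count whose combined VRAM holds the weights alone."""
    return math.ceil(weight_memory_gb(params_billions, bytes_per_param) / vram_gb)

for size in (7, 70):
    print(f"{size}B params at 16-bit: {weight_memory_gb(size)} GB of weights, "
          f"at least {min_gpus_for_weights(size)} GPU(s)")
```

Under these assumptions a 7B-parameter model at 16-bit precision fits on one H100 with room to spare, while a 70B-parameter model needs its weights sharded across at least two.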

Is the NVIDIA H100 good for AI training and inference?

The NVIDIA H100 is used for both LLM training and inference. Match its VRAM and throughput (shown above) to your model size, and use spot capacity for fault-tolerant training to cut costs.
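
Spot savings shrink when preempted jobs have to redo work or restore from a checkpoint. The sketch below models that as a single overhead fraction, meaning extra paid hours per useful hour. This is a simplifying assumption, and the 15% figure is a hypothetical value, not a measurement; both helper names are illustrative.

```python
def effective_rate(spot_price, overhead):
    """Cost per useful GPU-hour when `overhead` extra hours are paid
    for every useful hour (e.g. 0.15 means 15% of time is redone work)."""
    if overhead < 0:
        raise ValueError("overhead must be non-negative")
    return spot_price * (1 + overhead)

def break_even_overhead(on_demand_price, spot_price):
    """Overhead fraction at which spot costs the same as on-demand."""
    return on_demand_price / spot_price - 1

# Vast.ai rates from the table: $1.99 on-demand, $1.19 spot (USD per GPU-hr).
rate_with_overhead = effective_rate(1.19, overhead=0.15)  # hypothetical 15% overhead
limit = break_even_overhead(1.99, 1.19)

print(f"Effective spot rate at 15% overhead: ${rate_with_overhead:.2f}/GPU-hr")
print(f"Spot stays cheaper until overhead reaches {limit:.0%}")
```

Under this model, spot remains the cheaper option as long as preemptions waste less than about two thirds of the useful compute time.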