Amazon Web Services is the world's largest cloud provider with 200+ services across compute, storage, databases, ML, and networking. Dominates in enterprise with the broadest global region footprint and the deepest service catalog, but pricing complexity and egress fees add up at scale.
NVIDIA H100 cloud pricing
- Vendor
- NVIDIA
- VRAM
- 80 GB
- Architecture
- Hopper
- FP16
- 989 TFLOPS
- Launched
- 2022
27 results
| Provider | Plan | Price | Regions | Visit | ||||
|---|---|---|---|---|---|---|---|---|
|
|
H100 SXM (per GPU) On-demand | 1 | 80 GB | 28 | 180 GB |
$1.95
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 SXM (per GPU) Reserved | 1 | 80 GB | 28 | 180 GB |
$1.27
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 SXM (marketplace) On-demand | 1 | 80 GB | 16 | 128 GB |
$1.99
/GPU-hr
Verified
|
3 countries | Visit → |
|
|
H100 SXM (marketplace) Spot | 1 | 80 GB | 16 | 128 GB |
$1.19
/GPU-hr
Verified
|
3 countries | Visit → |
|
|
H100 SXM (marketplace) Reserved | 1 | 80 GB | 16 | 128 GB |
$1.29
/GPU-hr
Verified
|
3 countries | Visit → |
|
|
H100 SXM (per GPU) On-demand | 1 | 80 GB | 20 | 200 GB |
$2.00
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 SXM (per GPU) Reserved | 1 | 80 GB | 20 | 200 GB |
$1.30
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
HGX H100 (per GPU) On-demand | 1 | 80 GB | 22 | 256 GB |
$2.23
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
HGX H100 (per GPU) Reserved | 1 | 80 GB | 22 | 256 GB |
$1.45
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 machine On-demand | 1 | 80 GB | 20 | 250 GB |
$2.24
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 machine Reserved | 1 | 80 GB | 20 | 250 GB |
$1.46
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 PCIe (Secure Cloud) On-demand | 1 | 80 GB | 16 | 188 GB |
$2.39
/GPU-hr
Verified
|
3 countries | Visit → |
|
|
H100 SXM cluster (per GPU) On-demand | 1 | 80 GB | 20 | 200 GB |
$2.39
/GPU-hr
Verified
|
1 country | Visit → |
|
|
H100 PCIe (Secure Cloud) Reserved | 1 | 80 GB | 16 | 188 GB |
$1.55
/GPU-hr
Verified
|
3 countries | Visit → |
|
|
H100 SXM cluster (per GPU) Reserved | 1 | 80 GB | 20 | 200 GB |
$1.55
/GPU-hr
Verified
|
1 country | Visit → |
|
|
H100 SXM (per GPU) On-demand | 1 | 80 GB | 24 | 240 GB |
$2.45
/GPU-hr
Verified
|
1 country | Visit → |
|
|
H100 SXM (per GPU) Reserved | 1 | 80 GB | 24 | 240 GB |
$1.59
/GPU-hr
Verified
|
1 country | Visit → |
|
|
BM.GPU.H100.8 On-demand | 8 | 640 GB | 112 | 2,048 GB |
$2.90
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
BM.GPU.H100.8 Reserved | 8 | 640 GB | 112 | 2,048 GB |
$1.89
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
H100 SXM 1x On-demand | 1 | 80 GB | 26 | 225 GB |
$2.99
/GPU-hr
Verified
|
1 country | Visit → |
|
|
H100 SXM 1x Reserved | 1 | 80 GB | 26 | 225 GB |
$1.94
/GPU-hr
Verified
|
1 country | Visit → |
|
|
A3 (8x H100) On-demand | 8 | 640 GB | 208 | 1,872 GB |
$3.25
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
A3 (8x H100) Reserved | 8 | 640 GB | 208 | 1,872 GB |
$2.11
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
P5 (8x H100) On-demand | 8 | 640 GB | 192 | 2,048 GB |
$3.40
/GPU-hr
Verified
|
1 country | Visit → |
|
|
P5 (8x H100) Reserved | 8 | 640 GB | 192 | 2,048 GB |
$2.21
/GPU-hr
Verified
|
1 country | Visit → |
|
|
ND H100 v5 (8x H100) On-demand | 8 | 640 GB | 96 | 1,900 GB |
$3.50
/GPU-hr
Verified
|
2 countries | Visit → |
|
|
ND H100 v5 (8x H100) Reserved | 8 | 640 GB | 96 | 1,900 GB |
$2.27
/GPU-hr
Verified
|
2 countries | Visit → |
Providers offering this GPU
Google Cloud Platform combines world-class data analytics, AI infrastructure (TPUs, Vertex AI), and the original managed Kubernetes. Its global fiber backbone and Preemptible VMs offer compelling price-performance for data-heavy and containerized workloads.
Lambda Labs is purpose-built for ML teams - simple, transparent per-hour rates on H100, H200, B200, and GB200 instances with zero hidden fees. Known for responsive support and direct hardware access, it is a top choice for training runs that need predictable pricing without cloud-platform complexity.
CoreWeave is a specialized GPU cloud operator with massive fleets of HGX H100, H200, GB200 NVL72, and B200 systems interconnected with high-speed InfiniBand networking. Purpose-built for large-scale AI training and inference at enterprise-grade reliability, it has become a preferred alternative to hyperscalers for GPU-intensive workloads.
Microsoft Azure is the enterprise cloud tightly woven into the Microsoft ecosystem - Active Directory, Windows Server, Visual Studio, and Microsoft 365. Deep AI partnerships with OpenAI and a massive compliance portfolio make it the default choice for Fortune 500 hybrid deployments.
Modal is a serverless compute platform purpose-built for AI workloads, offering sub-second cold starts, per-second GPU billing, and a Python-native developer experience. Scale-to-zero semantics on H100, A100, and L40S accelerators eliminate idle costs entirely, making it exceptionally cost-efficient for bursty inference, fine-tuning jobs, and scheduled pipelines.
Baseten is a production model-serving platform with built-in autoscaling, per-minute GPU billing, and SOC 2/HIPAA compliance. Designed for teams deploying LLMs and diffusion models at scale, it handles cold starts, traffic spikes, and infrastructure tuning so engineers can focus on model quality rather than platform reliability.
RunPod operates a dual-tier marketplace: community GPUs at ultra-low spot prices and a SOC 2-compliant Secure Cloud for production inference. Per-second billing, instant provisioning, and a broad catalog spanning H100, A100, RTX, and even MI300X accelerators make it flexible for projects of any scale.
Nebius is a European AI cloud provider spun out of Yandex with large clusters of H100 and H200 GPUs in EU data centers. Competitive on-demand pricing, ISO 27001 compliance, and EU data residency make it a compelling choice for European AI startups and enterprises that need sovereignty over their training infrastructure.
Replicate makes it trivially easy to run thousands of open-source AI models via a simple API, billing per second of GPU time with no cold-start penalties. It abstracts away all infrastructure concerns so developers can integrate image generation, video models, speech synthesis, and LLMs with a single line of code.
Together AI provides blazing-fast hosted inference for open-weight models including Llama 3.1 (8B through 405B), DeepSeek V3, Qwen 2.5, and Mistral - all at prices far below closed-model APIs. Its optimized serving infrastructure and free tier for experimentation make it the go-to platform for teams that prefer open models without self-hosting overhead.
Crusoe is a climate-aligned GPU cloud that runs H100, H200, and MI300X workloads on stranded or flare-captured energy, drastically reducing the carbon footprint of AI compute. For teams that care about sustainability without sacrificing performance, it offers enterprise-grade infrastructure with genuine environmental accountability.