Amazon Web Services is the world's largest cloud provider, with 200+ services across compute, storage, databases, ML, and networking. It dominates the enterprise market with the broadest global region footprint and the deepest service catalog, but pricing complexity and egress fees add up at scale.
NVIDIA A100 80GB cloud pricing
- Vendor: NVIDIA
- VRAM: 80 GB
- Architecture: Ampere
- FP16: 312 TFLOPS
- Launched: 2021
17 results
| Plan | Billing | GPUs | VRAM | vCPUs | RAM | Price (per GPU-hr) | Status | Regions |
|---|---|---|---|---|---|---|---|---|
| A100 80GB machine | On-demand | 1 | 80 GB | 12 | 90 GB | $1.15 | Verified | 2 countries |
| A100 80GB machine | Reserved | 1 | 80 GB | 12 | 90 GB | $0.748 | Verified | 2 countries |
| A100 80GB (per GPU) | On-demand | 1 | 80 GB | 28 | 120 GB | $1.35 | Verified | 1 country |
| A100 80GB (marketplace) | On-demand | 1 | 80 GB | 12 | 96 GB | $1.35 | Verified | 3 countries |
| A100 80GB (marketplace) | Spot | 1 | 80 GB | 12 | 96 GB | $0.810 | Verified | 3 countries |
| A100 80GB (per GPU) | Reserved | 1 | 80 GB | 28 | 120 GB | $0.877 | Verified | 1 country |
| A100 80GB (marketplace) | Reserved | 1 | 80 GB | 12 | 96 GB | $0.877 | Verified | 3 countries |
| A100 80GB (Secure Cloud) | On-demand | 1 | 80 GB | 16 | 125 GB | $1.89 | Verified | 3 countries |
| A100 80GB (Secure Cloud) | Spot | 1 | 80 GB | 16 | 125 GB | $1.04 | Verified | 3 countries |
| A100 80GB (Secure Cloud) | Reserved | 1 | 80 GB | 16 | 125 GB | $1.23 | Verified | 3 countries |
| A100 80GB cluster (per GPU) | On-demand | 1 | 80 GB | 16 | 128 GB | $2.00 | Verified | 1 country |
| A100 80GB cluster (per GPU) | Reserved | 1 | 80 GB | 16 | 128 GB | $1.30 | Verified | 1 country |
| BM.GPU.A100.80 | On-demand | 8 | 640 GB | 128 | 2,048 GB | $2.40 | Verified | 1 country |
| BM.GPU.A100.80 | Reserved | 8 | 640 GB | 128 | 2,048 GB | $1.56 | Verified | 1 country |
| P4de (8x A100 80GB) | On-demand | 8 | 640 GB | 96 | 1,152 GB | $2.60 | Verified | 1 country |
| P4de (8x A100 80GB) | Spot | 8 | 640 GB | 96 | 1,152 GB | $1.17 | Verified | 1 country |
| P4de (8x A100 80GB) | Reserved | 8 | 640 GB | 96 | 1,152 GB | $1.69 | Verified | 1 country |
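The per-GPU-hour prices above are easier to compare once converted to a whole-node cost and a discount relative to on-demand. The following is a minimal sketch using the three P4de prices from the listing; the function names and the 730-hour month are assumptions made for illustration and are not part of the listing itself.

```python
# Prices per GPU-hour copied from the P4de (8x A100 80GB) rows in the listing.
P4DE_PRICES = {"on_demand": 2.60, "spot": 1.17, "reserved": 1.69}
GPUS_PER_NODE = 8
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def node_hourly_cost(price_per_gpu_hr: float, gpus: int) -> float:
    """Hourly cost of a whole node given a per-GPU-hour price."""
    return price_per_gpu_hr * gpus


def discount_vs_on_demand(price: float, on_demand: float) -> float:
    """Fractional saving relative to on-demand (0.35 means 35% cheaper)."""
    return 1.0 - price / on_demand


if __name__ == "__main__":
    for billing, price in P4DE_PRICES.items():
        hourly = node_hourly_cost(price, GPUS_PER_NODE)
        monthly = hourly * HOURS_PER_MONTH
        saving = discount_vs_on_demand(price, P4DE_PRICES["on_demand"])
        print(f"{billing:>9}: ${hourly:.2f}/node-hr, "
              f"${monthly:,.0f}/month, {saving:.0%} below on-demand")
```

The same two helpers apply to any other row: swap in that row's price and GPU count. Spot capacity can be interrupted, so a lower spot figure is not directly comparable to a reserved one for long-running jobs.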
Providers offering this GPU
Modal is a serverless compute platform purpose-built for AI workloads, offering sub-second cold starts, per-second GPU billing, and a Python-native developer experience. Scale-to-zero semantics on H100, A100, and L40S accelerators remove idle costs, which makes it cost-efficient for bursty inference, fine-tuning jobs, and scheduled pipelines.
Baseten is a production model-serving platform with built-in autoscaling, per-minute GPU billing, and SOC 2/HIPAA compliance. Designed for teams deploying LLMs and diffusion models at scale, it handles cold starts, traffic spikes, and infrastructure tuning so engineers can focus on model quality rather than platform reliability.
RunPod operates a dual-tier marketplace: community GPUs at low spot prices and a SOC 2-compliant Secure Cloud for production inference. Per-second billing, instant provisioning, and a broad catalog spanning H100, A100, RTX, and MI300X accelerators make it flexible for projects of any scale.
Replicate makes it easy to run thousands of open-source AI models through a simple API, billing per second of GPU time with no cold-start penalties. It abstracts away infrastructure concerns so developers can integrate image generation, video models, speech synthesis, and LLMs with a single line of code.
Together AI provides fast hosted inference for open-weight models including Llama 3.1 (8B through 405B), DeepSeek V3, Qwen 2.5, and Mistral, all at prices far below closed-model APIs. Its optimized serving infrastructure and free tier for experimentation make it a common choice for teams that prefer open models without self-hosting overhead.
Oracle Cloud Infrastructure competes aggressively on price, with egress fees consistently about 10x lower than AWS's and competitively priced bare-metal GPU instances. Strong Oracle database integration, zero-cost Kubernetes control planes, and a growing AI footprint make it appealing for database-heavy and GPU-intensive workloads.
Hyperstack is a next-generation GPU cloud platform offering H100, A100, B200, L40S, and RTX-class accelerators at aggressive on-demand and reserved rates. With data centers in London and Oslo, Terraform support, and fast API-driven provisioning, it targets teams that want hyperscaler-grade GPU availability without the lock-in.
Paperspace, now part of DigitalOcean, offers GPU cloud compute with a developer-friendly notebook environment and a genuine free GPU tier for experimentation. With predictable on-demand pricing across H100, A100, and A6000 instances, it bridges the gap between zero-setup Jupyter notebooks and production-grade GPU infrastructure.
Vast.ai is a decentralized GPU marketplace where host operators list idle accelerators at market-driven prices, often the lowest spot rates anywhere. The trade-off is variable reliability, no compliance certifications, and community-grade support. It is best suited to cost-sensitive experimentation, fine-tuning, and batch inference where occasional interruption is acceptable.
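Several of the providers above bill per second or per minute rather than per hour, which matters most for short jobs. The sketch below shows how billing granularity changes the cost of a single 90-second job at the $1.89/GPU-hr on-demand rate that appears in the price listing. It assumes usage is rounded up to the next billing unit; actual rounding rules and minimum charges vary by provider and are not stated in the listing, and the function name is hypothetical.

```python
import math


def billed_cost(runtime_seconds: float, price_per_gpu_hr: float,
                granularity_seconds: int) -> float:
    """Cost of one job when usage is rounded UP to the billing granularity.

    Assumption: the provider charges for whole billing units only.
    """
    billed_units = math.ceil(runtime_seconds / granularity_seconds)
    billed_seconds = billed_units * granularity_seconds
    return billed_seconds * price_per_gpu_hr / 3600


if __name__ == "__main__":
    rate = 1.89    # $/GPU-hr, taken from the price listing
    runtime = 90   # seconds; illustrative job length
    for label, unit in [("per-second", 1), ("per-minute", 60), ("per-hour", 3600)]:
        print(f"{label:>10}: ${billed_cost(runtime, rate, unit):.5f}")
```

Under these assumptions the gap between granularities shrinks as jobs get longer, so it matters mainly for bursty inference and short scheduled tasks.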