CoreWeave is a specialized GPU cloud operator with massive fleets of HGX H100, H200, GB200 NVL72, and B200 systems interconnected with high-speed InfiniBand networking. Purpose-built for large-scale AI training and inference at enterprise-grade reliability, it has become a preferred alternative to hyperscalers for GPU-intensive workloads.
NVIDIA L40S cloud pricing
- Vendor: NVIDIA
- VRAM: 48 GB
- Architecture: Ada Lovelace
- FP16: 362 TFLOPS
- Launched: 2023
6 results
| Provider | Plan | GPUs | VRAM | vCPUs | RAM | Price | Regions |
|---|---|---|---|---|---|---|---|
| | L40S (per GPU), On-demand | 1 | 48 GB | 16 | 90 GB | $1.00/GPU-hr (Verified) | 1 country |
| | L40S (per GPU), On-demand | 1 | 48 GB | 16 | 96 GB | $1.00/GPU-hr (Verified) | 2 countries |
| | L40S (per GPU), Reserved | 1 | 48 GB | 16 | 90 GB | $0.650/GPU-hr (Verified) | 1 country |
| | L40S (per GPU), Reserved | 1 | 48 GB | 16 | 96 GB | $0.650/GPU-hr (Verified) | 2 countries |
| | L40S (per GPU), On-demand | 1 | 48 GB | 16 | 128 GB | $1.25/GPU-hr (Verified) | 2 countries |
| | L40S (per GPU), Reserved | 1 | 48 GB | 16 | 128 GB | $0.812/GPU-hr (Verified) | 2 countries |
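The reserved prices listed above come out roughly 35% below the on-demand prices for the same configuration. The following is a minimal Python sketch of that arithmetic, using only the prices in the listing. The 730-hour month, the pairing of rows by their memory figure (90, 96, or 128 GB), the function names, and the assumption that a reserved GPU is billed for every hour whether used or not are all illustrative choices, not details stated by the listing.

```python
# Prices copied from the listing above, in USD per GPU-hour.
# Assumption: each on-demand row is paired with the reserved row
# that shows the same memory figure (90, 96, or 128 GB).
ON_DEMAND = {"90 GB": 1.00, "96 GB": 1.00, "128 GB": 1.25}
RESERVED = {"90 GB": 0.650, "96 GB": 0.650, "128 GB": 0.812}

HOURS_PER_MONTH = 730  # assumption: average month (8760 hours / 12)


def reserved_discount(on_demand: float, reserved: float) -> float:
    """Fractional saving of the reserved rate relative to on-demand."""
    return (on_demand - reserved) / on_demand


def monthly_cost(rate_per_gpu_hour: float, gpus: int = 1) -> float:
    """Cost of running `gpus` GPUs around the clock for one month."""
    return rate_per_gpu_hour * gpus * HOURS_PER_MONTH


def break_even_utilization(on_demand: float, reserved: float) -> float:
    """Fraction of the month an on-demand GPU must be busy before a
    reserved GPU becomes cheaper (assumes reserved is billed for
    every hour, used or not)."""
    return reserved / on_demand


for config in ON_DEMAND:
    od, rs = ON_DEMAND[config], RESERVED[config]
    print(
        f"{config}: discount {reserved_discount(od, rs):.1%}, "
        f"on-demand ${monthly_cost(od):.2f}/mo, "
        f"reserved ${monthly_cost(rs):.2f}/mo, "
        f"break-even at {break_even_utilization(od, rs):.0%} utilization"
    )
```

Under these assumptions, a reservation only pays off when the GPU would otherwise be busy for about two thirds of the month or more; below that, on-demand is the cheaper option.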
Providers offering this GPU
Modal is a serverless compute platform built for AI workloads, offering sub-second cold starts, per-second GPU billing, and a Python-native developer experience. Scale-to-zero on H100, A100, and L40S accelerators means no charges accrue while nothing is running, which makes it cost-efficient for bursty inference, fine-tuning jobs, and scheduled pipelines.
RunPod operates a dual-tier marketplace: community GPUs at ultra-low spot prices and a SOC 2-compliant Secure Cloud for production inference. Per-second billing, instant provisioning, and a broad catalog spanning H100, A100, RTX, and even MI300X accelerators make it flexible for projects of any scale.
Nebius is a European AI cloud provider spun out of Yandex with large clusters of H100 and H200 GPUs in EU data centers. Competitive on-demand pricing, ISO 27001 compliance, and EU data residency make it a compelling choice for European AI startups and enterprises that need sovereignty over their training infrastructure.
Replicate makes it trivially easy to run thousands of open-source AI models via a simple API, billing per second of GPU time with no cold-start penalties. It abstracts away all infrastructure concerns so developers can integrate image generation, video models, speech synthesis, and LLMs with a single line of code.
Hyperstack is a next-generation GPU cloud platform offering H100, A100, B200, L40S, and RTX-class accelerators at aggressive on-demand and reserved rates. With data centers in London and Oslo, Terraform support, and fast API-driven provisioning, it targets teams that want hyperscaler-grade GPU availability without the lock-in.
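Several of the providers above advertise per-second billing. As a rough sketch of why that matters for short or bursty jobs, the Python snippet below compares per-second billing with billing rounded up to whole hours. The $1.00/GPU-hr rate is borrowed from the on-demand rows of the pricing listing purely for illustration and is not any specific provider's price. Whole-hour rounding is a hypothetical comparison point, and the function names are made up for this example.

```python
import math

# Assumption: illustrative rate only, taken from the on-demand rows of
# the listing; not a quoted price for any provider described above.
RATE_PER_GPU_HOUR = 1.00


def per_second_cost(seconds: int, rate: float = RATE_PER_GPU_HOUR) -> float:
    """Cost when every second of GPU time is billed individually."""
    return rate * seconds / 3600


def hourly_rounded_cost(seconds: int, rate: float = RATE_PER_GPU_HOUR) -> float:
    """Cost when usage is rounded up to whole hours (hypothetical scheme)."""
    return rate * math.ceil(seconds / 3600)


# Hypothetical job durations: 45 s, 5 min, 30 min, 90 min.
for seconds in (45, 300, 1800, 5400):
    print(
        f"{seconds:>5} s: per-second ${per_second_cost(seconds):.4f}, "
        f"hour-rounded ${hourly_rounded_cost(seconds):.2f}"
    )
```

The gap is largest for jobs that run well under an hour, which is the typical shape of bursty inference traffic.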