NVIDIA L40S cloud price comparison | DeployCue

NVIDIA L40S cloud pricing

Vendor: NVIDIA
VRAM: 48 GB
Architecture: Ada Lovelace
FP16: 362 TFLOPS
Launched: 2023
Lowest: $0.650 /GPU-hr
Median: $0.906 /GPU-hr
Highest: $1.25 /GPU-hr

6 results

Provider | Plan | Pricing mode | GPUs | VRAM | vCPUs | RAM | Price | Status | Regions
Hyperstack | L40S (per GPU) | On-demand | 1 | 48 GB | 16 | 90 GB | $1.00 /GPU-hr | Verified | 1 country
Nebius | L40S (per GPU) | On-demand | 1 | 48 GB | 16 | 96 GB | $1.00 /GPU-hr | Verified | 2 countries
Hyperstack | L40S (per GPU) | Reserved | 1 | 48 GB | 16 | 90 GB | $0.650 /GPU-hr | Verified | 1 country
Nebius | L40S (per GPU) | Reserved | 1 | 48 GB | 16 | 96 GB | $0.650 /GPU-hr | Verified | 2 countries
CoreWeave | L40S (per GPU) | On-demand | 1 | 48 GB | 16 | 128 GB | $1.25 /GPU-hr | Verified | 2 countries
CoreWeave | L40S (per GPU) | Reserved | 1 | 48 GB | 16 | 128 GB | $0.812 /GPU-hr | Verified | 2 countries
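
The lowest, median, and highest figures in the summary can be reproduced from the six listed rows. The following is a minimal Python sketch: the prices are copied from the table, while `PRICES`, `summarize`, and `reserved_discount` are illustrative names chosen here, not part of any DeployCue API.

```python
from statistics import median

# (provider, pricing mode) -> USD per GPU-hour, copied from the table above.
PRICES = {
    ("Hyperstack", "on-demand"): 1.00,
    ("Nebius", "on-demand"): 1.00,
    ("Hyperstack", "reserved"): 0.650,
    ("Nebius", "reserved"): 0.650,
    ("CoreWeave", "on-demand"): 1.25,
    ("CoreWeave", "reserved"): 0.812,
}


def summarize(prices):
    """Return the lowest, median, and highest price across all rows."""
    values = sorted(prices.values())
    return {"lowest": values[0], "median": median(values), "highest": values[-1]}


def reserved_discount(prices, provider):
    """Fraction by which a provider's reserved rate undercuts its on-demand rate."""
    on_demand = prices[(provider, "on-demand")]
    reserved = prices[(provider, "reserved")]
    return (on_demand - reserved) / on_demand


summary = summarize(PRICES)
print(summary)
for name in ("Hyperstack", "Nebius", "CoreWeave"):
    print(f"{name}: reserved is {reserved_discount(PRICES, name):.1%} below on-demand")
```

With six rows, the median is the mean of the third and fourth sorted prices ($0.812 and $1.00), which gives the $0.906 shown in the summary. In these rows the reserved rate works out to roughly 35% below on-demand for each provider.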

Providers offering this GPU

CoreWeave

CoreWeave is a specialized GPU cloud operator with massive fleets of HGX H100, H200, GB200 NVL72, and B200 systems interconnected with high-speed InfiniBand networking. Purpose-built for large-scale AI training and inference at enterprise-grade reliability, it has become a preferred alternative to hyperscalers for GPU-intensive workloads.

Modal

Modal is a serverless compute platform purpose-built for AI workloads, offering sub-second cold starts, per-second GPU billing, and a Python-native developer experience. Scale-to-zero semantics on H100, A100, and L40S accelerators eliminate idle costs entirely, making it exceptionally cost-efficient for bursty inference, fine-tuning jobs, and scheduled pipelines.

RunPod

RunPod operates a dual-tier marketplace: community GPUs at ultra-low spot prices and a SOC 2-compliant Secure Cloud for production inference. Per-second billing, instant provisioning, and a broad catalog spanning H100, A100, RTX, and even MI300X accelerators make it flexible for projects of any scale.

Nebius

Nebius is a European AI cloud provider spun out of Yandex with large clusters of H100 and H200 GPUs in EU data centers. Competitive on-demand pricing, ISO 27001 compliance, and EU data residency make it a compelling choice for European AI startups and enterprises that need sovereignty over their training infrastructure.

Replicate

Replicate makes it trivially easy to run thousands of open-source AI models via a simple API, billing per second of GPU time with no cold-start penalties. It abstracts away all infrastructure concerns so developers can integrate image generation, video models, speech synthesis, and LLMs with a single line of code.

Hyperstack

Hyperstack is a next-generation GPU cloud platform offering H100, A100, B200, L40S, and RTX-class accelerators at aggressive on-demand and reserved rates. With data centers in London and Oslo, Terraform support, and fast API-driven provisioning, it targets teams that want hyperscaler-grade GPU availability without the lock-in.

Frequently asked questions

How much does an NVIDIA L40S cost per hour in the cloud?
The lowest NVIDIA L40S price we track is $0.650 per GPU-hour, which is a reserved rate; the lowest on-demand price is $1.00 per GPU-hour. Spot and reserved rates are usually lower than on-demand; sort the table above by price to see the current rate from every provider.
What is the cheapest NVIDIA L40S cloud provider?
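Sort the table by price (low to high) to see the cheapest NVIDIA L40S provider right now. In the current listings, Hyperstack and Nebius tie for the lowest rate at $0.650 per GPU-hour reserved and $1.00 per GPU-hour on-demand. Marketplace and spot providers often undercut hyperscalers by a wide margin for the same NVIDIA L40S.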
Which cloud providers offer NVIDIA L40S GPUs?
Every provider with published NVIDIA L40S availability is listed above, with per-hour pricing, the number of GPUs per instance, region coverage, and on-demand, spot, and reserved rates where the provider publishes them.
Is spot NVIDIA L40S cheaper than on-demand?
Yes. Spot (preemptible) capacity is typically 40-70% cheaper than on-demand but can be reclaimed at short notice. Use the pricing-mode filter to compare on-demand, spot, and reserved rows side by side.
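
To turn the 40-70% figure into a rough price band, the discount can be applied to an on-demand rate. The sketch below is illustrative only: `spot_price_range` is a hypothetical helper, the default discounts are the typical range quoted above, and the result is an estimate rather than a price quoted by any provider.

```python
def spot_price_range(on_demand, min_discount=0.40, max_discount=0.70):
    """Estimate a (low, high) spot price band from an on-demand price.

    The discounts are assumptions taken from the "typically 40-70% cheaper"
    rule of thumb; actual spot prices vary by provider, region, and demand.
    """
    if not 0 <= min_discount <= max_discount < 1:
        raise ValueError("discounts must satisfy 0 <= min <= max < 1")
    low = on_demand * (1 - max_discount)   # deepest discount gives the lowest price
    high = on_demand * (1 - min_discount)  # shallowest discount gives the highest price
    return round(low, 3), round(high, 3)


low, high = spot_price_range(1.00)
print(f"Estimated spot range: ${low:.2f} to ${high:.2f} per GPU-hour")
```

For the $1.00 on-demand rate in the table, this gives an estimated band of about $0.30 to $0.60 per GPU-hour.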
How much VRAM does the NVIDIA L40S have?
The NVIDIA L40S ships with 48 GB of VRAM. Larger VRAM lets you fit bigger models and batch sizes without sharding.
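
A quick way to check whether a model's weights fit in 48 GB is to multiply the parameter count by the bytes per parameter. The sketch below is a back-of-the-envelope estimate under stated assumptions: it counts weights only (no KV cache, activations, or optimizer state), treats 1 GB as 10^9 bytes, and reserves an arbitrary 20% headroom. The function names and the headroom value are illustrative choices, not figures from this page.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(params_billion, dtype):
    """Approximate weight size in GB (10^9 bytes) for a model of the given size."""
    # (params_billion * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[dtype]


def fits_in_vram(params_billion, dtype, vram_gb=48.0, headroom=0.20):
    """True if the weights fit while leaving `headroom` of VRAM free.

    The 20% headroom is an assumed allowance for KV cache and runtime
    overhead; real requirements depend on batch size and context length.
    """
    usable_gb = vram_gb * (1 - headroom)
    return weight_footprint_gb(params_billion, dtype) <= usable_gb


for size, dtype in [(13, "fp16"), (70, "fp16"), (70, "int4")]:
    gb = weight_footprint_gb(size, dtype)
    verdict = "fits" if fits_in_vram(size, dtype) else "does not fit"
    print(f"{size}B @ {dtype}: {gb:.0f} GB of weights, {verdict} in 48 GB")
```

Under these assumptions, a 13B-parameter model in FP16 (about 26 GB of weights) fits on a single L40S, while a 70B model in FP16 (about 140 GB) would need sharding or quantization.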
Is the NVIDIA L40S good for AI training and inference?
The NVIDIA L40S is used for both LLM training and inference. Match its 48 GB of VRAM and 362 TFLOPS of FP16 throughput (shown above) to your model size, and use spot capacity for fault-tolerant training to cut costs.