Buy and Apply Reserved GPU Instances

Reserved GPU instances trade flexibility for a lower hourly rate. You commit to a term, often one or three years, and in exchange you pay significantly less than on-demand for the same hardware. The catch is that a reservation only saves money if it matches what you actually run. This tutorial walks through buying a reservation correctly, mapping it to real workloads, and confirming the discount lands on your invoice rather than quietly going to waste.

Reserved Versus On-Demand Versus Spot

Each pricing model fits a different usage pattern. Reservations reward steady baseline demand. On-demand suits unpredictable or short-lived work. Spot suits interruptible jobs that can pause and resume.

Model	Best for	Tradeoff
Reserved	Steady, predictable load	Long commitment, less flexibility
On-demand	Variable or bursty load	Highest hourly rate
Spot	Interruptible batch work	Can be reclaimed anytime

A healthy setup often blends all three: reservations cover the floor, on-demand covers peaks, and spot covers flexible batch jobs.

Size the Commitment to Real Usage

The most common mistake is reserving for hoped-for usage instead of measured usage. Pull several weeks or more of historical utilization, enough to cover your normal weekly cycle, and find your steady baseline: the level of GPU demand that is almost always present.

Export usage data for the GPU class you intend to reserve.
Identify the consistent floor, the capacity you use nearly all the time.
Reserve at or slightly below that floor, not at your peak.
Leave the variable portion above the floor on on-demand or spot.

Under-reserving slightly is safer than over-reserving. An idle reservation is money spent on capacity you are not using, while a small gap above the floor is just a few on-demand hours.
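The sizing steps above can be sketched in a few lines of Python. This is a minimal illustration, not a provider tool: the function name steady_floor, the coverage parameter, and the sample numbers are all assumptions made for the example. It treats the floor as the GPU count that was met or exceeded in a chosen share of the sampled hours.

```python
from math import floor


def steady_floor(hourly_gpu_counts, coverage=0.95):
    """Return the GPU count met or exceeded in roughly `coverage` of the hours.

    `hourly_gpu_counts` is a hypothetical export: one integer per hour
    giving how many GPUs of the target class were in use.
    """
    if not hourly_gpu_counts:
        raise ValueError("need at least one usage sample")
    ordered = sorted(hourly_gpu_counts)
    # Skip the lowest (1 - coverage) share of samples, e.g. brief outages.
    # round() guards against floating-point noise such as 0.9999999.
    idx = floor(round((1 - coverage) * len(ordered), 9))
    idx = min(idx, len(ordered) - 1)
    return ordered[idx]


# Made-up data: 20 hourly samples of GPUs in use.
usage = [8, 8, 9, 8, 12, 16, 15, 9, 8, 8, 10, 14, 8, 9, 8, 2, 8, 11, 13, 8]
baseline = steady_floor(usage, coverage=0.95)
peak = max(usage)
print(f"floor={baseline}, peak={peak}")
```

With this sample the floor is 8 GPUs while the peak is 16. Reserving at 8 keeps the reservation busy nearly every hour; reserving at 16 would leave about half of it idle most of the time.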

Match the Reservation Attributes

Discounts only apply when the reservation matches the running instance. The attributes that must line up usually include:

GPU model and instance type.
Region or zone, depending on the provider's scope rules.
Operating system or platform where relevant.
Tenancy and any quantity or count requirements.

Some providers offer flexible reservations that apply across a family of instances, which reduces the risk of a mismatch. Others require an exact match. Read the scope rules before you commit, because a reservation that does not match anything you run still bills you.
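A pre-purchase check along these lines can catch a mismatch before it costs anything. The sketch below is hypothetical: the Reservation and Instance classes, their field names, and the sample values are illustrative assumptions, not any provider's schema, and it models exact-match reservations only. A zone of None stands for a region-scoped reservation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reservation:
    instance_type: str
    region: str
    platform: str
    tenancy: str
    zone: Optional[str] = None  # None means the reservation is region-scoped


@dataclass(frozen=True)
class Instance:
    instance_type: str
    region: str
    zone: str
    platform: str
    tenancy: str


def mismatches(res, inst):
    """List the attribute names on which the instance fails to match."""
    problems = [
        name
        for name in ("instance_type", "region", "platform", "tenancy")
        if getattr(res, name) != getattr(inst, name)
    ]
    # Zone only matters when the reservation is pinned to one.
    if res.zone is not None and res.zone != inst.zone:
        problems.append("zone")
    return problems


# Made-up identifiers for illustration only.
res = Reservation("gpu-large", "region-1", "linux", "shared")
ok = Instance("gpu-large", "region-1", "region-1a", "linux", "shared")
drifted = Instance("gpu-xlarge", "region-2", "region-2b", "linux", "shared")

print(mismatches(res, ok))       # empty list: discount should apply
print(mismatches(res, drifted))  # names of the attributes that differ
```

Returning the list of differing attributes, rather than a plain yes or no, makes the cause of a missed discount visible at a glance.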

Confirm the Discount Applies

Buying the reservation is not the finish line. You must verify it is actually being consumed.

After purchase, check the billing or cost console for reservation utilization.
Confirm running instances are drawing the reduced rate, not the on-demand rate.
Look for unused reservation hours, which signal a mismatch or idle capacity.
Set a recurring review so utilization does not drift over time.

If utilization is low, find out why. Common causes are a region mismatch, an instance type that drifted during a redeploy, or a workload that was scaled down after the reservation was bought.
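The utilization check reduces to simple arithmetic: hours of the reservation that were consumed, divided by hours that were paid for. The sketch below assumes a hypothetical report format, a list of records with reservation_id, reserved_hours, and used_hours keys; real billing exports will use different field names.

```python
def underused_reservations(records, threshold=0.9):
    """Return (id, utilization, unused_hours) for reservations below threshold.

    Each record is a dict with hypothetical keys:
    reservation_id, reserved_hours, used_hours.
    """
    flagged = []
    for rec in records:
        reserved = rec["reserved_hours"]
        if reserved <= 0:
            raise ValueError("reserved_hours must be positive")
        # Usage beyond the reserved hours is billed elsewhere, so cap it.
        used = min(rec["used_hours"], reserved)
        utilization = used / reserved
        if utilization < threshold:
            flagged.append((rec["reservation_id"], utilization, reserved - used))
    return flagged


# Made-up 30-day report (720 hours per reservation).
report = [
    {"reservation_id": "res-a", "reserved_hours": 720, "used_hours": 720},
    {"reservation_id": "res-b", "reserved_hours": 720, "used_hours": 360},
    {"reservation_id": "res-c", "reserved_hours": 720, "used_hours": 0},
]
flagged = underused_reservations(report)
for res_id, utilization, unused in flagged:
    print(f"{res_id}: {utilization:.0%} used, {unused} hours unused")
```

A reservation at zero utilization, like res-c here, usually points to an attribute mismatch; one at partial utilization more often points to a workload that was scaled down.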

Plan for the Term

A reservation is a multi-month or multi-year commitment. Before signing, ask whether the workload will still exist for the full term, whether the GPU class might be superseded by a newer model you would rather use, and whether your provider allows any modification or exchange. Building in a quarterly review keeps reservations aligned with reality as your workloads evolve.

Common Pitfalls

Reserving at peak usage instead of the steady baseline.
Mismatching the region or instance type, so the discount never applies.
Forgetting to monitor utilization after purchase.
Committing to a long term for a workload that may not last.

Understand Term and Payment Options

Reservations usually ask you to choose along two dimensions: the length of the term and how you pay for it. Longer terms carry deeper discounts but bind you for longer. Payment structure also affects the effective rate, with more money paid up front typically buying a larger discount than spreading payments over the term.

Choice	Effect on discount	Effect on flexibility
Longer term	Deeper discount	Less flexibility
More paid up front	Larger discount	More capital committed now
Flexible scope	Slightly smaller discount	Applies across more instance types

The right combination depends on how confident you are in the workload and how much up-front spend you can absorb. A team certain of its baseline for years may take the deepest commitment, while a team with a one-year horizon should not sign a three-year term to chase a marginally better rate.
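One way to compare options is to convert each into an effective hourly rate and a break-even utilization, the share of the term the capacity must actually be in use before the reservation beats on-demand. The sketch below is illustrative only: the function names and every price are made up, and it assumes the reservation is billed for every hour of the term whether used or not.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def effective_hourly_rate(upfront, recurring_hourly, term_years):
    """Spread the up-front payment across the term and add the hourly fee."""
    term_hours = term_years * HOURS_PER_YEAR
    return upfront / term_hours + recurring_hourly


def break_even_utilization(upfront, recurring_hourly, term_years, on_demand_hourly):
    """Fraction of the term the capacity must be used to match on-demand cost."""
    rate = effective_hourly_rate(upfront, recurring_hourly, term_years)
    return rate / on_demand_hourly


# Hypothetical prices: 8760.00 up front plus 1.50 per hour for one year,
# against an on-demand rate of 4.00 per hour.
rate = effective_hourly_rate(8760.0, 1.5, 1)
threshold = break_even_utilization(8760.0, 1.5, 1, 4.0)
print(f"effective rate {rate:.2f}/h, break-even at {threshold:.1%} utilization")
```

With these made-up numbers the reservation pays off only if the capacity is in use more than 62.5% of the term. A deeper discount lowers that threshold, but a longer term means the usage has to hold up for longer.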

Layer Reservations With Other Discounts

Reservations rarely stand alone. The strongest cost posture layers them with the variable pricing models so that each part of demand is served by the cheapest suitable option: the steady floor sits on reservations, variable peaks above the floor run on on-demand, and interruptible batch work runs on spot. The art is keeping the reserved layer just below the true floor so it is always fully consumed, then letting the other layers flex above it. When you draw the demand curve and place each pricing model under the part of the curve it fits, the blended cost drops well below an all-on-demand bill without exposing you to idle reservations.
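The layering can be made concrete with a small cost model. This is a simplified sketch under stated assumptions: all rates and demand figures are invented, demand is split per hour into core GPUs (must run) and batch GPUs (interruptible), spare reserved capacity absorbs batch work first, and spot capacity is assumed to be always available.

```python
def blended_cost(hourly_demand, reserved_gpus, reserved_rate,
                 on_demand_rate, spot_rate):
    """Total cost when reservations cover the floor and the rest flexes.

    `hourly_demand` is a list of (core_gpus, batch_gpus) tuples, one per hour.
    """
    # Reserved capacity is paid for every hour, used or not.
    total = reserved_gpus * reserved_rate * len(hourly_demand)
    for core, batch in hourly_demand:
        overflow = max(0, core - reserved_gpus)  # core work above the floor
        spare = max(0, reserved_gpus - core)     # idle reserved GPUs
        on_spot = max(0, batch - spare)          # batch work not absorbed
        total += overflow * on_demand_rate + on_spot * spot_rate
    return total


# Hypothetical four-hour window and made-up hourly rates per GPU.
demand = [(8, 4), (10, 4), (8, 0), (12, 2)]
layered = blended_cost(demand, reserved_gpus=8, reserved_rate=2.5,
                       on_demand_rate=4.0, spot_rate=1.5)
all_on_demand = sum(core + batch for core, batch in demand) * 4.0
print(f"layered: {layered}, all on-demand: {all_on_demand}")
```

Rerunning the function with different reserved_gpus values shows where the reserved layer stops paying for itself. In this sample, raising the reservation from 8 to 10 GPUs increases the total slightly because the extra capacity sits idle in some hours.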

Review and Adjust on a Cadence

Workloads drift. A model gets retired, traffic grows, a newer GPU becomes the better value. A reservation bought against last quarter's reality can quietly become a poor fit. Set a recurring review, perhaps quarterly, that checks utilization, compares the reserved rate against current alternatives, and confirms the workload still justifies the commitment. Some providers allow modifying or exchanging reservations, so a review may surface an opportunity to shift to a better-fitting type rather than waste an existing commitment. Treating reservations as a living part of your cost strategy, rather than a one-time purchase, is what keeps the savings real over the full term.

Reserved GPU instances offer one of the largest discounts available in the cloud, but only when the commitment maps to durable, measured demand. Size to your floor, match the attributes precisely, and verify utilization on the bill rather than assuming the discount applies. Pair reservations with on-demand and spot for the variable portion, and revisit the mix on a schedule. Done this way, reservations cut your steady-state GPU cost substantially without locking you into capacity you cannot use.
