About instances that use the reservation-bound model

This document describes Compute Engine instances that use the reservation-bound provisioning model, including their benefits and creation requirements.

When you create a compute instance, you must specify its provisioning model, which defines the availability, pricing, and lifespan of the resources that the instance uses. The reservation-bound provisioning model lets you create A4X Max, A4X, A4, A3 Ultra, A3 Mega, A3 High with 8 GPUs, A3 Edge, and H4D instances by using reserved capacity from a future reservation in calendar mode or a future reservation in AI Hypercomputer.

The reservation-bound provisioning model offers the following benefits:

  • Cost control: you don't incur additional charges when you create compute instances by using reserved capacity. You only incur charges for resources that aren't part of your reservation, such as disks or IP addresses.

  • Lifecycle management: based on the termination action that you specify when you create compute instances, Compute Engine stops or deletes the compute instances at the end of the reservation period.

Understand instances that use the reservation-bound provisioning model

The following sections describe the requirements that apply when you create compute instances by using the reservation-bound provisioning model.

Compute instance creation prerequisites

To use the reservation-bound provisioning model to create compute instances, you must first reserve resources by requesting a future reservation in calendar mode or a future reservation in AI Hypercomputer.

If Google Cloud approves your future reservation request, then Compute Engine automatically creates (auto-creates) a reservation at the start of your reservation period. You can then use the reservation to create compute instances.

Compute instance creation requirements

To create a compute instance by using the reservation-bound provisioning model, you must meet the following requirements:

  • The compute instance and the reservation must have matching properties. You can only use your reserved capacity to create instances if the instance and auto-created reservation properties exactly match. For more information, see the requirements for consuming reservations.

  • The compute instance must specifically target the reservation for consumption. When you create a compute instance, you must specify the name of the auto-created reservation to target for consumption and set the reservationAffinity field to SPECIFIC_RESERVATION. For more information, see Consume a specifically targeted reservation.

  • The compute instance must use the reservation-bound provisioning model. When you create a compute instance, you must specify the reservation-bound provisioning model as follows:

    • If you use the Google Cloud console, then, in the Provisioning model list, select Reservation-bound.

    • If you use the Google Cloud CLI, then include the --provisioning-model=RESERVATION_BOUND flag in the command.

    • If you use the Compute Engine API, then include the "provisioningModel": "RESERVATION_BOUND" field in the request body.

  • The compute instance must be stopped or deleted at the reservation end time. When you create a compute instance, you must specify whether to stop or delete the compute instance at the reservation's end time by using the instanceTerminationAction field. For more information, see how to limit the run time of a compute instance.

After you create a compute instance by using the reservation-bound provisioning model, the compute instance starts running and keeps running until you stop or delete it, or until Compute Engine stops or deletes it at the reservation's end time.
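As an illustration of how the requirements above fit together, the following Python sketch assembles a partial request body for the Compute Engine API. The function name, instance name, zone, machine type, and reservation name are hypothetical placeholders. The placement of provisioningModel and instanceTerminationAction under scheduling, and of SPECIFIC_RESERVATION under reservationAffinity.consumeReservationType, is an assumption about the instances.insert schema that you should verify against the API reference. The body is deliberately incomplete: it omits disks, network interfaces, and other fields that a real request needs.

```python
import json


def build_instance_body(name, zone, machine_type, reservation_name,
                        termination_action="DELETE"):
    """Return a partial, illustrative request body for a reservation-bound instance.

    All argument values are placeholders supplied by the caller. The field
    nesting is an assumption about the instances.insert schema.
    """
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be 'STOP' or 'DELETE'")
    return {
        "name": name,
        # Must match the properties of the auto-created reservation.
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        # Specifically target the auto-created reservation by name.
        "reservationAffinity": {
            "consumeReservationType": "SPECIFIC_RESERVATION",
            "key": "compute.googleapis.com/reservation-name",
            "values": [reservation_name],
        },
        "scheduling": {
            # Use the reservation-bound provisioning model.
            "provisioningModel": "RESERVATION_BOUND",
            # What Compute Engine does at the reservation's end time.
            "instanceTerminationAction": termination_action,
        },
    }


if __name__ == "__main__":
    # Hypothetical example values; replace them with your own.
    body = build_instance_body(
        "example-vm", "us-central1-a", "a3-ultragpu-8g",
        "example-reservation", termination_action="STOP",
    )
    print(json.dumps(body, indent=2))
```

Keeping the termination action as an explicit argument makes the stop-or-delete choice visible at the call site, which mirrors the requirement that you choose it when you create the instance.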

Quota

When you create a compute instance by using the reservation-bound provisioning model, you don't need quota for the reserved resources that you use to create the compute instance. You only need quota for the resources that aren't part of your reserved capacity, such as disks and IP addresses. For more information about the different types of quota, see Allocation quotas.

Pricing

When you create a compute instance by using the reservation-bound provisioning model, you incur charges as follows:

  • Charges start when you create the compute instance. You don't incur additional charges for the reserved resources that you use to create your compute instance. You only incur charges for the resources that aren't part of the reservation, such as disks or IP addresses. For more information, see billing for reservations.

  • Charges stop at the reservation's end time. At that time, Compute Engine deletes the reservation, and stops or deletes your compute instance based on the termination action that you specified for the compute instance.
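The end-of-reservation behavior described above can be modeled with a small Python sketch. This is a simplified mental model, not how Compute Engine implements it: the function name and the state labels ("running", "stopped", "deleted") are hypothetical, and the sketch assumes that you don't stop or delete the instance yourself before the reservation ends.

```python
from datetime import datetime, timezone


def instance_state(now, reservation_end, termination_action):
    """Return the expected state of a reservation-bound instance at `now`.

    Simplified model: the instance runs until the reservation's end time,
    then is stopped or deleted according to its termination action.
    """
    if termination_action not in ("STOP", "DELETE"):
        raise ValueError("termination_action must be 'STOP' or 'DELETE'")
    if now < reservation_end:
        return "running"
    return "stopped" if termination_action == "STOP" else "deleted"


if __name__ == "__main__":
    # Hypothetical reservation end time.
    end = datetime(2030, 1, 1, tzinfo=timezone.utc)
    before = datetime(2029, 12, 31, tzinfo=timezone.utc)
    print(instance_state(before, end, "STOP"))
    print(instance_state(end, end, "DELETE"))
```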

Limitations

To create compute instances by using the reservation-bound provisioning model, you must use one of the following machine series:

  • A4X Max

  • A4X

  • A4

  • A3 Ultra

  • A3 Mega

  • A3 High with 8 GPUs

  • H4D

To inquire about using other accelerator-optimized machine series with the reservation-bound provisioning model, contact your account team or the sales team.
