Reserve capacity through your account team

This document explains how to reserve capacity for creating virtual machine (VM) instances with GPUs attached in compute nodes. You do so by using future reservations for blocks of capacity. To learn about all the options to obtain capacity in Cluster Director, see Capacity overview.

To help ensure that you have the resources to create GPU VMs in a cluster partition, you must do the following:

  1. Request a future reservation from Google. This action lets you reserve blocks of capacity for a defined duration, starting on a specific date and time that you choose.

  2. Review the draft request created by Google. Based on your request, Google creates a future reservation request. You can then review the request and, if needed, contact your account team to make changes.

  3. Submit the request. After you submit the request, Google Cloud approves it within a few minutes. Then, Compute Engine automatically creates (auto-creates) an empty reservation.

  4. At your request start time, Compute Engine delivers the reserved resources. Compute Engine provisions your requested capacity into the auto-created reservation. You can then use the reservation to create GPU VMs in your cluster until the reservation period ends.

Limitations

This section describes the limitations for future reservation requests, and for the auto-created reservations for a request.

Limitations for future reservation requests

After Google creates a draft future reservation request for you, the following limitations apply:

  • You can't modify the request details, including the share type.

  • After the request is submitted, approved, and its state changes to PROVISIONING, you can no longer cancel or delete the request. You commit to pay for the requested capacity from the request's start time, regardless of usage.

Limitations for auto-created reservations

After Compute Engine creates an on-demand reservation to fulfill your requested capacity, the following limitations apply:

  • You can only use the reservation after the request start time.

  • You can't manually modify the reservation. For your available options, contact your account team.

  • You can't manually delete the reservation. When you reserve capacity, if you specify that you don't want to automatically delete the reservation at the end of its reservation period, you must contact your account team to delete the reservation.
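The first limitation above amounts to a time-window check. The following Python sketch assumes that you have the start and end times of the request as RFC 3339 timestamps. The helper names `parse_rfc3339` and `is_within_reservation_period` are hypothetical and aren't part of any Google Cloud library.

```python
from datetime import datetime, timezone


def parse_rfc3339(text: str) -> datetime:
    """Parse an RFC 3339 timestamp; map a trailing Z to +00:00 for older Pythons."""
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def is_within_reservation_period(start_time: str, end_time: str, now: datetime) -> bool:
    """Return True if now falls between the request start time and end time."""
    return parse_rfc3339(start_time) <= now < parse_rfc3339(end_time)


# Example: one week before the start time, the reservation isn't usable yet.
now = datetime(2026, 1, 20, 12, 0, tzinfo=timezone.utc)
print(is_within_reservation_period("2026-01-27T19:20:00Z", "2026-02-10T19:20:00Z", now))  # False
```

Passing `now` as an argument, rather than reading the clock inside the function, keeps the check easy to test.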

Before you begin

Select the tab for how you plan to use the samples on this page:

Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

gcloud

In the Google Cloud console, activate Cloud Shell.

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

    Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles

To get the permissions that you need to create a future reservation request, ask your administrator to grant you the Compute Future Reservation User (roles/compute.futureReservationUser) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create a future reservation request. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create a future reservation request:

  • To let Compute Engine auto-create reservations: compute.reservations.create on the project
  • To create a future reservation request: compute.futureReservations.create on the project
  • To specify an instance template: compute.instanceTemplates.useReadOnly on the instance template

You might also be able to get these permissions with custom roles or other predefined roles.

Quota

As part of the future reservation request process, Google manages quota for your reserved resources. You don't need to request quota. At the start time of your approved future reservation, Google automatically increases your quota if you don't have enough for the reserved resources.

Request capacity through your account team

Contact your account team and provide the following information for Google to create a draft future reservation request:

  • Project number: the number of the project where your account team creates the request and Compute Engine provisions the capacity.

  • Machine type: the machine type to reserve. You can specify one of the following:

    • A4X (a4x-highgpu-4g)

    • A4 (a4-highgpu-8g)

    • A3 Ultra (a3-ultragpu-8g)

    • A3 Mega (a3-megagpu-8g)

  • Zone: the zone where you want to reserve capacity. To review the available regions and zones for a GPU machine type, see GPU locations.

  • Total count: the total number of VMs to reserve. You can only reserve multiples of two VMs. Block sizes and VM count per block vary based on machine type and availability. Your account team can provide more details for your request.

  • Start time: the start time of the reservation period. You can start using the reserved capacity at that time. Format the start time as an RFC 3339 timestamp as follows:

    YYYY-MM-DDTHH:MM:SSOFFSET
    

    Replace the following:

    • YYYY-MM-DD: a date formatted as a four-digit year, two-digit month, and a two-digit day of the month, separated by hyphens (-). For example, to specify December 31, 2025, use 2025-12-31.

    • HH:MM:SS: a time formatted as two-digit hours (24-hour time), two-digit minutes, and two-digit seconds, separated by colons (:).

    • OFFSET: the time zone formatted as an offset from Coordinated Universal Time (UTC). For example, to use Pacific Standard Time (PST), specify -08:00. To use no offset, specify Z.

  • End time: the end time of the reservation period. Format it as an RFC 3339 timestamp. At that time, Compute Engine does the following:

    • Compute Engine deletes the auto-created reservation.

    • Based on the termination action that you specify for your VMs, Compute Engine stops or deletes any VMs that you created by using the auto-created reservation.

  • Reservation name: the name of the reservation that Compute Engine creates to deliver your reserved capacity. Compute Engine can only create specifically targeted reservations.

  • Reservation automatic deletion: whether you want Compute Engine to automatically delete the auto-created reservation at the end of the reservation period, or at a later time. If you want to manually delete the reservation, then you must contact your account team to delete the reservation.

  • Share type: whether only your project can use the auto-created reservation (LOCAL), or other projects can use the reservation (SPECIFIC_PROJECTS). This property can't change after you submit the request. To share reserved capacity with other projects in your organization, do the following:

    1. If you haven't already, then verify that the project where Google creates the request is allowed to create shared reservations.

    2. Provide the numbers of the projects to share the reserved capacity with. You can specify up to 100 projects in your organization.

  • Commitment name: if your reservation period is one year or longer, then you must purchase and attach a resource-based commitment to your reserved resources. You can purchase a commitment with a 1-year or 3-year plan. If you share the reserved capacity with other projects, then those projects get discounts only if they use the same Cloud Billing account as the project where you reserve capacity. For details, see Enable CUD sharing for resource-based commitments.

When Google creates the draft future reservation request, your account team contacts you.
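Before you send these details to your account team, it can help to sanity-check them locally. The following Python sketch isn't part of any Google Cloud tool, and the names `to_rfc3339` and `check_request` are hypothetical. It formats timestamps as RFC 3339 and checks a few constraints stated in this section: the machine types listed, a VM count in multiples of two, at most 100 shared projects, and a commitment for periods of one year or longer. Treating one year as 365 days is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Machine types listed in this document.
SUPPORTED_MACHINE_TYPES = {
    "a4x-highgpu-4g",
    "a4-highgpu-8g",
    "a3-ultragpu-8g",
    "a3-megagpu-8g",
}


def to_rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSOFFSET."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    text = moment.replace(microsecond=0).isoformat()
    # Use Z for a zero offset, as described above.
    return text[:-6] + "Z" if text.endswith("+00:00") else text


def check_request(machine_type, total_count, start, end,
                  shared_projects=(), has_commitment=False):
    """Return a list of problems found in the request details (hypothetical helper)."""
    problems = []
    if machine_type not in SUPPORTED_MACHINE_TYPES:
        problems.append(f"unsupported machine type: {machine_type}")
    if total_count <= 0 or total_count % 2 != 0:
        problems.append("total count must be a positive multiple of two")
    if end <= start:
        problems.append("end time must be after start time")
    if len(shared_projects) > 100:
        problems.append("at most 100 projects can share the reservation")
    # Assumption: "one year" is approximated as 365 days.
    if end - start >= timedelta(days=365) and not has_commitment:
        problems.append("a commitment is required for periods of one year or longer")
    return problems


pst = timezone(timedelta(hours=-8))
start = datetime(2025, 12, 31, 9, 0, 0, tzinfo=pst)
end = start + timedelta(days=14)
print(to_rfc3339(start))  # 2025-12-31T09:00:00-08:00
print(check_request("a3-ultragpu-8g", 3, start, end))
```

The second call reports a problem because 3 isn't a multiple of two. Block sizes and availability still depend on your account team, so a clean result here doesn't guarantee that the request can be fulfilled.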

Review and submit a draft reservation request

After you give your account team the type and amount of resources to reserve, Google creates a draft future reservation request. You can review the draft request and, if it's correct, submit it for review. You must submit the request before the request start time.

To review and submit a draft future reservation request, select one of the following options:

Console

  1. In the Google Cloud console, go to the Reservations page.

    Go to Reservations

  2. Click the Future reservations tab. The Future Reservations table lists each future reservation request in your project, and each table column describes a property.

  3. In the Name column, click the name of the draft request that Google created for you. A page that gives the details of the future reservation request opens.

  4. In the Basic information section, verify that the request details, such as Dates and Share type, are correct. Also, if you requested a commitment, verify that it's specified. If any of these details are incorrect, then contact your account team.

  5. If everything looks accurate, then submit your request:

    1. Click Edit draft. A page to modify the draft request appears.

    2. Click Create. The Reservations page appears. Google Cloud approves your request within a few minutes, and then Compute Engine creates an empty reservation with your requested resources.

gcloud

  1. To view a list of future reservation requests in your project, use the gcloud compute future-reservations list command with the --filter flag set to PROCUREMENT_STATUS=DRAFTING:

    gcloud compute future-reservations list --filter=PROCUREMENT_STATUS=DRAFTING
    
  2. In the command output, look for the reservation request that has the name that you provided to your account team.

  3. To view the details of the draft request, use the gcloud compute future-reservations describe command.

    Select and run one of the following commands:

    Bash

    gcloud compute future-reservations describe FUTURE_RESERVATION_NAME \
        --zone=ZONE
    

    PowerShell

    gcloud compute future-reservations describe FUTURE_RESERVATION_NAME `
        --zone=ZONE
    

    cmd.exe

    gcloud compute future-reservations describe FUTURE_RESERVATION_NAME ^
        --zone=ZONE
    

    Replace the following:

    • FUTURE_RESERVATION_NAME: the name of the draft future reservation request.

    • ZONE: the zone where Google created the request.

    The output is similar to the following:

    autoCreatedReservationsDeleteTime: '2026-02-10T19:20:00Z'
    creationTimestamp: '2025-11-27T11:14:58.305-08:00'
    deploymentType: DENSE
    id: '7979651787097007552'
    kind: compute#futureReservation
    name: example-draft-request
    planningStatus: DRAFT
    reservationName: example-reservation
    schedulingType: INDEPENDENT
    selfLink: https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b/futureReservations/example-draft-request
    selfLinkWithId: https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b/futureReservations/7979651787097007552
    specificReservationRequired: true
    specificSkuProperties:
      instanceProperties:
        guestAccelerators:
        - acceleratorCount: 8
          acceleratorType: nvidia-h200-141gb
        localSsds:
        - diskSizeGb: '375'
          interface: NVME
        ...
        machineType: a3-ultragpu-8g
      totalCount: '2'
    status:
      autoCreatedReservations:
      - https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b/reservations/example-reservation
      fulfilledCount: '2'
      lockTime: '2026-01-27T19:15:00Z'
      procurementStatus: DRAFTING
    timeWindow:
      endTime: '2026-02-10T19:20:00Z'
      startTime: '2026-01-27T19:20:00Z'
    zone: https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b
    
  4. In the command output, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you requested a commitment, verify that it's specified. If the details are incorrect, then contact your account team.

  5. To submit the draft request for review, use the gcloud compute future-reservations update command with the --planning-status flag set to SUBMITTED.

    Select and run one of the following commands:

    Bash

    gcloud compute future-reservations update FUTURE_RESERVATION_NAME \
        --planning-status=SUBMITTED \
        --zone=ZONE
    

    PowerShell

    gcloud compute future-reservations update FUTURE_RESERVATION_NAME `
        --planning-status=SUBMITTED `
        --zone=ZONE
    

    cmd.exe

    gcloud compute future-reservations update FUTURE_RESERVATION_NAME ^
        --planning-status=SUBMITTED ^
        --zone=ZONE
    

    Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
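The verification in step 4 can also be scripted. The following Python sketch assumes that you have the draft request as JSON, for example by adding the gcloud CLI's `--format=json` flag to the describe command. The helper name `find_mismatches` and the `expected` dictionary are hypothetical, and the sample values come from the example output in this document.

```python
import json


def find_mismatches(draft: dict, expected: dict) -> dict:
    """Compare dotted field paths in expected against the draft request.

    Returns {path: (expected_value, actual_value)} for each field that differs.
    """
    mismatches = {}
    for path, want in expected.items():
        value = draft
        for key in path.split("."):
            # A missing field is reported as None.
            value = value.get(key) if isinstance(value, dict) else None
        if value != want:
            mismatches[path] = (want, value)
    return mismatches


# Trimmed sample based on the example output in this document.
draft = json.loads("""
{
  "name": "example-draft-request",
  "planningStatus": "DRAFT",
  "reservationName": "example-reservation",
  "specificSkuProperties": {
    "instanceProperties": {"machineType": "a3-ultragpu-8g"},
    "totalCount": "2"
  },
  "timeWindow": {
    "startTime": "2026-01-27T19:20:00Z",
    "endTime": "2026-02-10T19:20:00Z"
  }
}
""")

# The details that you gave to your account team.
expected = {
    "specificSkuProperties.instanceProperties.machineType": "a3-ultragpu-8g",
    "specificSkuProperties.totalCount": "2",
    "timeWindow.startTime": "2026-01-27T19:20:00Z",
    "timeWindow.endTime": "2026-02-10T19:20:00Z",
    "reservationName": "example-reservation",
}

print(find_mismatches(draft, expected))  # {}
```

An empty result means that every listed field matches. If any field differs, contact your account team instead of submitting the request.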

REST

  1. To view a list of future reservation requests in your project, make a GET request to the futureReservations.list method.

    Your request must include the following HTTP method and request URL:

    GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING
    

    Replace the following:

    • PROJECT_ID: the ID of the project where Google created the draft future reservation request.

    • ZONE: the zone where Google created the request.

    To send your request, select one of the following options:

    curl (Bash)

    curl -X GET \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING"
    

    PowerShell

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    
    Invoke-WebRequest `
        -Method GET `
        -Headers $headers `
        -Uri "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING" | Select-Object -Expand Content
    

    curl (cmd.exe)

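    rem cmd.exe doesn't support $(...), so store the access token in a variable first.
    for /f "delims=" %t in ('gcloud auth print-access-token') do set TOKEN=%t
    curl -X GET ^
         -H "Authorization: Bearer %TOKEN%" ^
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING"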
    
  2. In the request output, look for the reservation request that has the name that you provided to your account team.

  3. To view the details of the draft request, make a GET request to the futureReservations.get method.

    Your request must include the following HTTP method and request URL:

    GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME
    

    Replace FUTURE_RESERVATION_NAME with the name of the draft future reservation request.

    To send your request, select one of the following options:

    curl (Bash)

    curl -X GET \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME"
    

    PowerShell

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    
    Invoke-WebRequest `
        -Method GET `
        -Headers $headers `
        -Uri "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME" | Select-Object -Expand Content
    

    curl (cmd.exe)

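    rem cmd.exe doesn't support $(...), so store the access token in a variable first.
    for /f "delims=" %t in ('gcloud auth print-access-token') do set TOKEN=%t
    curl -X GET ^
         -H "Authorization: Bearer %TOKEN%" ^
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME"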
    

    The response is similar to the following:

    {
      "specificSkuProperties": {
        "instanceProperties": {
          "machineType": "a3-ultragpu-8g",
          "guestAccelerators": [
            {
              "acceleratorType": "nvidia-h200-141gb",
              "acceleratorCount": 8
            }
          ],
          "localSsds": [
            {
              "diskSizeGb": "375",
              "interface": "NVME"
            },
            ...
          ]
        },
        "totalCount": "2"
      },
      "kind": "compute#futureReservation",
      "id": "7979651787097007552",
      "creationTimestamp": "2025-11-27T11:14:58.305-08:00",
      "selfLink": "https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b/futureReservations/example-draft-request",
      "selfLinkWithId": "https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b/futureReservations/7979651787097007552",
      "zone": "https://www.googleapis.com/compute/v1/projects/example-project/zones/europe-west1-b",
      "name": "example-draft-request",
      "timeWindow": {
        "startTime": "2026-01-27T19:20:00Z",
        "endTime": "2026-02-10T19:20:00Z"
      },
      "status": {
        "procurementStatus": "DRAFTING",
        "lockTime": "2026-01-27T19:15:00Z"
      },
      "planningStatus": "DRAFT",
      "specificReservationRequired": true,
      "reservationName": "example-reservation",
      "deploymentType": "DENSE",
      "schedulingType": "INDEPENDENT",
      "autoCreatedReservationsDeleteTime": "2026-02-10T19:20:00Z"
    }
    
  4. In the output, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you requested a commitment, verify that it's specified. If the details are incorrect, then contact your account team.

  5. To submit the draft request for review, make a PATCH request to the futureReservations.update method.

    Your request must include the following HTTP method and request URL:

    PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus
    

    In the request body, include the following:

    {
      "name": "FUTURE_RESERVATION_NAME",
      "planningStatus": "SUBMITTED"
    }
    

    Save the request body in a file named request.json. Then, to send your request, select one of the following options:

    curl (Bash)

    curl -X PATCH \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         -H "Content-Type: application/json; charset=utf-8" \
         -d @request.json \
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus"
    

    PowerShell

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }
    
    Invoke-WebRequest `
        -Method PATCH `
        -Headers $headers `
        -ContentType: "application/json; charset=utf-8" `
        -InFile request.json `
        -Uri "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus" | Select-Object -Expand Content
    

    curl (cmd.exe)

    rem cmd.exe doesn't support $(...), so store the access token in a variable first.
    for /f "delims=" %t in ('gcloud auth print-access-token') do set TOKEN=%t
    curl -X PATCH ^
         -H "Authorization: Bearer %TOKEN%" ^
         -H "Content-Type: application/json; charset=utf-8" ^
         -d @request.json ^
         "https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus"
    

    Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
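The REST calls above can also be made from a script. The following Python sketch builds the list and submit requests with the standard library only and doesn't send them. The helper names are hypothetical, and the access token is assumed to come from `gcloud auth print-access-token`.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://compute.googleapis.com/compute/v1"


def resource_url(project_id: str, zone: str, name: str = "") -> str:
    """Build the futureReservations URL for the collection or a single request."""
    url = f"{BASE}/projects/{project_id}/zones/{zone}/futureReservations"
    return f"{url}/{urllib.parse.quote(name, safe='')}" if name else url


def build_list_drafts_request(project_id, zone, token):
    """GET request that lists requests whose procurement status is DRAFTING."""
    query = urllib.parse.urlencode({"filter": "status.procurementStatus=DRAFTING"})
    return urllib.request.Request(
        f"{resource_url(project_id, zone)}?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def build_submit_request(project_id, zone, name, token):
    """PATCH request that sets planningStatus to SUBMITTED."""
    body = json.dumps({"name": name, "planningStatus": "SUBMITTED"}).encode("utf-8")
    return urllib.request.Request(
        f"{resource_url(project_id, zone, name)}?updateMask=planningStatus",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="PATCH",
    )


request = build_submit_request("example-project", "europe-west1-b",
                               "example-draft-request", "TOKEN")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it lets you inspect the URL and body before you commit to the submission, which you can't cancel after the request reaches the PROVISIONING state.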

What's next