GPU quota

To create Confidential VM instances with attached NVIDIA GPUs, you must have sufficient GPU quota in your Google Cloud project for the specific GPU type, region, and instance provisioning model that you want to use.

NVIDIA RTX PRO 6000

Confidential VM instances that are based on the G4 machine type use NVIDIA RTX PRO 6000 GPUs and consume GPU quotas based on the instance provisioning model:

  • Standard (on-demand) VM instances: Consume the GPU_FAMILY:NVIDIA_RTX_PRO_6000 quota.

  • Spot VM and Flex-start VM instances: Consume the PREEMPTIBLE_NVIDIA_RTX_PRO_6000_GPUS quota.
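The mapping above can be written as a small lookup, which is useful in scripts that check quota before they create an instance. This is a minimal sketch: the quota names come from the list above, while the ProvisioningModel enum and the g4_quota_name function are hypothetical names introduced for illustration and aren't part of any Google Cloud library.

```python
from enum import Enum


class ProvisioningModel(Enum):
    """Hypothetical enum for the provisioning models named above."""
    STANDARD = "standard"
    SPOT = "spot"
    FLEX_START = "flex-start"


# Quota names as listed above for G4 (NVIDIA RTX PRO 6000) instances.
_G4_QUOTA_BY_MODEL = {
    ProvisioningModel.STANDARD: "GPU_FAMILY:NVIDIA_RTX_PRO_6000",
    ProvisioningModel.SPOT: "PREEMPTIBLE_NVIDIA_RTX_PRO_6000_GPUS",
    ProvisioningModel.FLEX_START: "PREEMPTIBLE_NVIDIA_RTX_PRO_6000_GPUS",
}


def g4_quota_name(model: ProvisioningModel) -> str:
    """Return the quota that a G4 instance consumes for the given model."""
    return _G4_QUOTA_BY_MODEL[model]
```

Spot and Flex-start instances share one quota in this lookup, so a script that plans for both needs to add their GPU counts together before it compares the total against the limit.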

Quota requests for NVIDIA RTX PRO 6000 GPUs follow the standard Google Cloud quota process. For more information, see View and manage quotas.

NVIDIA H100

Confidential VM instances that are based on the A3 High machine type use NVIDIA H100 GPUs. To create a Confidential VM instance with GPUs, you need sufficient quota of both of the following types:

  • Regional preemptible quota for the GPU model that you want to use, in each region where you want to create instances.

  • Global quota for the total number of GPUs of all types in all regions.

To request an increase to these GPU quotas, see Request preemptible quota and Request global quota.
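Because both quotas must have room, it can help to check them together before you create an instance. The following is a minimal sketch under stated assumptions: the Quota dataclass and the h100_quota_shortfalls function are hypothetical helpers, and the limit and usage values are assumed to be numbers that you read from the Quotas page yourself. Only the quota names PREEMPTIBLE_NVIDIA_H100_GPUS and GPUS_ALL_REGIONS come from this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Quota:
    """Hypothetical record of one quota's limit and current usage."""
    name: str
    limit: int
    usage: int

    @property
    def available(self) -> int:
        # Never report negative headroom.
        return max(self.limit - self.usage, 0)


def h100_quota_shortfalls(
    gpus_needed: int, regional_preemptible: Quota, global_all_regions: Quota
) -> List[str]:
    """Return one message per quota that can't cover gpus_needed."""
    shortfalls = []
    for quota in (regional_preemptible, global_all_regions):
        if quota.available < gpus_needed:
            shortfalls.append(
                f"{quota.name}: need {gpus_needed}, {quota.available} available"
            )
    return shortfalls
```

An empty list means that both quotas have enough headroom. Any entry in the list names the quota to increase by using the matching procedure on this page.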

Request preemptible quota

To request an increase to the regional preemptible quota for NVIDIA H100 GPUs, do the following:

  1. In the Google Cloud console, go to the Quotas page.


  2. In the Filter box, enter PREEMPTIBLE_NVIDIA_H100_GPUS, and then press the Enter or Return key.

  3. In the Dimensions column of the table, find the row with the region whose quota you want to increase.

  4. In that row, click More actions, and then click Edit quota.

  5. In the Quota changes pane, in the New value box, enter the number of GPUs that you want.

  6. Click Submit request.
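To decide what to enter in the New value box, you can work out the target from the current limit and usage of the region's row. The following is a minimal sketch, assuming that the New value box takes the total limit that you want rather than an increment. The row dictionaries, their keys, the example regions, and both function names are hypothetical and only mirror the columns that you read off the Quotas page.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def find_regional_row(rows: List[Row], metric: str, region: str) -> Optional[Row]:
    """Return the row for a quota metric in one region, or None if absent."""
    for row in rows:
        if row["metric"] == metric and row["region"] == region:
            return row
    return None


def new_value_for(row: Row, extra_gpus: int) -> int:
    """Return the total limit needed to run extra_gpus more GPUs.

    Assumes the requested value is a total limit, so the result is never
    lower than the current limit.
    """
    return max(int(row["limit"]), int(row["usage"]) + extra_gpus)


# Hypothetical rows copied by hand from the Quotas page.
rows = [
    {"metric": "PREEMPTIBLE_NVIDIA_H100_GPUS", "region": "us-central1", "limit": 0, "usage": 0},
    {"metric": "PREEMPTIBLE_NVIDIA_H100_GPUS", "region": "europe-west4", "limit": 8, "usage": 6},
]
row = find_regional_row(rows, "PREEMPTIBLE_NVIDIA_H100_GPUS", "europe-west4")
target = new_value_for(row, 4)
```

If the result equals the current limit, the region already has enough headroom and no request is needed.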

Request global quota

To request an increase to the global GPU quota, do the following:

  1. In the Google Cloud console, go to the Quotas page.


  2. In the Filter box, enter GPUS_ALL_REGIONS, and then press the Enter or Return key.

  3. In the resulting row, click More actions, and then click Edit quota.

  4. In the Quota changes pane, in the New value box, enter the number of GPUs that you want.

  5. Click Submit request.

What happens after a quota request

If your quota request is approved, you receive an approval email. Wait about 15 minutes after you receive the email, and then refresh the Quotas page to check for the updated quota. If the quota still hasn't been updated at that point, contact Cloud Customer Care.

If your quota request is denied, you might receive an email explaining the next steps you can take. To reapply for more quota, follow the instructions in the email.
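The follow-up flow described in this section can be summarized as a small decision function. This is a minimal sketch for illustration only: the next_step function, its parameters, and the returned strings are hypothetical, and the 15-minute wait is the approximate figure given in this section.

```python
from datetime import datetime, timedelta

# Approximate wait after the approval email, as described in this section.
PROPAGATION_WAIT = timedelta(minutes=15)


def next_step(
    approved: bool,
    email_received_at: datetime,
    now: datetime,
    quota_updated: bool,
) -> str:
    """Return the suggested next action after a quota request decision."""
    if not approved:
        return "follow the instructions in the denial email"
    if quota_updated:
        return "quota is ready to use"
    if now - email_received_at < PROPAGATION_WAIT:
        return "wait, then refresh the Quotas page"
    return "contact Cloud Customer Care"
```

For example, an approved request whose quota still shows the old value 20 minutes after the email arrived resolves to contacting Cloud Customer Care.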