Error code 429

If the number of your requests exceeds the capacity allocated to process them, then error code 429 is returned. The following table shows the error message that each quota framework returns:

Quota framework	Message
Pay-as-you-go	`Resource exhausted, please try again later.`
Provisioned Throughput	`Too many requests. Exceeded the Provisioned Throughput.`
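
As a minimal sketch, client code could use the messages in the preceding table to tell which quota framework produced a 429 error. The function name `quota_framework_for` and the returned labels are hypothetical, and the sketch assumes that the message text appears unchanged somewhere in the error that your client receives.

```python
# Message fragments copied from the table above. Matching on substrings is an
# assumption; the exact error payload your client sees may differ.
MESSAGE_FRAGMENTS = {
    "Resource exhausted": "pay-as-you-go",
    "Exceeded the Provisioned Throughput": "provisioned-throughput",
}


def quota_framework_for(message):
    """Return a label for the quota framework behind a 429 message, or None."""
    for fragment, framework in MESSAGE_FRAGMENTS.items():
        if fragment in message:
            return framework
    return None  # Unrecognized message: let the caller decide what to do.
```

A caller might use the result to decide whether to retry with backoff (pay-as-you-go) or to review the PT subscription (Provisioned Throughput).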

With a Provisioned Throughput (PT) subscription, you can reserve an amount of throughput for specific generative AI models. If you don't have a PT subscription and resources aren't available to your application, then error code 429 is returned. You don't have reserved capacity in this case, but you can retry the request. The failed request isn't counted against the error rate described in your service level agreement (SLA).

For projects that have purchased PT, Gemini Enterprise Agent Platform measures a project's throughput and reserves the purchased amount of throughput for the project's actual usage.

For standard PT, when you use less than your purchased amount, errors that might otherwise be 429 are returned as 5XX and count toward the SLA error rate. For Single Zone PT, when you use less than your purchased amount, capacity-related 429 errors are treated as 5XX but don't count toward the SLA error rate. When you exceed your purchased amount, the additional requests are processed on-demand as pay-as-you-go.

Pay-as-you-go

On the pay-as-you-go quota framework, you have the following options for resolving 429 errors:

Use the global endpoint instead of a regional endpoint whenever possible.
Implement a retry strategy by using truncated exponential backoff.
If your model uses quotas, you can submit a Quota Increase Request (QIR). If your model uses Standard pay-as-you-go, smoothing traffic and reducing large spikes can help.
Subscribe to PT for a more consistent level of service. For more information, see the PT section.
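
The retry option in the preceding list can be sketched as follows. This is an illustration under stated assumptions, not a client library API: `RateLimitError` is a hypothetical exception standing in for whatever your client raises on a 429 response, and `call_with_backoff`, its parameters, and the default delays are illustrative choices.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep, jitter=random.random):
    """Call request(), retrying on RateLimitError with truncated exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the 429 to the caller.
            # Double the wait on each attempt and add up to one second of
            # random jitter so that clients don't retry in lockstep. The
            # min() caps the wait at max_delay, which is the truncation.
            delay = min(base_delay * 2 ** attempt + jitter(), max_delay)
            sleep(delay)
```

To use the sketch, wrap the request in a function that raises `RateLimitError` when the response status is 429, and pass that function as `request`. The `sleep` and `jitter` parameters exist only so that the waits can be replaced in tests.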

PT

To correct the 429 error generated by PT, do the following:

Use the Default behavior example, which doesn't set a header in prediction requests. Any overages are processed on-demand and billed as pay-as-you-go.
Increase the number of GSUs in your PT subscription.

What's next

To learn more about Standard pay-as-you-go, see Standard pay-as-you-go.
To learn more about PT, see Provisioned Throughput.
To learn about quotas and limits for Agent Platform, see Agent Platform quotas and limits.
To learn more about Google Cloud quotas and system limits, see the Cloud Quotas documentation.
