This document lists the quotas and system limits that apply to Agent Registry.
- Quotas have default values, but you can typically request adjustments.
- System limits are fixed values that can't be changed.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value and automate quota adjustments
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
For more information, see the Cloud Quotas overview.
Allocation quotas
Allocation quotas restrict the amount of a specific resource that Agent Registry lets you use at a given time. When you create a resource, your available quota for that resource decreases. When you delete the resource, the quota is restored.
The following table lists the allocation quotas that apply to Agent Registry and the default value for each quota.
| Quota | Value |
|---|---|
| Maximum agents per project | 100 (Global and per region) |
| Maximum MCP servers per project | 100 (Global and per region) |
| Maximum endpoints per project | 100 (Global and per region) |
| Maximum bindings per project | 100 (Global and per region) |
| Maximum skills per project | 100 (Global and per region) |
Rate quotas
Rate quotas restrict the rate at which you can consume a resource. Per-minute rate quotas reset every minute. The following table lists the rate quotas that apply to Agent Registry and the default value for each quota.
| Quota | Value |
|---|---|
| Aggregate API requests across global and regional methods | 12,000 per minute per project (200 QPS) |
| Global API requests | 1,200 per minute per project (20 QPS) |
| Regional API requests | 1,200 per minute per project per region (20 QPS) |
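
As a quick check of the per-minute and per-second figures in the table, the following sketch converts a per-minute quota into QPS and into the spacing between evenly paced requests. The helper name quota_pacing is hypothetical and isn't part of any Google Cloud tool. Even pacing is one possible client-side strategy, not something the quota system requires.

```shell
# Hypothetical helper: convert a per-minute quota into requests per second
# and the spacing (in milliseconds) between evenly paced requests.
# Uses integer arithmetic, so values that don't divide evenly are rounded down.
quota_pacing() (
  per_minute=$1
  qps=$((per_minute / 60))
  interval_ms=$((60000 / per_minute))
  echo "${qps} QPS, ${interval_ms} ms between requests"
)

quota_pacing 12000   # default aggregate quota
quota_pacing 1200    # default global or per-region quota
```

With the default values, this prints 200 QPS, 5 ms between requests for the aggregate quota and 20 QPS, 50 ms between requests for the global or per-region quota.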
System limits
The following table lists the system limits that apply to Agent Registry and the value for each system limit.
| System limit | Value |
|---|---|
| Display name length | 63 characters |
| Description length | 2,048 characters |
| Agent specification content size | 10 KB |
| MCP server specification content size | 10 KB |
| Maximum payload size for creating skills or skill revisions | 2 MB |
| Maximum skills or tools per service | 100 |
| Maximum pagination size | 100 |
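
Because system limits can't be raised, it can help to check values locally before you send a request. The following is a minimal sketch of such a check for the display name and description limits in the table. The function and variable names are hypothetical, and the sketch assumes that the shell's string length matches how the service counts characters, which might not hold for multibyte text.

```shell
# Limits taken from the system limits table above.
MAX_DISPLAY_NAME_CHARS=63
MAX_DESCRIPTION_CHARS=2048

# Hypothetical helper: succeeds (exit status 0) when the value fits
# within the limit, and fails (exit status 1) otherwise.
within_limit() (
  value=$1
  limit=$2
  [ "${#value}" -le "$limit" ]
)

# Example value; not a real resource.
display_name="my-support-agent"

if within_limit "$display_name" "$MAX_DISPLAY_NAME_CHARS"; then
  echo "display name OK"
else
  echo "display name exceeds ${MAX_DISPLAY_NAME_CHARS} characters" >&2
fi
```

The same helper can be reused for descriptions by passing MAX_DESCRIPTION_CHARS as the limit.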
Get quota and system limit metric names
Quotas and system limits have two types of names: display names and metric names. Display names have spaces and capitalization that make them easier for humans to read. Metric names are more likely to be lowercase and delimited by underscores instead of spaces; the exact format depends on the service.
The following instructions show how to get metric names for quotas and system limits by using either the Google Cloud console or the gcloud CLI.
Console
In the Google Cloud console, go to the IAM & Admin > Quotas & System Limits page.
The table on this page displays quotas and system limits that have usage or have adjusted values, and a reference entry for other quotas. The reference entry has the word "default" in parentheses at the end of the listing in the Name column. For example, SetIAMPolicy requests per minute per region (default) is the reference entry for the quota SetIamPolicyRequestsPerMinutePerProject.
If you don't see the Metric column, take the following steps.
- Click Column display options.
- Select Metric.
- Click OK. The Metric column appears in the table.
The Metric column shows the metric names. To filter the results, enter a property name or value in the field next to Filter.
gcloud
To get the metric names for a Google Cloud service by using the gcloud CLI, run the quotas info list command. To skip lines that don't list metric names, pass the output to a command such as grep with metric: as the search term, or use the gcloud CLI --format flag:
gcloud beta quotas info list --project=PROJECT_ID_OR_NUMBER \
--service=SERVICE_NAME --format="value(metric)"
Replace the following:
- PROJECT_ID_OR_NUMBER: the project ID or project number.
- SERVICE_NAME: the name of the service whose quota metrics you want to see. For example, the service name for Compute Engine is compute.googleapis.com. Include the googleapis.com portion of the service name.
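
The grep alternative mentioned above can be sketched as a small wrapper function. The function name list_quota_metrics is hypothetical, and the sketch assumes that the default command output places each metric name on a line that contains metric:, as the instructions above imply.

```shell
# Hypothetical wrapper: list quota information for a service and keep
# only the lines that contain "metric:".
#   $1: project ID or project number
#   $2: service name, including the googleapis.com portion
list_quota_metrics() {
  gcloud beta quotas info list --project="$1" --service="$2" | grep "metric:"
}

# Usage (requires an installed and authenticated gcloud CLI):
# list_quota_metrics PROJECT_ID_OR_NUMBER SERVICE_NAME
```

Unlike the --format flag, this approach keeps the metric: label on each line, so the --format variant is usually the better fit when another command consumes the output.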
Request a quota adjustment
To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.