Configure Cloud Run services

This page provides an overview of configuration options for Cloud Run services. These configurations are listed in the order that they appear in the Google Cloud console when you are deploying a new service.

Configure service-level settings

Configure service-level settings, such as billing and scaling settings.

Billing

Use billing settings to control how you are charged: either per request, so that you pay only while an instance processes requests, or for the entire lifecycle of the instance.

Service scaling

You can set your service to autoscaling or manual scaling, depending on how much control you need over your scaling behavior.

When using autoscaling, each Cloud Run revision is automatically scaled to the number of instances needed for the current load, based on incoming requests, events, and CPU utilization. You can control how many instances your Cloud Run service creates by setting minimum instances and maximum instances. Setting a minimum number of instances lets you avoid cold starts and reduce application latency. Setting a maximum number of instances can help to curb costs and guard against abnormally high request levels.

By default, Cloud Run automatically scales out to a specified or default maximum number of instances. For some use cases, however, you might want a fixed number of instances. Manual scaling lets you set a specific instance count, regardless of traffic or utilization, and without requiring redeployment.
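As a rough illustration of how the minimum and maximum instance settings bound autoscaling, the following Python sketch estimates an instance count from the current load. It is a simplified model under stated assumptions, not Cloud Run's actual autoscaling algorithm, which also takes CPU utilization into account. The function name and the `requests_per_instance` parameter are hypothetical.

```python
import math


def estimate_instances(concurrent_requests: int, requests_per_instance: int,
                       min_instances: int, max_instances: int) -> int:
    """Estimate how many instances a load needs (simplified, illustrative model)."""
    if requests_per_instance < 1:
        raise ValueError("requests_per_instance must be at least 1")
    if min_instances < 0 or max_instances < 1 or min_instances > max_instances:
        raise ValueError("need 0 <= min_instances <= max_instances and max_instances >= 1")
    # Instances required if each one handles up to requests_per_instance at once.
    demand = math.ceil(concurrent_requests / requests_per_instance)
    # Clamp the demand between the configured minimum and maximum.
    return max(min_instances, min(demand, max_instances))
```

In this model, a minimum above zero keeps instances available when there is no traffic, and the maximum caps the count no matter how high the load climbs.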

Containers: Settings

Customize your service by configuring the capacity, GPU, health checks, timeouts, maximum concurrency, and the execution environment.

Capacity

You can control the amount of memory and CPU a service can use.

GPU

If you host AI workloads, such as model inference or model training, you can configure your Cloud Run service with a GPU.

Health checks

Cloud Run lets you configure two types of health check probes. One probe determines when the container is ready to accept traffic, and the other determines whether to restart the container. Learn more about container health checks.
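To make the two kinds of probes concrete, here is a minimal sketch of a container process that answers both using only the Python standard library. The paths `/ready` and `/healthz`, the class name, and the `make_server` helper are hypothetical choices for illustration; the actual probe paths are whatever you set in the health check configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class ProbeHandler(BaseHTTPRequestHandler):
    """Answers two hypothetical probe paths: /ready and /healthz."""

    # Set to True once startup work (loading data, warming caches) is done.
    ready = False

    def do_GET(self):
        if self.path == "/ready":
            # Readiness-style check: report failure until startup has finished.
            status = 200 if ProbeHandler.ready else 503
        elif self.path == "/healthz":
            # Liveness-style check: the process is up and able to respond.
            status = 200
        else:
            status = 404
        body = b"ok\n" if status == 200 else b"not ok\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep probe traffic out of the default stderr access log.
        pass


def make_server(port: int) -> HTTPServer:
    """Build (but do not start) a server that answers the probe paths."""
    return HTTPServer(("0.0.0.0", port), ProbeHandler)
```

Calling `make_server(8080).serve_forever()` would start the server. The application sets `ProbeHandler.ready = True` when it can accept traffic, so the first path fails until then while the second succeeds as long as the process responds.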

Timeouts

You can set a Cloud Run request timeout that specifies the time within which a response must be returned.

Maximum concurrency

You can configure the maximum concurrent requests per instance. You can increase this to a maximum of 1000.

Execution environment

Cloud Run has two execution environments. Learn about the differences between the two execution environments.

Containers: Variables & Secrets

Configure environment variables and secrets to securely manage your service.

Environment variables

You can create key-value pairs for use with your Cloud Run service. See Configure environment variables for services to learn more.
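Inside the container, these key-value pairs appear as ordinary process environment variables. The sketch below shows one way a Python service might read them, with defaults for local runs. The variable names `LOG_LEVEL`, `GREETING`, and `MAX_ITEMS`, and the `load_settings` helper, are hypothetical.

```python
import os
from typing import Mapping


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Read service settings from environment variables, with local defaults."""
    return {
        # Hypothetical variable names; use whatever keys you configured.
        "log_level": environ.get("LOG_LEVEL", "INFO"),
        "greeting": environ.get("GREETING", "Hello"),
        # Environment values are always strings, so convert numbers explicitly.
        "max_items": int(environ.get("MAX_ITEMS", "50")),
    }
```

Passing the environment as a parameter keeps the function easy to test: `load_settings({"LOG_LEVEL": "DEBUG"})` uses the supplied mapping instead of the real process environment.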

Secrets

You can use Secret Manager with your Cloud Run service to securely store API keys, passwords, and other sensitive information. See Configure secrets to learn more.

Containers: Volume mounts

Cloud Run volume mounts let your container access shared data, such as storage bucket or file server content, as if it were stored in a local file system. You can mount a Cloud Storage bucket, an NFS share like a Filestore instance, or an in-memory filesystem provided by Cloud Run.
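Because a mounted volume appears to the container as a directory, application code can use normal file operations on it. The following sketch lists the files under a mount point. The helper name is hypothetical, and the mount path is whatever you chose when configuring the volume.

```python
from pathlib import Path
from typing import List


def list_mounted_files(mount_path: str) -> List[str]:
    """Return the relative paths of all files under a mounted volume."""
    root = Path(mount_path)
    if not root.is_dir():
        raise FileNotFoundError(f"mount path not found: {mount_path}")
    # Walk the directory tree exactly as for any local directory.
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob("*") if p.is_file())
```

For example, `list_mounted_files("/mnt/data")` would list the contents of a volume mounted at the illustrative path `/mnt/data`.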

Networking: Traffic splitting

Each time you deploy or redeploy a service, Cloud Run automatically creates a new revision of that service. With traffic splitting, you choose what percentage of requests each revision receives. See Session affinity and traffic splitting for more details.
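Traffic splitting assigns each revision a percentage of incoming requests. The sketch below models that idea as a weighted random choice. It is an illustration only, not how Cloud Run routes requests internally, and it ignores session affinity. The function name and the revision names in the usage note are hypothetical.

```python
import random
from typing import Mapping


def pick_revision(split: Mapping[str, int], rng: random.Random) -> str:
    """Choose a revision name at random, weighted by its traffic percentage."""
    if any(percent < 0 for percent in split.values()):
        raise ValueError("traffic percentages must not be negative")
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    names = list(split)
    weights = [split[name] for name in names]
    # random.Random.choices performs the weighted selection.
    return rng.choices(names, weights=weights, k=1)[0]
```

With a split such as `{"rev-old": 90, "rev-new": 10}`, about one request in ten would go to the newer revision in this model, which is the pattern behind a gradual rollout.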

Security: Service identity

The Cloud Run service identity is the service account that is used as the authenticated account for accessing Google Cloud APIs from your Cloud Run instance container. We recommend that you create a service account and determine the most minimal set of permissions that the service account needs to access specific Google Cloud resources.

Postdeployment

Once your service is successfully deployed, you can continue configuring it to meet your needs.

Labels

Cloud Run labels are key-value pairs that you can apply to Cloud Run services, revisions, and Cloud Run functions. Labels help you organize your Cloud Run resources and manage costs at scale with the granularity you need.

Labels you previously set for your Cloud Run functions using either gcloud functions commands or the Cloud Functions v2 API propagate to Cloud Run when you deploy your functions in Cloud Run.

Recommendations

See Optimize with Recommender to learn the optimizations provided by Recommender on Cloud Run.

Tag services

Tags are key-value pairs that you can apply to your resources in the Google Cloud console for fine-grained access control.

Tag administrators create tags for resources across Google Cloud at the organization or project level. Tags provide a way to conditionally allow or deny policies based on whether a resource has a specific tag. To learn more, see Tag services.