This document describes how to scale existing stateless workloads running in a Google Distributed Cloud (GDC) air-gapped Kubernetes cluster. As your container workload requirements evolve, you must scale the pods running in your stateless workloads to match demand.
This document is for developers within the application operator group who are responsible for managing application workloads for their organization. For more information, see Audiences for GDC air-gapped documentation.
Before you begin
To complete the tasks in this document, you must have the following resources and roles:
To run commands against a Kubernetes cluster, make sure you have the following resources:
Locate the Kubernetes cluster name, or ask a member of the platform administrator group what the cluster name is.
Sign in and generate the kubeconfig file for the Kubernetes cluster if you don't have one.
Use the kubeconfig path of the Kubernetes cluster to replace KUBERNETES_CLUSTER_KUBECONFIG in these instructions.
To get the required permissions to scale stateless workloads in a shared cluster, ask your Organization IAM Admin to grant you the Namespace Admin role (namespace-admin) in your project namespace.
To get the required permissions to scale stateless workloads in a standard cluster, ask your Organization IAM Admin to grant you the Cluster Developer role (cluster-developer).
Scale a deployment
Use the scaling functionality of Kubernetes to appropriately scale the number of pods running in your deployment.
Autoscale the pods of a deployment
Kubernetes offers autoscaling, which removes the need to manually update your deployment as demand changes. Complete the following steps to autoscale the pods of your deployment:
To ensure the horizontal pod autoscaler can appropriately measure the CPU percentage, set the CPU resource request on your deployment.
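As an illustration, the following minimal deployment sketch sets a CPU request on its container. The labels, image, and request value shown here are placeholders; adapt them to your workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: DEPLOYMENT_NAME
spec:
  replicas: 1
  selector:
    matchLabels:
      app: DEPLOYMENT_NAME
  template:
    metadata:
      labels:
        app: DEPLOYMENT_NAME
    spec:
      containers:
      - name: app
        image: IMAGE
        resources:
          requests:
            # The horizontal pod autoscaler computes CPU utilization
            # as a percentage of this requested value.
            cpu: 250m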
Set the horizontal pod autoscaler in your deployment:
kubectl --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG \
    -n NAMESPACE \
    autoscale deployment DEPLOYMENT_NAME \
    --cpu-percent=CPU_PERCENT \
    --min=MIN_NUMBER_REPLICAS \
    --max=MAX_NUMBER_REPLICAS

Replace the following:

KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig file for the cluster.
NAMESPACE: the namespace. For shared clusters, this must be a project namespace. For standard clusters, it can be any namespace.
DEPLOYMENT_NAME: the name of the deployment to autoscale.
CPU_PERCENT: the target average CPU utilization, as a percentage of the requested CPU, across all the pods.
MIN_NUMBER_REPLICAS: the lower limit for the number of pods the autoscaler can provision.
MAX_NUMBER_REPLICAS: the upper limit for the number of pods the autoscaler can provision.
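The autoscale command creates a HorizontalPodAutoscaler object. If you prefer a declarative workflow, a roughly equivalent manifest looks like the following sketch, assuming your cluster serves the standard autoscaling/v2 API; the placeholders match the command flags above. Apply it with kubectl apply:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: DEPLOYMENT_NAME
  namespace: NAMESPACE
spec:
  # The deployment whose replica count the autoscaler manages.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: DEPLOYMENT_NAME
  minReplicas: MIN_NUMBER_REPLICAS
  maxReplicas: MAX_NUMBER_REPLICAS
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        # Scale to keep average CPU utilization, as a percentage of
        # the requested CPU, near this value.
        type: Utilization
        averageUtilization: CPU_PERCENT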
Check the current status of the horizontal pod autoscaler:
kubectl --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG \
    -n NAMESPACE \
    get hpa

The output is similar to the following:

NAME              REFERENCE                          TARGET     MINPODS   MAXPODS   REPLICAS   AGE
DEPLOYMENT_NAME   Deployment/DEPLOYMENT_NAME/scale   0% / 50%   1         10        1          18s

The TARGET column shows the current average CPU utilization against the configured target. For example, 0% / 50% means the pods are currently idle relative to a 50% utilization target.
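To observe the autoscaler update the replica count as load changes, you can add the standard --watch flag to the same command:

kubectl --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG \
    -n NAMESPACE \
    get hpa --watch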
Manually scale the pods of a deployment
If you prefer to manually scale a deployment, run:
kubectl --kubeconfig KUBERNETES_CLUSTER_KUBECONFIG \
    -n NAMESPACE \
    scale deployment DEPLOYMENT_NAME \
    --replicas NUMBER_OF_REPLICAS
Replace the following:
KUBERNETES_CLUSTER_KUBECONFIG: the kubeconfig file for the cluster.
NAMESPACE: the namespace. For shared clusters, this must be a project namespace. For standard clusters, it can be any namespace.
DEPLOYMENT_NAME: the name of the deployment to scale.
NUMBER_OF_REPLICAS: the number of replicated Pod objects in the deployment.
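For example, the following command scales a hypothetical deployment named web-frontend in the my-project namespace to three replicas; the deployment name, namespace, and kubeconfig path are placeholders:

kubectl --kubeconfig /path/to/kubeconfig \
    -n my-project \
    scale deployment web-frontend \
    --replicas 3

Note that manual scaling sets a fixed replica count. If a horizontal pod autoscaler also targets the same deployment, the autoscaler can later adjust the count again based on observed utilization.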