Sequence the rollout of cluster upgrades with custom stages

This document shows you how to manage Google Kubernetes Engine (GKE) cluster upgrades that use rollout sequencing with custom stages. You create a rollout sequence by using groups of clusters organized into fleets and, optionally, subsets of clusters from those fleets. You can choose how much soak testing time you want after cluster upgrades are complete in a group, up to a maximum of 30 days. You can include both Autopilot and Standard clusters. For more information about how this feature works, see About rollout sequencing with custom stages.

If you manage rollouts across multiple fleets, we recommend that you use a dedicated project to host your RolloutSequence objects. This project acts as the parent and coordinator for the rollouts across the sequence. This project is typically not part of the sequence; that is, the project doesn't contain fleets or clusters that are part of the sequence.

Before you begin

  • Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  • Ensure that you have existing Autopilot or Standard clusters. To create a new cluster, see Create an Autopilot cluster.
  • (Optional) If you don't already have a dedicated Google Cloud project to host your RolloutSequence configuration, create one by using the Google Cloud console or another method.

  • Ensure that you have enabled the required APIs for fleets. These APIs must be enabled in your fleet host projects to create any type of rollout sequence. For your rollout sequence host project, enable the gkehub.googleapis.com API.
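As a minimal sketch of the last prerequisite, the following shell function enables the gkehub.googleapis.com API in the rollout sequence host project. The function name enable_hub_api and its argument are illustrative and not part of the gcloud CLI.

```shell
# Enable the fleet (GKE Hub) API in the rollout sequence host project.
# enable_hub_api is a hypothetical helper name; pass your own project ID.
enable_hub_api() {
  host_project_id="$1"
  gcloud services enable gkehub.googleapis.com --project="${host_project_id}"
}

# Usage (replace HOST_PROJECT_ID with the ID of your host project):
# enable_hub_api HOST_PROJECT_ID
```

This sketch covers only the host project. The fleet host projects need the APIs that are required for fleets, which aren't listed here.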

Required roles

To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

To create or modify a rollout sequence, you must have the Fleet Editor IAM role (roles/gkehub.editor) on each fleet host project in the rollout sequence and your rollout sequence host project. This role provides the following permissions:

  • gkehub.rolloutsequences.create
  • gkehub.rolloutsequences.get
  • gkehub.rolloutsequences.list
  • gkehub.rolloutsequences.update
  • gkehub.rolloutsequences.delete
  • gkehub.fleet.get

These permissions let you create, access, and modify RolloutSequence objects, and use fleets in the rollout sequence.
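Because the Fleet Editor role is needed on the rollout sequence host project and on every fleet host project, granting it can be scripted as a loop. The following is a sketch under assumptions: grant_fleet_editor is a hypothetical helper name, and the member and project IDs in the usage comment are placeholders.

```shell
# Grant roles/gkehub.editor to one principal on several projects.
# First argument: IAM member, for example user:alex@example.com (placeholder).
# Remaining arguments: the host project ID and each fleet host project ID.
grant_fleet_editor() {
  member="$1"
  shift
  for project_id in "$@"; do
    gcloud projects add-iam-policy-binding "${project_id}" \
      --member="${member}" \
      --role="roles/gkehub.editor"
  done
}

# Usage:
# grant_fleet_editor user:alex@example.com HOST_PROJECT_ID DEV_FLEET_PROJECT_ID PROD_FLEET_PROJECT_ID
```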

If you need to register or unregister clusters to a fleet, you need all of the following permissions:

For more information about the least-privileged IAM roles required for different tasks, see Get predefined role suggestions with Gemini assistance.

Configure a rollout sequence

To create a rollout sequence, your clusters must be organized into groups of fleets. You can also create granular stages that can target specific subsets of clusters within a fleet by using Kubernetes labels. For guidance on how to organize your clusters, see the Community bank example. After you organize the clusters into groups and optionally label them, you create a rollout sequence by defining the ordered list of stages and the soak time for each group.

Organize clusters into fleets

In a rollout sequence, we recommend enrolling all clusters in the same release channel. If the clusters are not enrolled in the same channel, GKE selects a version from the most conservative channel in the sequence. For example, if clusters are enrolled in both the Stable and Regular channels, GKE chooses the version from the Stable channel. We also recommend that all clusters run the same minor version in order to be eligible for the same auto-upgrade target version.

If you have already organized your clusters into fleets, you can skip the following steps and proceed to the Create subsets of clusters section.

  1. Group your clusters into fleets. You can organize your clusters by deployment environments such as Testing, Staging, and Production (recommended).
  2. Register each cluster with a fleet based on your chosen grouping.
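The registration step can be sketched with the following shell function. It makes several assumptions: the cluster lives in the same project as the fleet host project, the membership is named after the cluster, and Workload Identity is enabled for the membership. The helper name register_cluster is hypothetical.

```shell
# Register one GKE cluster with the fleet hosted in a given project.
# Assumes the cluster is in the same project as the fleet host project.
register_cluster() {
  fleet_project_id="$1"
  cluster_location="$2"
  cluster_name="$3"
  gcloud container fleet memberships register "${cluster_name}" \
    --gke-cluster="${cluster_location}/${cluster_name}" \
    --enable-workload-identity \
    --project="${fleet_project_id}"
}

# Usage:
# register_cluster FLEET_PROJECT_ID CLUSTER_LOCATION CLUSTER_NAME
```

If your fleet contains cross-project memberships, the cluster reference needs a different form than this sketch uses.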

Create subsets of clusters (optional)

To make a stage in your rollout sequence target specific clusters, you must label those clusters.

For example, to test a new version on a small subset of clusters before a full rollout, you might apply a canary label to those clusters. To add the canary label with value true to a cluster using the Google Cloud CLI, run the following command:

gcloud container clusters update CLUSTER_NAME \
    --location=CLUSTER_LOCATION \
    --update-labels=canary=true

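Replace the following:

  • CLUSTER_NAME: the name of your cluster.
  • CLUSTER_LOCATION: the Compute Engine region or zone of your cluster.

The --update-labels=canary=true flag instructs GKE to apply the canary label to the cluster.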

For more information about adding a label to a cluster, see Add or update labels for existing clusters.
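Before you reference a label in a stage, it can help to confirm which clusters carry it. The following sketch lists the clusters in a project that have a given label. The helper name list_labeled_clusters is hypothetical, and the project ID in the usage comment is a placeholder.

```shell
# List the name and location of clusters that carry a specific label.
list_labeled_clusters() {
  project_id="$1"
  label_key="$2"
  label_value="$3"
  gcloud container clusters list \
    --project="${project_id}" \
    --filter="resourceLabels.${label_key}=${label_value}" \
    --format="value(name,location)"
}

# Usage:
# list_labeled_clusters PROJECT_ID canary true
```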

Create a rollout sequence with custom stages

A rollout sequence defines the order of upgrades by using stages. To create a rollout sequence, you must first create a YAML file that defines the stages, and then create a RolloutSequence.

To ensure that the sequence captures all clusters, each fleet must include a catch-all stage (a stage without a label selector). This catch-all stage captures all remaining clusters that GKE didn't select in earlier stages. If a cluster matches multiple stages within one RolloutSequence, GKE resolves the conflict by implicitly assigning the cluster only to the earliest matching stage.
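The earliest-stage rule can be illustrated with a small local simulation. This is only a sketch of the behavior described above, not how GKE implements it. It assumes a hypothetical three-stage sequence (a dev fleet, a canary subset of a prod fleet, and a prod catch-all), and it matches labels as exact key=value strings instead of evaluating CEL expressions.

```shell
# Hypothetical stage table, in order: "fleet|label". An empty label means
# the stage is a catch-all for that fleet.
ROLLOUT_STAGES="dev| prod|canary=true prod|"

# Print the number of the earliest stage that matches a cluster.
# $1: the cluster's fleet; $2: space-separated key=value labels (may be empty).
assign_stage() {
  cluster_fleet="$1"
  cluster_labels="$2"
  stage_number=0
  for stage in ${ROLLOUT_STAGES}; do
    stage_number=$((stage_number + 1))
    stage_fleet="${stage%%|*}"
    stage_label="${stage#*|}"
    [ "${stage_fleet}" = "${cluster_fleet}" ] || continue
    if [ -z "${stage_label}" ]; then
      echo "${stage_number}"   # catch-all stage takes every remaining cluster
      return 0
    fi
    case " ${cluster_labels} " in
      *" ${stage_label} "*) echo "${stage_number}"; return 0 ;;
    esac
  done
  return 1                     # the cluster's fleet isn't in the sequence
}

assign_stage prod "canary=true"    # prints 2
assign_stage prod "team=payments"  # prints 3
```

A prod cluster labeled canary=true matches both the canary stage and the catch-all stage, and lands in the earlier one.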

The following example configuration creates three stages:

  • The first stage targets all clusters in the dev fleet. After the upgrade is complete, there is a soak time of 7 days (a value of 604800 seconds).
  • The second stage targets clusters in the prod fleet that have the label canary=true. After the upgrade is complete, there is a soak time of 7 days.
  • The third stage targets the remaining clusters in the prod fleet. After the upgrade is complete, there is a soak time of 7 days.
  1. Save the following manifest as rollout-sequence.yaml:

    - stage:
      fleet-projects:
      - projects/dev
      soak-duration: 604800s
    - stage:
      fleet-projects:
      - projects/prod
      soak-duration: 604800s
      label-selector: resource.labels.canary=='true'
    - stage:
      fleet-projects:
      - projects/prod
      soak-duration: 604800s
    

    Note the following:

    • stage: includes a fleet or a subset of clusters within a fleet. Clusters in earlier stages must be fully upgraded and soaked before the sequence proceeds to the next stage. However, if a cluster has not finished upgrading 30 days after the upgrade process begins, GKE begins the soaking period.
    • fleet-projects: a list of fleets from which to select clusters for this stage. A maximum of one fleet can be referenced per stage. A fleet is identified by the project where the fleet is hosted. This project can be a different project from the project where clusters live, if the fleet contains cross-project memberships. The format to specify a fleet project is projects/PROJECT_ID.
    • label-selector (optional): selects a subset of clusters from the specified fleets. This field uses the Common Expression Language (CEL) syntax and must begin with resource.labels.
    • soak-duration: the time to wait after all clusters in this stage are upgraded before the sequence proceeds to the next stage. Expressed in seconds with an s suffix (for example, 604800s for 7 days), up to a maximum of 30 days.
  2. To create the rollout sequence that you defined in the rollout-sequence.yaml manifest, run the following command:

    gcloud beta container fleet rolloutsequences create ROLLOUT_SEQUENCE_NAME \
        --display-name=DISPLAY_NAME \
        --stage-config=rollout-sequence.yaml
    

    Replace the following:

    • ROLLOUT_SEQUENCE_NAME: an immutable identifier that conforms to the RFC-1034 specifications, for example, test-rollout-sequence.
    • DISPLAY_NAME: a human-readable string for your rollout sequence.
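Because soak-duration values are written in seconds, a small conversion helper can reduce arithmetic mistakes when you edit the manifest. The following sketch converts whole days to the seconds format and rejects values above the 30-day maximum. The helper name soak_duration_from_days is hypothetical.

```shell
# Convert a whole number of days to a soak-duration value such as 604800s.
soak_duration_from_days() {
  days="$1"
  case "${days}" in
    ''|*[!0-9]*)
      echo "days must be a non-negative integer" >&2
      return 1
      ;;
  esac
  if [ "${days}" -gt 30 ]; then
    echo "soak time cannot exceed 30 days" >&2
    return 1
  fi
  echo "$((days * 86400))s"
}

soak_duration_from_days 7   # prints 604800s
```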

Check status of a rollout

After you configure a rollout sequence, the system automatically creates Rollout objects to manage upgrades. You can observe and track the progress of these objects using Google Cloud CLI commands.

List rollouts

To list all active and historical rollouts in your rollout sequence host project, run the following command:

gcloud beta container fleet rollouts list --project=HOST_PROJECT_ID

Replace HOST_PROJECT_ID with the ID of your rollout sequence host project.

You can omit the --project=HOST_PROJECT_ID flag if you are already in the project where your rollout sequence is hosted.

The output is similar to the following:

NAME                                              STATE      CREATE_TIME
05eb251e4f19269e23-node-1-33-5-gke-1201000-t7mqd  COMPLETED  2025-10-30T20:07:46
05eb251e4f19269e23-kcp-1-33-5-gke-1201000-djwst   COMPLETED  2025-10-30T18:07:06
05eb251e4f19269e23-node-1-33-5-gke-1125000-6bxvu  COMPLETED  2025-10-23T17:46:54
05eb251e4f19269e23-kcp-1-33-5-gke-1125000-2f6ct   RUNNING    2025-10-23T16:41:33

In the preceding output, rollout names that contain kcp refer to control plane upgrades, and names that contain node refer to node upgrades. The segment of the rollout name after kcp or node is derived from the GKE version.
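The naming convention can be used to separate control plane rollouts from node rollouts in scripts. The following sketch classifies a rollout name by looking for the kcp or node segment. The helper name rollout_kind is hypothetical, and the sketch assumes that the segment is always delimited by hyphens, as in the sample output.

```shell
# Classify a rollout by the kcp or node segment in its name.
rollout_kind() {
  case "$1" in
    *-kcp-*)  echo "control-plane" ;;
    *-node-*) echo "node" ;;
    *)        echo "unknown" ;;
  esac
}

rollout_kind 05eb251e4f19269e23-kcp-1-33-5-gke-1201000-djwst   # prints control-plane
rollout_kind 05eb251e4f19269e23-node-1-33-5-gke-1201000-t7mqd  # prints node
```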

Describe a rollout

To get detailed information about a particular rollout, including the target version, state, and which clusters have been upgraded, use the describe command with the rollout ID that you obtained from the preceding command:

gcloud beta container fleet rollouts describe ROLLOUT_ID \
  --project=HOST_PROJECT_ID

Replace the following:

  • ROLLOUT_ID: the rollout ID that you obtained when you listed the rollouts.
  • HOST_PROJECT_ID: the project ID where your rollout sequence is hosted.

For example:

gcloud beta container fleet rollouts describe 927e9a989930cf3b55-kcp-1-32-4-gke-1106006 \
  --project=my-hostfleet

The output is similar to the following:

createTime: '2025-05-26T11:47:29.909959672Z'
membershipStates:
  projects/dev-project-id/locations/us-central1/memberships/c-1:
    lastUpdateTime: '2025-05-26T12:20:55.601542481Z'
    targets:
    - cluster: projects/dev-project-id/locations/us-central1/clusters/c-1
      operation: //container.googleapis.com/v1/projects/dev-project-id/locations/us-central1/operations/operation-1234567890-abcdefg-hijklm-nopqrst
      state: SUCCEEDED
    stageAssignment: 1
  projects/dev-project-id/locations/us-central1/memberships/c-2:
    lastUpdateTime: '2025-05-26T12:22:57.151203493Z'
    targets:
    - cluster: projects/dev-project-id/locations/us-central1/clusters/c-2
      operation: //container.googleapis.com/v1/projects/dev-project-id/locations/us-central1/operations/operation-987654321-ghijkl-mno-pqr-stu-vwxyz
      state: SUCCEEDED
    stageAssignment: 1
  projects/prod-project-id/locations/us-central1/memberships/c-1:
    lastUpdateTime: '2025-05-26T13:03:34.134308942Z'
    targets:
    - cluster: projects/prod-project-id/locations/us-central1/clusters/c-1
      operation: //container.googleapis.com/v1/projects/prod-project-id/locations/us-central1/operations/operation-567891234-efghij-klm-nopq-rstu-vwxyz
      state: SUCCEEDED
    stageAssignment: 2
  projects/prod-project-id/locations/us-central1/memberships/c-2:
    lastUpdateTime: '2025-05-26T13:06:34.025261641Z'
    targets:
    - cluster: projects/prod-project-id/locations/us-central1/clusters/c-2
      operation: //container.googleapis.com/v1/projects/prod-project-id/locations/us-central1/operations/operation-765432198-01a7b896-67c2-523-6fjjh4-icmdydh
      state: SUCCEEDED
    stageAssignment: 2
name: projects/my-hostfleet/locations/global/rollouts/927e9a989930cf3b55-kcp-1-32-4-gke-1106006
rolloutSequence: projects/project-id/locations/global/rolloutSequences/my-sequence
state: COMPLETED
updateTime: '2025-07-22T07:36:51.052691989Z'
versionUpgrade:
  desiredVersion: 1.32.4-gke.1106006
  type: TYPE_CONTROL_PLANE
stages:
- state: COMPLETED
  endTime: '2025-05-26T12:22:28.828506491Z'
  stageNumber: 1
  startTime: '2025-05-26T11:48:28.772658427Z'
  soakDuration: 600s
- state: COMPLETED
  endTime: '2025-05-26T13:06:20.026390832Z'
  stageNumber: 2
  startTime: '2025-05-26T12:32:38.419372153Z'
  soakDuration: 600s

Status information for a rollout

When you describe a rollout, the stages and membershipStates fields of the output provide the progress status of each stage and cluster within that stage, respectively.

The following table lists the potential statuses of a stage:

Status          Description
PENDING         The upgrade has not yet started for this stage.
RUNNING         The upgrade is in progress for this stage. If you have configured a maintenance window for the clusters in the stage, GKE waits for the window to open before upgrading the clusters.
SOAKING         All clusters in this stage have completed their upgrades, and the stage is in its configured soaking period.
FORCED_SOAKING  The upgrade took more than the maximum upgrade time (30 days), and therefore GKE force-started the soaking phase. The upgrade still continues on any remaining clusters.
COMPLETED       The stage has finished soaking, and the rollout proceeds to the next stage.

The following table lists the potential statuses of a cluster within a sequence:

Status      Description
PENDING     The upgrade is pending on this cluster.
INELIGIBLE  This cluster is ineligible for the upgrade, possibly due to a version discrepancy. The reason for ineligibility is provided in the output.
RUNNING     The upgrade is in progress on this cluster.
SUCCEEDED   The upgrade has successfully completed on this cluster.
FAILED      The upgrade has failed on this cluster. Failed targets are retried indefinitely while their stage is active (in a RUNNING or FORCED_SOAKING state).
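To track a rollout without rerunning the describe command by hand, you can poll its top-level state field. The following is a sketch under assumptions: wait_for_rollout is a hypothetical helper, the polling interval and number of checks are arbitrary defaults, and only the COMPLETED state is treated as final.

```shell
# Poll a rollout until its top-level state is COMPLETED.
# $1: rollout ID; $2: host project ID;
# $3: seconds between checks (default 300); $4: maximum checks (default 12).
# Returns 0 when COMPLETED, 1 if gcloud fails, 2 if the checks run out.
wait_for_rollout() {
  rollout_id="$1"
  host_project_id="$2"
  interval_seconds="${3:-300}"
  max_checks="${4:-12}"
  checks=0
  while [ "${checks}" -lt "${max_checks}" ]; do
    state="$(gcloud beta container fleet rollouts describe "${rollout_id}" \
      --project="${host_project_id}" --format="value(state)")" || return 1
    echo "rollout ${rollout_id}: ${state}"
    if [ "${state}" = "COMPLETED" ]; then
      return 0
    fi
    checks=$((checks + 1))
    sleep "${interval_seconds}"
  done
  return 2
}

# Usage:
# wait_for_rollout ROLLOUT_ID HOST_PROJECT_ID 600 24
```

Because stages can soak for days, a bounded number of checks keeps the script from running indefinitely. For per-stage or per-cluster detail, inspect the stages and membershipStates fields in the full describe output.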

Manage a rollout sequence

You can control automatic cluster upgrades with rollout sequencing in several ways, as explained in the following sections.

List your rollout sequences

To list all rollout sequences in your host project, run the following command:

gcloud beta container fleet rolloutsequences list --project=HOST_PROJECT_ID

Replace HOST_PROJECT_ID with the ID of your rollout sequence host project.

Describe a rollout sequence

To see the details of a specific rollout sequence, run the following command:

gcloud beta container fleet rolloutsequences describe ROLLOUT_SEQUENCE_NAME \
  --project=HOST_PROJECT_ID

Replace the following:

  • ROLLOUT_SEQUENCE_NAME: the name of the rollout sequence.
  • HOST_PROJECT_ID: the ID of your rollout sequence host project.

The output is similar to the following:

createTime: '2025-10-23T16:40:16.403871189Z'
displayName: my-display-name
name: projects/HOST_PROJECT_ID/locations/global/rolloutSequences/ROLLOUT_SEQUENCE_NAME
stages:
- clusterSelector:
    labelSelector: resource.labels.canary=='true'
  fleetProjects:
  - projects/FLEET_PROJECT_ID
  soakDuration: 600s
- fleetProjects:
  - projects/FLEET_PROJECT_ID
  soakDuration: 300s
uid: 5c5b2ac8-9d76-45f9-92ca-5e6bd3fbcaef
updateTime: '2025-10-23T17:11:57.285678399Z'

Modify a rollout sequence

You can modify an existing rollout sequence by editing the YAML configuration file where you defined the sequence. For example, you can update the soak time for a stage or update the stage to change the order of upgrades. After you edit the file, apply the changes.

For example, if you defined the original rollout sequence in a file named rollout-sequence.yaml, edit the file as required. Then, run the following command:

gcloud beta container fleet rolloutsequences update test-rollout-sequence \
  --display-name="My Updated Rollout Sequence" \
  --stage-config=rollout-sequence.yaml

Delete a rollout sequence

To delete a rollout sequence, run the following command:

gcloud beta container fleet rolloutsequences delete ROLLOUT_SEQUENCE_NAME \
  --project=HOST_PROJECT_ID

Replace the following:

  • ROLLOUT_SEQUENCE_NAME: the name of the rollout sequence.
  • HOST_PROJECT_ID: the ID of your rollout sequence host project.

When you delete a rollout sequence, any rollouts that are in progress for that sequence are canceled. The clusters that were part of the sequence revert to the default auto-upgrade behavior for their enrolled release channel.

Migrate an existing rollout sequence to use custom stages

If you use the generally available fleet-based version of rollout sequencing, you can migrate to a sequence that uses custom stages by creating a new RolloutSequence that references your existing fleets.

Before you migrate the sequence, we recommend that you make a copy of your current rollout sequence configuration.

To migrate your rollout sequence, complete the following steps:

  1. Create a dedicated Google Cloud project to host your rollout sequence. This project is typically not part of the sequence; that is, the project doesn't contain fleets or clusters that are part of the sequence.
  2. If you want the rollout sequence to include specific clusters within a fleet, add labels to those clusters. This step is optional.
  3. Follow the instructions in Create a rollout sequence with custom stages.

    For example, the following manifest, which is named rollout-sequence-migrate.yaml, references the existing fleets in a previous rollout sequence. This manifest describes three stages, including a canary stage in the prod fleet:

    - stage:
      fleet-projects:
      - projects/dev
      soak-duration: 604800s
    - stage:
      fleet-projects:
      - projects/prod
      soak-duration: 604800s
      label-selector: resource.labels.canary=='true'
    - stage:
      fleet-projects:
      - projects/prod
      soak-duration: 604800s
    

Immediately after you define a new RolloutSequence for your fleets, GKE begins upgrading the fleets according to the new sequence, and removes the previous configuration.

Migrate a rollout sequence with custom stages to the previous rollout sequence

This section describes how to revert from rollout sequencing with custom stages back to the generally available fleet-based sequencing model. This process involves deleting the new RolloutSequence and restoring your original fleet-based configuration.

Prevent out-of-order upgrade during migration

To avoid unintended or out-of-order upgrades while you are reconfiguring your sequence, apply a maintenance exclusion to your production clusters. This step temporarily pauses all automatic upgrades on those clusters. For example, you can configure a maintenance exclusion of type no upgrades on your production clusters.
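The following sketch applies a maintenance exclusion of scope no_upgrades to one cluster. The helper name pause_upgrades is hypothetical, and the exclusion name and time window are values that you choose. The timestamps in the usage comment are placeholders in RFC 3339 format.

```shell
# Add a no_upgrades maintenance exclusion to a cluster.
# $1: cluster name; $2: cluster location; $3: exclusion name;
# $4: start time; $5: end time (timestamps such as 2025-11-01T00:00:00Z).
pause_upgrades() {
  cluster_name="$1"
  cluster_location="$2"
  exclusion_name="$3"
  start_time="$4"
  end_time="$5"
  gcloud container clusters update "${cluster_name}" \
    --location="${cluster_location}" \
    --add-maintenance-exclusion-name="${exclusion_name}" \
    --add-maintenance-exclusion-start="${start_time}" \
    --add-maintenance-exclusion-end="${end_time}" \
    --add-maintenance-exclusion-scope="no_upgrades"
}

# Usage:
# pause_upgrades CLUSTER_NAME CLUSTER_LOCATION sequence-migration 2025-11-01T00:00:00Z 2025-11-08T00:00:00Z
```

Run the function once for each production cluster, and pick an end time that leaves enough room to finish the migration.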

Delete the rollout sequence

Delete the RolloutSequence object that manages your clusters. This deletion disengages the custom stages feature.

To delete the RolloutSequence, run the following command:

gcloud beta container fleet rolloutsequences delete ROLLOUT_SEQUENCE_NAME

Replace ROLLOUT_SEQUENCE_NAME with the name of your rollout sequence.

Restore your earlier rollout sequence configuration (without custom stages)

After you delete the RolloutSequence, you can restore your original fleet-based configuration. This process involves re-creating the clusterupgrade features with their original parameters, including the upstreamFleet links and soak times for each fleet in your sequence. For more information, see Create a rollout sequence.

Remove the maintenance exclusions

After you restore your original fleet-based rollout sequencing configuration, remove the maintenance exclusion you applied in the first step in this section. GKE resumes the automatic upgrades, now governed by your restored fleet-based sequence.
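Removing the exclusion can be sketched as follows. The helper name resume_upgrades is hypothetical, and the exclusion name must match the name that you used when you created the exclusion.

```shell
# Remove a named maintenance exclusion from a cluster.
# $1: cluster name; $2: cluster location; $3: exclusion name.
resume_upgrades() {
  cluster_name="$1"
  cluster_location="$2"
  exclusion_name="$3"
  gcloud container clusters update "${cluster_name}" \
    --location="${cluster_location}" \
    --remove-maintenance-exclusion="${exclusion_name}"
}

# Usage:
# resume_upgrades CLUSTER_NAME CLUSTER_LOCATION EXCLUSION_NAME
```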

What's next