Configure TPUs with GKE dynamic slicing

This document explains how to use dynamic slicing in Google Kubernetes Engine (GKE). Dynamic slicing lets you configure provisioned TPU sub-blocks into different topologies. This capability reduces the need to re-create node pools, enhances fault tolerance by allowing automatic recovery when a failure occurs, and optimizes resource utilization.

This document is intended for AI/ML engineers and platform administrators who want to optimize TPU utilization, reduce provisioning time, and improve fault tolerance for large-scale training and inference workloads.

What is dynamic slicing?

Dynamic slicing delivers flexibility in managing Cloud TPU capacity by letting you decouple TPU provisioning from slice formation. Dynamic slicing involves the following process:

  1. You provision resources as smaller units called sub-blocks. A sub-block is the fundamental logical building unit of Ironwood (TPU7x) capacity: a 16-node group of TPU VMs with a 4x4x4 topology of interconnected TPU chips. In the context of TPU All Capacity mode and dynamic slicing, a node pool maps directly to a sub-block.
  2. Dynamic slicing then stitches these sub-blocks together into larger slices.

Benefits of dynamic slicing

Dynamic slicing helps you to achieve the following:

  • Reduce time to provision: individually provisioning sub-blocks leads to faster overall provisioning because it minimizes the impact of any single failure.
  • Reduce time to recover: if a TPU chip failure occurs, the smallest unit of failure is a sub-block. Dynamic slicing isolates faulty sub-blocks so that workloads can be rescheduled on healthy sub-blocks faster than re-provisioning an entire large slice.
  • Reshape capacity: if you have diverse workload requirements, you don't need to delete and re-create node pools for topology changes, which would be necessary without dynamic slicing. Instead, you can dynamically reconfigure the provisioned node pools to match specified shapes.

Key elements of dynamic slicing

Dynamic slicing introduces the following key concepts:

  • Incremental provisioning of node pools: dynamic slicing uses incremental provisioning, which is a fault-tolerant provisioning model for node pools. This model converts all your TPU capacity into node pools of 16-node groups of TPU VMs.
  • Slice controller: a Kubernetes custom resource controller that runs within the GKE control plane and manages dynamic slicing. The slice controller manages the lifecycle of a Slice custom resource, which represents a dynamic slice. The slice controller handles creating, continuously monitoring, and deleting the Slice. When you use a scheduler, the scheduler directs the creation and deletion of the Slice custom resource.
  • Slice custom resource: represents a dynamic slice that stitches sub-blocks together based on the requested TPU topology. This process relies on the dynamic reconfiguration of the OCS network to connect the TPU node pools, which helps to ensure optimized performance. You can inspect the progress or health of dynamic slice formation in the Slice custom resource's status fields.

Requirements

To use dynamic slicing in GKE, you must meet the following requirements:

  • Use a Standard cluster in version 1.35.0-gke.274500 or later, in the Rapid channel.
  • Use the Ironwood (TPU7x) TPU version.
  • Use the Container-Optimized OS image for your nodes.
  • To use incremental provisioning, use All Capacity mode reservations. All Capacity mode is a feature enabled by TPU Cluster Director.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

Limitations

  • A single slice must use sub-blocks within the same TPU block under a reservation. To use sub-blocks across TPU blocks, use TPU Multislices.
  • Dynamic slicing does not support topologies smaller than 4x4x4.

Use dynamic slicing in GKE with Kueue

This section describes the workflow for using dynamic slicing in GKE.

  1. View the topology and health status of All Capacity mode reservations.
  2. Enable the slice controller in your cluster.
  3. Create TPU node pools.
  4. Configure Kueue to create a Slice custom resource.
  5. Run workloads on dynamic slicing with Kueue.
  6. Clean up.

Enable the slice controller

To use dynamic slicing, enable the slice controller in your cluster.

  1. Update your cluster:

    gcloud container clusters update CLUSTER_NAME \
        --location=LOCATION \
        --enable-slice-controller
    

    Replace the following:

    • CLUSTER_NAME: the name of your cluster.
    • LOCATION: the location of your cluster.

  2. Get credentials so that you can communicate with your cluster with kubectl commands:

    gcloud config set container/cluster CLUSTER_NAME
    gcloud container clusters get-credentials CLUSTER_NAME \
        --location=LOCATION
    
  3. In the output of the following command, verify that the slices.accelerator.gke.io value is present:

    kubectl get crd slices.accelerator.gke.io
    

    The output is similar to the following:

    slices.accelerator.gke.io                2026-01-09T23:58:02Z
    

Create node pools with incremental provisioning

This section describes how to create the TPU node pools with incremental provisioning. GKE converts all your TPU capacity into node pools of 16-node groups of TPU VMs, or sub-blocks. GKE provisions these node pools even when it can't find all 16 healthy VMs: it places nodes on the healthy parts of the host machines and incrementally adds the remaining nodes as unhealthy machines are repaired.

You can target your node pool to belong to any of the following:

  • A specific block of TPUs, which is exposed in All Capacity mode reservations. Block targeting allows GKE to create the node pool in any available sub-block within the specified block.
  • A specific sub-block, or a specific 16-node group of TPU VMs, of TPUs for more granular control.

Create a workload policy

To create a TPU slice node pool with Ironwood (TPU7x), you must first create a workload policy with the accelerator-topology-mode field set to provision_only. This setting triggers the incremental provisioning process.

Create a workload policy:

gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
        --project=PROJECT_ID \
        --region=REGION  \
        --type=HIGH_THROUGHPUT \
        --accelerator-topology=4x4x4 \
        --accelerator-topology-mode=provision_only

Replace the following:

  • WORKLOAD_POLICY_NAME: a name for your workload policy.
  • PROJECT_ID: your Google Cloud project ID.
  • REGION: the region for the workload policy.

In this command, do the following:

  • Always set the accelerator-topology field to 4x4x4 to match the total number of chips within a single sub-block.
  • Always set the accelerator-topology-mode field to provision_only to trigger the incremental provisioning process. With this mode set, the node pool provisions TPU nodes without forming ICI or OCS links.

Target your node pool to belong to a block or a sub-block

You can target specific sub-blocks or blocks within your All Capacity mode reservation.

  • Target a block: each node pool uses capacity from a specified block. GKE places the node pool within an available sub-block in that block. You must create as many node pools as there are sub-blocks in the block you want to use.
  • Target a sub-block: each node pool maps to a specific and available sub-block. When using sub-block targeting, GKE creates the node pool if at least one VM is healthy. Incremental provisioning ensures that all nodes are placed within the specified sub-block.

Block

  1. To retrieve the name of the block in a reservation and the count of available sub-blocks in the block, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

    1. Identify the name of the block by listing all reservation blocks and copying the value in the name: field. This value is the name of the block or BLOCK_NAME in this document.

    2. Determine how many node pools to create by describing a reservation block and identifying the value in the reservationSubBlockCount field. This value is the number of sub-blocks available. For example, the reservationSubBlockCount: 4 value indicates that the block has four sub-blocks available, and you need to create four separate node pools.

  2. Set the reservation path:

    export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME"
    

    Replace the following:

    • PROJECT_ID: your Google Cloud project ID.
    • RESERVATION_NAME: the name of your TPU reservation.
    • BLOCK_NAME: the name of the block.
  3. Create a node pool for each sub-block identified in the preceding step. For example, if the count is 4, run this command four times. Use a unique name for each node pool.

    gcloud container node-pools create NODE_POOL_NAME \
          --cluster=CLUSTER_NAME \
          --node-locations=ZONE \
          --machine-type=tpu7x-standard-4t \
          --num-nodes=16 \
          --placement-policy=WORKLOAD_POLICY_NAME \
          --reservation-affinity=specific \
          --reservation=${RESERVATION_PATH}
    

    Replace the following:

    • NODE_POOL_NAME: the name of your new node pool.
    • CLUSTER_NAME: the name of your GKE cluster.
    • WORKLOAD_POLICY_NAME: the name of the workload policy you created.
    • ZONE: the zone for the node pool, for example, us-central1-a.
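If the block has several sub-blocks, you can script the repeated node pool creation. The following is a sketch, assuming four sub-blocks; the pool name prefix is an example, and you replace the remaining placeholders as in the preceding command:

```
# Create one node pool per available sub-block (reservationSubBlockCount).
# Assumes RESERVATION_PATH is exported as shown above.
SUBBLOCK_COUNT=4
for i in $(seq 1 "$SUBBLOCK_COUNT"); do
  gcloud container node-pools create "sub-block-pool-${i}" \
      --cluster=CLUSTER_NAME \
      --node-locations=ZONE \
      --machine-type=tpu7x-standard-4t \
      --num-nodes=16 \
      --placement-policy=WORKLOAD_POLICY_NAME \
      --reservation-affinity=specific \
      --reservation="${RESERVATION_PATH}"
done
```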

Sub-block

  1. To retrieve the name of the block and the IDs of the available sub-blocks, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

    1. To identify the name of the block, list all reservation blocks and copy the value in the name: field. This value is the name of the block or BLOCK_NAME in this document.

    2. To identify the name of the sub-blocks, list all sub-blocks of a block and copy the value in the name: field for each entry under reservationSubBlocks. This value is the name of the sub-block or SUBBLOCK_NAME in this document.

  2. Set the reservation path:

    export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME/reservationSubBlocks/SUBBLOCK_NAME"
    

    Replace the following:

    • PROJECT_ID: your Google Cloud project ID.
    • RESERVATION_NAME: the name of your TPU reservation.
    • BLOCK_NAME: the name of the block.
    • SUBBLOCK_NAME: the name of the sub-block.
  3. Create the node pool:

    gcloud container node-pools create NODE_POOL_NAME \
            --project=PROJECT_ID \
            --cluster=CLUSTER_NAME \
            --node-locations=ZONE \
            --machine-type=tpu7x-standard-4t \
            --num-nodes=16 \
            --placement-policy=WORKLOAD_POLICY_NAME \
            --reservation-affinity=specific \
            --reservation=${RESERVATION_PATH}
    

    Replace the following:

    • NODE_POOL_NAME: a unique name for your new node pool, for example, sub-block-pool-1.
    • PROJECT_ID: your Google Cloud project ID.
    • CLUSTER_NAME: the name of your GKE cluster.
    • ZONE: the zone for the node pool, for example, us-central2-b.
    • WORKLOAD_POLICY_NAME: the name of the workload policy you created.

At this stage, the nodes are created, but their Inter-Chip Interconnect (ICI) links are not yet active. Therefore, you can't run workloads on these node pools directly.

To enable all the necessary ICI links to form the slice and allow workloads to be scheduled, create a dynamic slice by using one of the following methods:

  • Create a Slice custom resource. Instead of Pods, you use a Slice custom resource to define the specified topology, which the slice controller activates.
  • Schedule GKE workloads with Kueue and TAS. Kueue automatically handles the creation and deletion of Slice custom resources. Avoid manually modifying Slice custom resources created by Kueue.
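If you create the Slice custom resource yourself, its spec carries the requested TPU type and topology. The following is a minimal sketch, with field names inferred from the kubectl describe output shown later in this document; the name and namespace are examples, and your controller version might require additional fields (such as partition IDs):

```
apiVersion: accelerator.gke.io/v1beta1
kind: Slice
metadata:
  name: my-slice        # example name
  namespace: default
spec:
  type: tpu7x           # TPU generation
  topology: 4x8x8       # requested slice topology
```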

Create a dynamic slice with Kueue and TAS

In this section, you schedule GKE workloads with Kueue and TAS.

Install JobSet and Kueue resources for dynamic slicing

  1. Install JobSet:

    helm install jobset oci://registry.k8s.io/jobset/charts/jobset \
            --version 0.10.1 \
            --namespace jobset-system \
            --create-namespace \
            --set controller.resources.requests.cpu=4 \
            --set controller.resources.requests.memory=16Gi
    
  2. Install Kueue:

    helm install kueue oci://registry.k8s.io/kueue/charts/kueue \
            --version 0.16.1 \
            --namespace kueue-system \
            --create-namespace \
            --wait \
            --set controllerManager.replicas=3 \
            --set controllerManager.manager.resources.requests.cpu=16 \
            --set controllerManager.manager.resources.requests.memory=64Gi
    
  3. Install Kueue slice controller:

    kubectl apply -f https://gist.githubusercontent.com/mwysokin/cd90010d0d375b3bf57c536905692547/raw/506c36dd070f4ac222ba8a5e58ba28bbfcfa8ed3/kueue-slice-controller-v0.8.0-130.yaml
    
  4. To configure Kueue for dynamic slicing, save the following manifest as dynamic-slice-topology.yaml:

    apiVersion: kueue.x-k8s.io/v1beta1
    kind: Topology
    metadata:
      name: superslice-topology
    spec:
      levels:
      # Label to identify the physical block a sub-block belongs to.
      # Only sub-blocks from the same block can form a slice.
      - nodeLabel: cloud.google.com/gce-topology-block
      # Label to identify individual TPU sub-blocks (4x4x4 topology).
      - nodeLabel: cloud.google.com/gke-tpu-partition-4x4x4-id
      # Standard Kubernetes label for individual nodes.
      # Required to assign Pods to specific VMs.
      - nodeLabel: kubernetes.io/hostname
    ---
    apiVersion: kueue.x-k8s.io/v1beta1
    kind: ResourceFlavor
    metadata:
      name: superslice-rf
    spec:
      nodeLabels:
        cloud.google.com/gke-tpu-accelerator: tpu7x
      topologyName: superslice-topology
    ---
    apiVersion: kueue.x-k8s.io/v1beta1
    kind: AdmissionCheck
    metadata:
      name: superslice-ac
    spec:
      controllerName: accelerator.gke.io/slice
    ---
    apiVersion: kueue.x-k8s.io/v1beta1
    kind: ClusterQueue
    metadata:
      name: cq
    spec:
      namespaceSelector: {}
      admissionChecks:
      - superslice-ac
      resourceGroups:
      - coveredResources:
        - google.com/tpu
        flavors:
        - name: superslice-rf
          resources:
          - name: google.com/tpu
            nominalQuota: "999999"  # modeling unlimited quota
    ---
    apiVersion: kueue.x-k8s.io/v1beta1
    kind: LocalQueue
    metadata:
      name: lq
      namespace: default
    spec:
      clusterQueue: cq
    
  5. Apply the dynamic-slice-topology.yaml manifest:

    kubectl apply -f dynamic-slice-topology.yaml
    

    In this manifest, you configure Kueue for dynamic slicing by defining the following resources:

    • Ironwood (TPU7x) dynamic slice topology (superslice-topology): the topology defines the levels Kueue considers when it schedules dynamic slicing workloads. These levels are the following:
      • cloud.google.com/gce-topology-block label: this level is required to understand which sub-blocks belong to which blocks, because only sub-blocks from the same block can form a slice.
      • cloud.google.com/gke-tpu-partition-4x4x4-id label: this level represents individual Ironwood (TPU7x) sub-blocks (4x4x4 topology).
      • kubernetes.io/hostname label: this level is required to assign Pods to specific VMs and to observe their labels and taints.
    • Ironwood (TPU7x) SuperSlice ResourceFlavor (superslice-rf): the resource flavor for Ironwood (TPU7x) sub-blocks includes the cloud.google.com/gke-tpu-accelerator: tpu7x label to match nodes with Ironwood (TPU7x) machines.
    • SuperSlice AdmissionCheck (superslice-ac): this admission check tells Kueue not to schedule a workload until the GKE slice controller confirms that the slice has become active. The admission check is first defined and then added to the ClusterQueue that handles dynamic slicing workloads.
    • ClusterQueue (cq) and LocalQueue (lq): these resources manage google.com/tpu resources. The cq ClusterQueue includes the superslice-ac admission check. You can configure the nominalQuota for google.com/tpu in two ways:
      • Specific quota: set nominalQuota to match existing capacity for fair-sharing and quota management.
      • Unlimited quota: set nominalQuota to a very high value, such as "999999", to model unlimited quota. This configuration bypasses Kueue's quota management functionality so that you can focus on TAS and dynamic slicing.
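For the specific-quota option, you can derive the value from the reserved capacity. The following is a sketch, assuming one block with four sub-blocks; each sub-block is 16 nodes with 4 chips each:

```shell
# Compute a specific nominalQuota for google.com/tpu from reserved capacity.
# SUBBLOCKS is an example; use your block's reservationSubBlockCount value.
SUBBLOCKS=4
NOMINAL_QUOTA=$(( SUBBLOCKS * 16 * 4 ))  # 16 nodes per sub-block, 4 chips per node
echo "$NOMINAL_QUOTA"  # prints 256
```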

Define the sub-block health selection

Beyond standard node health and readiness, GKE exposes the specific state of each sub-block by using the cloud.google.com/gke-tpu-partition-4x4x4-state label. This label lets GKE account for factors that influence slice formation, such as the state of TPU links.

The cloud.google.com/gke-tpu-partition-4x4x4-state label can have one of the following values:

  • HEALTHY: the infrastructure is healthy.
  • DEGRADED: the sub-block's infrastructure is in a degraded state, for example, because of OCS link degradation. The sub-block can still form a slice, but overall performance might be lower compared to healthy sub-blocks. If you can tolerate potentially degraded performance, you can configure your workload to use DEGRADED sub-blocks by using node affinity, as shown in Example 3.
  • UNHEALTHY: the sub-block is unhealthy and can't form a slice.

The Kueue slice controller webhook checks whether a workload specifies a sub-block health requirement. If no preference is indicated, the webhook injects a default node affinity.

The behavior is as follows:

  • If a nodeSelector or nodeAffinity that targets the cloud.google.com/gke-tpu-partition-4x4x4-state label is present, it remains unchanged.
  • If no such label configuration exists, the webhook injects the following default node affinity to ensure only available sub-blocks are used:

    nodeSelector:
      cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"
    

The following section includes examples where the cloud.google.com/gke-tpu-partition-4x4x4-state label is configured to specify the different sub-block health configurations.

Run test workloads on dynamic slicing with Kueue

This section describes how to deploy workloads on dynamic slicing with Kueue and TAS. It includes three examples that show how to create a dynamic slice workload and a workload consisting of multiple slices. The workloads are submitted as JobSets.

Example 1: Single workload uses a single dynamic slice

The following example describes how to create a workload that uses a slice with a 4x12x16 topology, which is composed of 12 sub-blocks. The number of Pods is calculated as (4 * 12 * 16) chips / 4 chips per node = 192 Pods.

  1. Save the following manifest as big-super-slice.yaml:

    apiVersion: jobset.x-k8s.io/v1alpha2
    kind: JobSet
    metadata:
      name: big-super-slice
      labels:
        kueue.x-k8s.io/queue-name: lq
    spec:
      replicatedJobs:
        - name: job-jax
          replicas: 1
          template:
            spec:
              parallelism: 192  # pods per slice calculation: 4*12*16 / 4 = 192
              completions: 192
              backoffLimit: 10
              template:
                metadata:
                  annotations:
                    cloud.google.com/gke-tpu-slice-topology: 4x12x16
                spec:
                  tolerations:
                    - key: "google.com/tpu"
                      operator: "Equal"
                      value: "present"
                      effect: "NoSchedule"
                  nodeSelector:
                    cloud.google.com/gke-tpu-accelerator: tpu7x
                    cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"
                  containers:
                    - name: jax
                      image: python:latest
                      command:
                        - bash
                        - -c
                        - |
                          printenv
                          pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                          python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
                      resources:
                        limits:
                          google.com/tpu: 4
                  restartPolicy: Never
    

    In this manifest, the following fields tell Kueue the slice characteristics and topology to configure:

    • cloud.google.com/gke-tpu-slice-topology: specifies "4x12x16" as the dynamic slice topology. Requirements for the tpu7x accelerator topology include the following rules:
      • The minimum topology is 4x4x4.
      • The topology must be a three-dimensional string in the format AxBxC. For example, 4x8x8.
      • Each dimension (A, B, and C) must be a multiple of four.
      • The dimensions must be sorted in non-decreasing order: A <= B <= C. For example, 4x8x4 is invalid; it should be 4x4x8.
      • The product of the dimensions (ABC) must not exceed 9,216.
      • The largest supported slice topologies can include up to 32 sub-blocks. For example, 8x16x16 with 32 sub-blocks, 8x12x20 with 30 sub-blocks, or 12x12x12 with 27 sub-blocks are within the accepted limits.
    • cloud.google.com/gke-tpu-accelerator: tpu7x: schedules Pods on VMs that run Ironwood (TPU7x).
    • kueue.x-k8s.io/queue-name: assigns the JobSet to a Kueue LocalQueue.
  2. Apply the big-super-slice.yaml manifest:

    kubectl apply -f big-super-slice.yaml
    

    After you apply the manifest, Kueue creates a JobSet named big-super-slice. Kueue then attempts to form a single dynamic slice with a 4x12x16 topology. After the slice is active, Kueue admits the workload, and the 192 Pods are scheduled on the nodes to form the dynamic slice that runs your workloads.
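The topology rules listed above can be checked mechanically. The following sketch is a hypothetical helper (not part of GKE or Kueue) that validates a tpu7x topology string and prints the per-slice Pod count used for parallelism and completions:

```shell
# Hypothetical helper: validate a tpu7x slice topology string against the
# rules above and print the per-slice Pod count.
validate_topology() {
  local topo="$1"
  [[ "$topo" =~ ^([0-9]+)x([0-9]+)x([0-9]+)$ ]] || { echo "invalid format: want AxBxC"; return 1; }
  local a=${BASH_REMATCH[1]} b=${BASH_REMATCH[2]} c=${BASH_REMATCH[3]}
  (( a % 4 == 0 && b % 4 == 0 && c % 4 == 0 )) || { echo "each dimension must be a multiple of 4"; return 1; }
  (( a >= 4 )) || { echo "minimum topology is 4x4x4"; return 1; }
  (( a <= b && b <= c )) || { echo "dimensions must be in non-decreasing order"; return 1; }
  (( a * b * c <= 9216 )) || { echo "chip count exceeds 9216"; return 1; }
  echo $(( a * b * c / 4 ))  # Pods: total chips / 4 chips per node
}

validate_topology 4x12x16  # prints 192 (the parallelism used in Example 1)
validate_topology 8x16x16  # prints 512 (32 sub-blocks)
```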

Example 2: Workload with more than one replica

The following example demonstrates how to create a workload that uses two dynamic slices, each composed of four sub-blocks.

  1. Save the following manifest as two-super-slices.yaml:

    apiVersion: jobset.x-k8s.io/v1alpha2
    kind: JobSet
    metadata:
      name: two-super-slices
      labels:
        kueue.x-k8s.io/queue-name: lq
    spec:
      replicatedJobs:
        - name: job-jax
          replicas: 2
          template:
            spec:
              parallelism: 64  # Pods per slice calculation: (4*8*8) / 4 = 64
              completions: 64
              backoffLimit: 10
              template:
                metadata:
                  annotations:
                    cloud.google.com/gke-tpu-slice-topology: 4x8x8
                spec:
                  tolerations:
                    - key: "google.com/tpu"
                      operator: "Equal"
                      value: "present"
                      effect: "NoSchedule"
                  nodeSelector:
                    cloud.google.com/gke-tpu-accelerator: tpu7x
                    cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"
                  containers:
                    - name: jax
                      image: python:latest
                      command:
                        - bash
                        - -c
                        - |
                          printenv
                          pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                          python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
                      resources:
                        limits:
                          google.com/tpu: 4
                  restartPolicy: Never
    
  2. Apply the two-super-slices.yaml manifest:

    kubectl apply -f two-super-slices.yaml
    

In this manifest, you set replicas: 2 in the replicatedJobs field. After you apply the manifest, Kueue attempts to form two separate slices with a 4x8x8 topology. Kueue creates a dynamic slice for each replica defined in jobset.spec.replicatedJobs[].replicas. If n replicas are specified, Kueue creates n dynamic slices for the workload and waits for all slices to become active before admitting the workload.

Example 3: Workload with a single dynamic slice and NodeAffinity

Starting from Kueue 0.15, Kueue supports NodeAffinity for TAS node selection. You can use this functionality to allow both HEALTHY and DEGRADED nodes to be part of a dynamic slice. The following example shows how to configure a workload with a single dynamic slice and NodeAffinity:

  1. Save the following manifest as slice-8x8x8-na.yaml:

    apiVersion: jobset.x-k8s.io/v1alpha2
    kind: JobSet
    metadata:
      name: slice-8x8x8-na
      labels:
        kueue.x-k8s.io/queue-name: lq
    spec:
      replicatedJobs:
        - name: rj1
          replicas: 1
          template:
            spec:
              parallelism: 128
              completions: 128
              backoffLimit: 10
              template:
                metadata:
                  annotations:
                    cloud.google.com/gke-tpu-slice-topology: 8x8x8
                spec:
                  tolerations:
                    - key: "google.com/tpu"
                      operator: "Equal"
                      value: "present"
                      effect: "NoSchedule"
                  nodeSelector:
                    cloud.google.com/gke-tpu-accelerator: tpu7x
                  affinity:
                    nodeAffinity:
                      requiredDuringSchedulingIgnoredDuringExecution:
                        nodeSelectorTerms:
                          - matchExpressions:
                              - key: cloud.google.com/gke-tpu-partition-4x4x4-state
                                operator: In
                                values:
                                  - "HEALTHY"
                                  - "DEGRADED"
                  containers:
                    - name: jax
                      image: python:latest
                      command:
                        - bash
                        - -c
                        - |
                          printenv
                          pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                          python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
                      resources:
                        limits:
                          google.com/tpu: 4
                  restartPolicy: Never
    
  2. Apply the slice-8x8x8-na.yaml manifest:

    kubectl apply -f slice-8x8x8-na.yaml
    

    After you apply the manifest, Kueue creates a JobSet named slice-8x8x8-na. Kueue then attempts to form a single dynamic slice with an 8x8x8 topology, which allows for both HEALTHY and DEGRADED nodes to be included due to the specified NodeAffinity. After the slice is active, Kueue admits the workload, and the 128 Pods are scheduled on the nodes forming the dynamic slice.

Monitor the status of the slice

To check the status of your dynamic slices, run the following command:

kubectl describe slice SLICE_NAME

Replace SLICE_NAME with the name of your slice. The slice name is typically derived from the JobSet name and replica index. For Example 1, a slice created by Kueue would have a name similar to default-jobset-big-super-slice-yyyyy-job-jax-0.

The output is similar to the following:

Name:         test-slice
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  accelerator.gke.io/v1beta1
Kind:         Slice
Metadata:
  Creation Timestamp:  2026-02-12T23:44:28Z
  Finalizers:
    accelerator.gke.io/slice-finalizer
  Generation:        1
  Resource Version:  1770939905695871008
  UID:               6dbbfe14-4486-4462-864d-e078d0ca8b5b
Spec:
  Partition Ids:
    5eae6a4f59d59cf30a9bf49de618eb2b
  Topology:  4x4x4
  Type:      tpu7x
Status:
  Conditions:
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:
    Reason:                ACTIVE
    Status:                True
    Type:                  Ready
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:               NodeLabelingCompleted
    Reason:                NodeLabelIsAdded
    Status:                True
    Type:                  NodeLabeled
Events:                    <none>

The slice name adheres to the following rules to ensure compatibility with underlying Compute Engine resource naming conventions:

  • Template: {namespace}-jobset-{jobset.metadata.name}-kueueHash[5-character]-{jobset.spec.replicatedJobs[].name}-sliceIndex.
  • Length: the name has 54 characters or fewer. The controller appends a hyphen and an 8-character cluster hash to create Compute Engine resource names, which have a 63-character limit.
  • Format: the name matches the regular expression ^[a-z]([-a-z0-9]*[a-z0-9])?$. The name has the following characteristics:
    • Starts with a lowercase letter.
    • Only contains lowercase letters, numbers, and hyphens (-).
    • Ends with a lowercase letter or a number (it cannot end with a hyphen).
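As a sketch, the naming rules can be expressed as a small check; valid_slice_name is a hypothetical helper, not part of the slice controller:

```shell
# Hypothetical helper: check a candidate slice name against the rules above.
valid_slice_name() {
  local name="$1"
  (( ${#name} <= 54 )) || return 1   # leaves room for "-" plus an 8-char hash under the 63-char limit
  [[ "$name" =~ ^[a-z]([-a-z0-9]*[a-z0-9])?$ ]]
}

valid_slice_name "default-jobset-big-super-slice-ab1cd-job-jax-0" && echo "valid"
valid_slice_name "Bad-Name-" || echo "rejected"
```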

Clean up

To avoid unexpected charges, delete your slices before deleting node pools.

  1. Delete the JobSet. This action triggers Kueue to delete the associated Slice custom resources.

    kubectl delete jobset JOBSET_NAME
    

    Replace JOBSET_NAME with the name of your JobSet, for example, big-super-slice.

  2. Delete the TPU node pool:

    gcloud container node-pools delete NODE_POOL_NAME \
        --cluster=CLUSTER_NAME \
        --location=LOCATION
    

(Optional) Use dynamic slicing with your own scheduler

This document focuses on using Kueue and TAS. However, you can also manage dynamic slicing with your own custom scheduler. If you choose to use a different scheduler, follow the Slice custom resource reference information.

What's next