<?xml version="1.0" encoding="UTF-8"?>
<!-- AUTOGENERATED FILE. DO NOT EDIT. -->
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>tag:google.com,2016:gke-new-features-release-notes</id>
  <title>Google Kubernetes Engine New Features - Release notes</title>
  <link rel="self" href="https://docs.cloud.google.com/feeds/gke-new-features-release-notes.xml"/>
  <author>
    <name>Google Cloud Platform</name>
  </author>
  <updated>2026-04-08T00:00:00-07:00</updated>

  <entry>
    <title>April 08, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#April_08_2026</id>
    <updated>2026-04-08T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#April_08_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p><a href="https://github.com/kubernetes-sigs/gateway-api/releases/tag/v1.5.0">Gateway API v1.5</a>
is supported in GKE version 1.35.2-gke.1842000 and later.
The GKE Gateway controller passes core conformance tests for
this version of the Gateway API.</p>
<h3>Feature</h3>
<p>GKE managed DRANET is now Generally Available (GA)
for GKE version 1.35.2-gke.1842000 or later.</p>
<p>GKE DRANET is a managed feature that implements the
Kubernetes Dynamic Resource Allocation (DRA) API for high-performance
networking. The GA release expands support beyond the preview phase to
include the following hardware:</p>
<ul>
<li>NVIDIA GPU Instances: Support for instances starting from A3 Ultra, including A4, A4X, and A4X Max.</li>
<li>Cloud TPU Instances: Support for TPU v6e and TPU v7x.</li>
</ul>
<p>For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra">Allocate network resources by using GKE managed DRANET</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>March 25, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#March_25_2026</id>
    <updated>2026-03-25T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#March_25_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>To give you more control over control plane version upgrades, you can now do
the following:</p>
<ul>
<li>Configure the frequency of disruption from auto-upgrades by using the cluster
disruption budget. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/cluster-disruption-budget">Control the frequency of disruption from auto-upgrades</a>.</li>
<li>Continue using an existing control plane patch for a longer period, which
facilitates large-scale upgrade and downgrade operations. For more
information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/versioning#patch-version-support">Patch version support</a>.</li>
</ul>
]]>
    </content>
  </entry>

  <entry>
    <title>March 13, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#March_13_2026</id>
    <updated>2026-03-13T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#March_13_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>In GKE version 1.35 and later, organization and cluster
administrators can granularly control which privileged Autopilot
partner workloads can run in GKE clusters. Additionally,
approved customers can authorize and run their own privileged workloads in
Autopilot mode by using custom allowlists.</p>
<p>For more information, see
 <a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-autopilot-privileged-workloads">About Autopilot privileged  workloads</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>March 10, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#March_10_2026</id>
    <updated>2026-03-10T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#March_10_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Managed OpenTelemetry for GKE is available in Preview for
clusters running version 1.34.1-gke.2178000 or later. Managed OpenTelemetry for
GKE provides a fully managed and simplified experience for
collecting OpenTelemetry Protocol (OTLP) traces, metrics, and logs on
GKE. This feature includes the following characteristics:</p>
<ul>
<li><strong>Managed collection:</strong> an in-cluster OTLP endpoint that automatically routes
telemetry to the Cloud Telemetry API.</li>
<li><strong>Automatic configuration:</strong> a new Instrumentation custom resource that
automatically injects environment variables into your workloads to simplify
OTLP ingestion.</li>
</ul>
<p>For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/managed-otel-gke">Managed OpenTelemetry for GKE</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>March 05, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#March_05_2026</id>
    <updated>2026-03-05T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#March_05_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>GKE Inference Quickstart (GIQ) now offers
recommendations for distributed AI inference. You can deploy
optimized, complete configurations for advanced models, such as the Qwen and gpt-oss
model families, on NVIDIA GPUs and Cloud TPUs.</p>
<p>This release also integrates llm-d inference scheduling through
GKE Inference Gateway. You can select optimized configurations for
workloads like Advanced Customer Support, Code Completion, and Deep Research.
This tunes your infrastructure to meet the specific latency and throughput
requirements of these applications.</p>
<p>For more information, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/machine-learning/inference/inference-quickstart">Analyze model serving performance and costs with Inference Quickstart</a>.</p>
<h3>Feature</h3>
<p>You can use automated disk type selection for Hyperdisk volumes on
GKE. This feature allows GKE to automatically
select the most appropriate disk type based on the machine type of the node
where your workload is scheduled.</p>
<p>With this feature, you can create a single StorageClass that supports clusters
with mixed VM generations. For example, GKE can provision
Hyperdisk on compatible instances (such as C3 or C4) while automatically falling
back to Persistent Disk on other generations.</p>
<p>For more information, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/hyperdisk#automated_disk_type_selection">Automated disk type selection</a>.</p>
<h3>Feature</h3>
<p>The <a href="https://docs.cloud.google.com/compute/docs/compute-optimized-machines#h4d_series">H4D machine series</a>,
designed for high performance computing (HPC) workloads, is generally available
for GKE clusters. Based on 5th generation AMD EPYC Turin with
Cloud RDMA 200 Gbps networking, H4D VMs offer 192 cores (SMT disabled), up to
1,488 GB of memory, and 3,750 GiB of Local SSD. H4D is optimized for
tightly-coupled applications that scale across multiple nodes and offers
RDMA-enabled 200 Gbps networking.</p>
<p>You can use H4D with GKE clusters in Standard mode, or with
the Performance compute class in Autopilot. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/run-hpc-workloads">Run high performance computing (HPC) workloads with H4D</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>February 24, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#February_24_2026</id>
    <updated>2026-02-24T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#February_24_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The release note for <a href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_11_2025">November 11, 2025</a> has been updated to correct the version requirements for using N4D machine types. Cluster autoscaler was incorrectly included in the list of features requiring GKE version 1.34.1-gke.2037000 or later. You can use N4D with cluster autoscaler on any available GKE version.</p>
<h3>Feature</h3>
<p>You can create a bare metal instance from the <a href="https://docs.cloud.google.com/compute/docs/general-purpose-machines#c4a_series">C4A machine series</a> with the <code>c4a-highmem-96-metal</code> machine type. This machine type is available in Public Preview for Standard clusters running GKE version 1.35.0-gke.2232000 or later. You can select this machine type by using the <code>--machine-type</code> flag when creating a cluster or node pool. For more information about the requirements and limitations of this machine type, see the <a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/arm-on-gke#arm-requirements-limitations">Requirements and limitations</a> section of the "Arm workloads on GKE" document.</p>
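<p>As a minimal sketch, the following composes a node-pool creation command with this machine type. The cluster name, location, and pool name are placeholders; only the <code>--machine-type</code> value comes from this note.</p>

```shell
# Dry sketch: compose (not run) the node-pool create command for the
# bare-metal C4A machine type. Names and location are placeholders.
compose_create_cmd() {
  echo "gcloud container node-pools create c4a-metal-pool" \
    "--cluster=example-cluster" \
    "--location=us-central1" \
    "--machine-type=c4a-highmem-96-metal"
}
compose_create_cmd
```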
]]>
    </content>
  </entry>

  <entry>
    <title>February 13, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#February_13_2026</id>
    <updated>2026-02-13T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#February_13_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>You can now determine the status and health of a TPU slice and partition by monitoring these new beta system metrics:</p>
<ul>
<li><code>kubernetes.io/accelerator/slice/state</code>: Indicates the current status of the slice.</li>
<li><code>kubernetes.io/accelerator/partition/state</code>: Indicates the health of the partition.</li>
</ul>
<p>For more information, see the <a href="https://docs.cloud.google.com/monitoring/api/metrics_kubernetes#kubernetes-kubernetes">GKE system metrics</a> documentation.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>February 05, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#February_05_2026</id>
    <updated>2026-02-05T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#February_05_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Image streaming is now available in the <code>asia-southeast3</code> region. For more information,
see the <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/image-streaming">Image streaming</a> documentation.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>February 03, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#February_03_2026</id>
    <updated>2026-02-03T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#February_03_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Image streaming and secondary boot disks are now generally available (GA) for
nodes using the Ubuntu with containerd (<code>UBUNTU_CONTAINERD</code>) image type. These
features improve workload startup performance on GKE Standard and Autopilot clusters
through image data streaming and preloaded disk data. To use these features on
Ubuntu nodes, your cluster must be running GKE version 1.35.0-gke.1403000 or later.</p>
<p>For more information, see the documentation for
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/image-streaming">Image Streaming</a>
and <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/data-container-image-preloading">Using Secondary Boot Disks</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>January 27, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#January_27_2026</id>
    <updated>2026-01-27T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#January_27_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Stream Control Transmission Protocol (SCTP) support on GKE Dataplane V2 is now
generally available (GA). You can now deploy workloads that use SCTP on
GKE Standard clusters. This feature enables direct SCTP
communication for Pod-to-Pod and Pod-to-Service traffic.</p>
<p>SCTP support requires clusters to use GKE Dataplane V2 and Ubuntu node images.
This feature is available in GKE version 1.32.2-gke.1297000 or
later.</p>
<p>For more information, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/deploy-workloads-with-sctp">Deploy workloads with
SCTP</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>January 26, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#January_26_2026</id>
    <updated>2026-01-26T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#January_26_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The <a href="https://docs.cloud.google.com/compute/docs/general-purpose-machines#n4a_series">N4A machine
series</a>
is generally available for GKE clusters in Autopilot and
Standard modes. For more information, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/arm-on-gke">Arm workloads on
GKE</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>January 21, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#January_21_2026</id>
    <updated>2026-01-21T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#January_21_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>You can now determine which Kubernetes JobSets are scheduled on which
GKE node pools and nodes by monitoring the new generally
available system metrics:</p>
<ul>
<li><code>kubernetes.io/jobset/assigned_node_pools</code>: GKE node pools
where a Kubernetes JobSet has scheduled Pods.</li>
<li><code>kubernetes.io/jobset/assigned_nodes</code>: GKE nodes where a
Kubernetes JobSet has scheduled Pods.</li>
<li><code>kubernetes.io/node_pool/assigned_jobsets</code>: Kubernetes JobSets that have
scheduled Pods on a GKE node pool.</li>
<li><code>kubernetes.io/node/assigned_jobsets</code>: Kubernetes JobSets that have
scheduled Pods on a GKE node.</li>
</ul>
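<p>For example, a Cloud Monitoring time-series filter for one of these metrics can be composed as follows. The cluster label and name are illustrative; adjust them for your monitored resource type.</p>

```shell
# Sketch: compose a Monitoring API filter string for a JobSet metric.
# The metric name is from this note; the cluster name is a placeholder.
compose_filter() {
  echo 'metric.type = "kubernetes.io/jobset/assigned_nodes"' \
    'AND resource.labels.cluster_name = "example-cluster"'
}
compose_filter
```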
]]>
    </content>
  </entry>

  <entry>
    <title>January 20, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#January_20_2026</id>
    <updated>2026-01-20T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#January_20_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The <code>asia-southeast3</code> region in Bangkok, Thailand is available. For more
information, see
<a href="https://cloud.google.com/about/locations/">Global Locations</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>January 07, 2026</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#January_07_2026</id>
    <updated>2026-01-07T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#January_07_2026"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>NodeLocal DNSCache is enabled by default on new GKE Standard
clusters created at version 1.34.1-gke.3720000
or later. NodeLocal DNSCache is a GKE add-on that improves DNS
performance by running a DNS cache directly on each cluster node as a DaemonSet.
To learn more, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/nodelocal-dns-cache">Set up NodeLocal
DNSCache</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>December 29, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#December_29_2025</id>
    <updated>2025-12-29T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#December_29_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<h4 id="new_features_in_135">New features in 1.35</h4>
<ul>
<li><strong>In-place Pod Resize:</strong> <a href="https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/">In-place Pod Resize</a> is now GA. This feature lets you modify Pod CPU and memory requests and limits in place without restarting the Pod or its containers.</li>
<li><strong>Writable cgroups:</strong> GKE <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/writable-cgroups">Writable cgroups</a> for containers is now GA. This feature allows workloads to manage resources for child processes using the Linux cgroups API, improving reliability for applications like <a href="https://www.ray.io/">Ray</a>.</li>
</ul>
]]>
    </content>
  </entry>

  <entry>
    <title>December 19, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#December_19_2025</id>
    <updated>2025-12-19T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#December_19_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Rollout sequencing with custom stages is now available in Preview. This feature
offers granular control over upgrading groups of clusters within a fleet,
allowing you to progressively roll out GKE versions across environments. For
more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/rollout-sequencing-custom-stages/about-rollout-sequencing">About rollout sequencing with custom stages</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>December 15, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#December_15_2025</id>
    <updated>2025-12-15T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#December_15_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>GKE Autopilot now supports N4A machine types in
Public Preview, available on clusters running
version 1.34.1-gke.3403001 or later.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>December 10, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#December_10_2025</id>
    <updated>2025-12-10T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#December_10_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>In GKE version 1.34.1-gke.2541000 and later, you can specify
secure tags for firewalls in the
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/reference/crds/computeclass#resourceManagerTags"><code>spec.nodePoolConfig.resourceManagerTags</code> field</a>
in ComputeClasses. GKE adds those secure tags to the nodes that
GKE creates for that ComputeClass, so that you can target
nodes by using these tags in firewall policies. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tags-firewall-policies">Selectively enforce firewall policies in GKE</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>December 03, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#December_03_2025</id>
    <updated>2025-12-03T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#December_03_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>GKE Inference Gateway is generally available (GA) and ready for production
workloads. This release introduces major performance, security, and usability
enhancements since the Public Preview.</p>
<ul>
<li><strong>Stable v1 API</strong>: The API has graduated to v1. The <code>InferenceModel</code> resource
is replaced by the <code>InferenceObjective</code> resource for a clearer definition of
serving goals. A zero-downtime migration path is available.</li>
<li><strong>Prefix-Aware Routing</strong>: A new, intelligent routing feature inspects request
context and routes requests with shared prefixes (like in conversational AI)
to the same model replica. This can maximize KV cache hits and improve
Time-to-First-Token (TTFT) latency by up to 96%.</li>
<li><strong>API Key Authentication</strong>: Secure your endpoints by enforcing API key
validation through a new integration with Apigee.</li>
<li><strong>Body-Based Routing</strong>: The gateway can route requests using the model
field directly from the HTTP request body, which enables native
compatibility with the OpenAI API specification.</li>
</ul>
<p>For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway">About GKE Inference Gateway</a>
and <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway">Deploy GKE Inference Gateway</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>November 27, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#November_27_2025</id>
    <updated>2025-11-27T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_27_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>TPU7x (Ironwood), Google's seventh-generation TPU for large-scale AI workloads,
is available in <a href="https://docs.cloud.google.com/products#product-launch-stages">Preview</a> in GKE
Standard clusters that run version 1.34.0-gke.2201000 and later, and in
Autopilot clusters that run version 1.34.1-gke.3084001 and later.
TPU7x offers a significant performance increase compared to previous
generations, with 2307 TFLOPs of BF16 performance and 192 GB of
high-bandwidth memory (HBM) per chip. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/tpus#ironwood-benefits">Get started with Ironwood (TPU7x)</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>November 24, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#November_24_2025</id>
    <updated>2025-11-24T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_24_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Fast-starting nodes are now generally available. GKE provisions
fast-starting nodes on a best-effort basis in Autopilot when workloads
use compatible configurations. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/fast-starting-nodes">About quicker workload startup with fast-starting nodes</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>November 17, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#November_17_2025</id>
    <updated>2025-11-17T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_17_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>NVIDIA recommends that Kubernetes clusters enable Coherent Driver-Based Memory
Management (CDMM) to resolve memory over-reporting. CDMM is enabled by default
on A4X nodes running the R580 GPU driver in GKE clusters with the
following versions:</p>
<ul>
<li><strong>1.33 or later</strong>: 1.33.4-gke.1036000 or later</li>
<li><strong>1.32</strong>: 1.32.8-gke.1108000 or later</li>
</ul>
<p>CDMM allows GPU memory to be managed through the driver instead of the operating
system (OS), which avoids OS onlining of GPU memory and exposes the GPU memory as
a Non-Uniform Memory Access (NUMA) node to the OS.</p>
<p>For more information about CDMM, see
<a href="https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-580-65-06/index.html#hardware-software-support">Hardware and Software Support</a>.
To create GKE clusters with A4X, see the following documents:</p>
<ul>
<li><a href="https://docs.cloud.google.com/ai-hypercomputer/docs/create/gke-ai-hypercompute">Create an AI-optimized GKE cluster with default configuration</a></li>
<li><a href="https://docs.cloud.google.com/ai-hypercomputer/docs/create/gke-ai-hypercompute-custom-a4x">Create a custom AI-optimized GKE cluster which uses A4X</a></li>
</ul>
]]>
    </content>
  </entry>

  <entry>
    <title>November 11, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#November_11_2025</id>
    <updated>2025-11-11T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_11_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>This note was updated on February 24, 2026. Cluster autoscaler was incorrectly included in the list of features that require GKE version 1.34 to use N4D. The corrected list follows.</p>
<p>The N4D machine family is now Generally Available (GA) for
Standard and Autopilot mode. N4D instances are powered by the
fifth generation AMD EPYC SP5 processors (Turin). The N4D machine series is
available as follows:</p>
<ul>
<li><strong>Compute classes, node pool auto-creation, and Autopilot mode</strong>:
GKE version 1.34.1-gke.2037000 and later.</li>
<li><strong>Manually created node pools in Standard mode</strong>: all available
GKE versions.</li>
</ul>
<p>For more information, see
<a href="https://docs.cloud.google.com/compute/docs/general-purpose-machines#n4d_series">N4D machine series</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>November 07, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#November_07_2025</id>
    <updated>2025-11-07T00:00:00-08:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#November_07_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<aside class="special"><strong>Important:</strong><span> This note is incorrect. For the correct note, see the entry for
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#April_08_2026">April 8, 2026</a>.</span></aside>
<p>In GKE version 1.34.1-gke.2037001 and later, the
GKE logging agent in your clusters can process logs up to two
times faster per node than in version 1.33 and earlier. The logging agent also
uses fewer node resources, which improves efficiency, especially if you use
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/adjust-log-throughput#enable">high-throughput logging</a>.
These improvements to the logging agent are automatically enabled in version
1.34.1-gke.2037001 and later.</p>
<h3>Feature</h3>
<p>In version 1.34.1-gke.1829001 and later, GKE can
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/node-auto-provisioning">auto-create</a> multiple
node pools concurrently to improve the speed with which multiple new node pools
become ready.</p>
<h3>Feature</h3>
<p>In GKE version 1.35 and later, GKE rejects
anonymous requests to cluster endpoints (except for the <code>/livez</code>, <code>/healthz</code>, and
<code>/readyz</code> health check endpoints) by default for all new Autopilot or
Standard clusters. Existing clusters aren't affected by this change. To
allow anonymous requests to cluster endpoints, explicitly specify a value of
<code>ENABLED</code> in the <code>--anonymous-authentication-config</code> flag or the
<code>AnonymousAuthenticationConfig.mode</code> API field. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster#restrict-anon-access">Restrict anonymous access to cluster endpoints</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>October 31, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_31_2025</id>
    <updated>2025-10-31T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_31_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The Multi-Cluster Services (MCS) feature has been updated with a finalizer to
prevent resource leaks and ensure full cleanup when the feature is
disabled. As a result of this improvement, the
disablement procedure has been updated. For details on how to disable MCS, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/multi-cluster-services#disabling_mcs">Disabling
MCS</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>October 28, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_28_2025</id>
    <updated>2025-10-28T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_28_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>Autoscaled blue-green upgrades are a type of node upgrade strategy that
maximizes the amount of time before disruption-intolerant workloads are evicted,
while minimizing cost. This feature is available in Preview for
GKE Standard node pools. For more information, see
<a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/node-pool-upgrade-strategies#autoscaled-blue-green-upgrade-strategy">Autoscaled blue-green upgrades</a>.</p>
<h3>Feature</h3>
<p>You can use the G4 VM, powered by NVIDIA's RTX PRO 6000 GPUs, with
GKE Autopilot in version 1.34.1-gke.1829001 or later. To
get started, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/autopilot-gpus">Deploy GPU workloads in
Autopilot</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>October 21, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_21_2025</id>
    <updated>2025-10-21T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_21_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The G4 VM, powered by NVIDIA's RTX PRO 6000 Blackwell Server Edition GPUs with
the AMD EPYC Turin CPU platform, is generally available on GKE.
G4 instances have up to 384 vCPUs, 1,440 GB of memory, 12 TiB of attached
Titanium SSD, and up to 400 Gbps of standard network performance. The G4 VM
offers a leap in performance, with up to 9 times the throughput of G2 instances
for workloads such as AI development and graphics rendering. G4 VMs are
currently available with 1, 2, 4, or 8 GPUs.</p>
<ul>
<li>For GKE Standard, use GKE version
1.34.0-gke.1662000 or later. To get started, see <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/gpus">Run GPUs in
GKE Standard node
pools</a>.</li>
</ul>
]]>
    </content>
  </entry>

  <entry>
    <title>October 09, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_09_2025</id>
    <updated>2025-10-09T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_09_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The following networking features are available:</p>
<ul>
<li><p>In GKE version 1.33.4-gke.1055000 or later, you can control
how external traffic reaches your Services on GKE clusters by
using Network Service Tiers. You can configure the network tier to use either
Standard Tier or Premium Tier when you create or update clusters or when you
update LoadBalancer Services. For more information, see <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/network-tiers">Configure external
traffic with Network Service Tiers</a>.</p></li>
<li><p>In GKE version 1.33 and later, you can enable
automatic IP address management (auto IPAM) on GKE clusters. Auto
IPAM dynamically adds or removes additional IP address ranges for nodes and Pods
as the cluster scales up or down. This feature eliminates the need for large,
potentially wasteful, upfront IP reservations and manual intervention during
cluster scaling. For more information, see <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/enable-auto-ipam">Use auto IP address
management</a>.</p></li>
<li><p>In GKE version 1.30.3-gke.1211000 and later, you can assign
additional subnets to a VPC-native cluster. Additional subnets
assigned to a cluster let you create new node pools where IPv4 addresses for
both nodes and Pods come from the additional subnet ranges. This enhancement
removes single-subnet limitations, increases scalability, and enhances the
flexibility of your GKE clusters. For more information, see <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/multi-subnet-cluster">Add subnets to
clusters</a>.</p></li>
</ul>
<h3>Feature</h3>
<p>For AI models deployed on a GKE cluster, you can view details
about the deployments in the Google Cloud console, including deployment
details, logs, and <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/configure-automatic-application-monitoring#aiml-dashboard">observability
dashboards</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>October 07, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_07_2025</id>
    <updated>2025-10-07T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_07_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>In GKE version 1.33.2-gke.1240000 and later, you can specify the
network tier (Standard or Premium) for ephemeral IP addresses used by
the <code>gke-l7-regional-external-managed-mc</code> GatewayClass. For more information,
see <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/deploying-gateways#configure-network-tier">Configure Network
Tier</a>.</p>
]]>
    </content>
  </entry>

  <entry>
    <title>October 01, 2025</title>
    <id>tag:google.com,2016:gke-new-features-release-notes#October_01_2025</id>
    <updated>2025-10-01T00:00:00-07:00</updated>
    <link rel="alternate" href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes#October_01_2025"/>
    <content type="html"><![CDATA[<h3>Feature</h3>
<p>The GKE cluster autoscaler now allows a significantly longer node drain time. In GKE version 1.32.7-gke.1079000 and later, the graceful node drain timeout has been increased from 10 minutes to 1 hour. For more information, see <a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler#how_cluster_autoscaler_works">How cluster autoscaler works</a>.</p>
<h3>Feature</h3>
<p>The <a href="https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler#inplaceorrecreate_mode"><code>InPlaceOrRecreate</code></a> mode for Vertical Pod Autoscaler (VPA) is now available for Public Preview in GKE.</p>
<p>This mode uses <a href="https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support">In-Place Pod Resize (IPPR/IPPU)</a>, which lets VPA automatically adjust workload resources without requiring Pod recreation. This seamless rightsizing capability helps ensure service continuity and minimize costs by optimizing resource allocation, particularly during idle periods.</p>
<p>VPA is enabled by default in Autopilot clusters. For Standard clusters, you must first enable VPA. For more information on configuring a VPA object, see
<a href="https://cloud.google.com/kubernetes-engine/docs/how-to/vertical-pod-autoscaling#gcloud">Set Pod resource requests automatically</a>.</p>
]]>
    </content>
  </entry>

</feed>
