This document recommends Kubernetes container workload strategies to make your applications more fault tolerant in a Google Distributed Cloud (GDC) air-gapped multi-zone universe. GDC supports Kubernetes-native container applications that are widely consumed and supported on Google Kubernetes Engine (GKE).
The strategies in this document are intended for application developers who are responsible for creating resilient workloads.
For more information, see Audiences for GDC air-gapped documentation.
Kubernetes considerations for HA apps
Achieving high availability (HA) in Kubernetes goes beyond just the control plane. You must also design and deploy container workloads in your Google Distributed Cloud (GDC) air-gapped universe resiliently. Kubernetes offers several powerful mechanisms to minimize downtime and provide highly available services even when facing infrastructure issues or during routine maintenance. The following topics are key strategies to consider for HA:
- Maintain availability with replicas and autoscaling: ensure your application has enough running instances to handle its traffic demand and survive individual pod failures.
  - Pod replicas: maintain a stable set of identical pod replicas. If a pod fails, the ReplicaSet controller automatically creates a new pod to replace it. For more information, see the Kubernetes ReplicaSet documentation.
  - Horizontal Pod Autoscaler (HPA): automatically adjust the number of replicas based on metrics like CPU utilization, letting your application handle traffic spikes. For more information, see the Kubernetes HPA documentation.
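As an illustrative sketch of these two mechanisms working together, the following manifests define a Deployment with a fixed replica count and an HPA that scales it on CPU utilization. The name my-app, the image, and the thresholds are placeholders, not values prescribed by GDC:

```yaml
# Deployment that keeps three identical replicas of a hypothetical app.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.example/my-app:1.0
---
# HPA that scales the Deployment between 3 and 10 replicas,
# targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

When an HPA manages a Deployment, the HPA's minReplicas and maxReplicas bounds take precedence over the Deployment's static replicas field after the first scaling event.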
- Minimize downtime with PodDisruptionBudget (PDB): use a PDB to limit the number of pods that can be down simultaneously during voluntary disruptions, such as node maintenance. This strategy ensures a minimum level of availability. For more information, see the Kubernetes PDB documentation.
- Spread risk with anti-affinity rules: use pod anti-affinity to ensure that multiple replicas of the same application aren't scheduled on the same node. This strategy prevents a single infrastructure failure from taking down all replicas. For more information, see the Kubernetes affinity documentation.
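A minimal sketch of both strategies follows, assuming pods labeled app: my-app (the name, label, and minAvailable value are illustrative only):

```yaml
# PDB: keep at least two matching pods running during voluntary
# disruptions such as node drains for maintenance.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
---
# Pod template fragment (placed under spec.template.spec in a
# Deployment): required anti-affinity prevents two replicas from
# being scheduled on the same node.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app
      topologyKey: kubernetes.io/hostname
```

Using requiredDuringSchedulingIgnoredDuringExecution makes the spread mandatory; if the cluster cannot always guarantee enough nodes, the softer preferredDuringSchedulingIgnoredDuringExecution variant avoids unschedulable pods.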
- Health checks with probes: implement liveness, readiness, and startup probes to help Kubernetes detect and recover from unhealthy pods. Probes ensure that traffic only reaches pods capable of handling it. For more information, see the Kubernetes probe documentation.
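The following container fragment (placed under spec.containers in a pod template) sketches all three probe types; the /healthz and /ready endpoints and port 8080 are placeholder assumptions about the application:

```yaml
# Startup probe: gives a slow-starting app up to 300s (30 x 10s)
# before liveness checks begin.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
# Liveness probe: restarts the container after three consecutive failures.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
# Readiness probe: removes the pod from Service endpoints while failing,
# so traffic only reaches pods that can handle it.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```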
- Stable endpoints with services: use Kubernetes Service objects to provide a stable IP address and DNS name for your pods, decoupling clients from individual pod lifecycles and providing internal load balancing. For more information, see the Kubernetes Services documentation.
- Graceful updates and rollbacks with deployments: use Kubernetes Deployment objects to provide graceful updates and rollbacks for your container workloads. For more information, see the Kubernetes Deployment documentation.
- Set resource requests and limits: define the CPU and memory resources required for your containers. These specifications help the Kubernetes scheduler effectively place pods to prevent resource contention from affecting availability. For more information, see the Kubernetes resource management documentation.
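These last three strategies can be sketched together in one place, again assuming a hypothetical app labeled app: my-app; the ports, image tag, and resource values are illustrative assumptions, not GDC requirements:

```yaml
# Service: a stable virtual IP and DNS name that load-balances
# across all pods matching the selector.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
---
# Deployment fragment: rolling update strategy plus per-container
# resource requests and limits.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one replica down during an update
      maxSurge: 1         # at most one extra replica created during an update
  template:
    spec:
      containers:
      - name: my-app
        image: registry.example/my-app:1.1
        resources:
          requests:          # used by the scheduler for pod placement
            cpu: "250m"
            memory: "256Mi"
          limits:            # hard caps enforced at runtime
            cpu: "500m"
            memory: "512Mi"
```

Because the readiness probe gates Service endpoints and the rolling update strategy bounds how many replicas can be unavailable, clients keep a working endpoint throughout an update, and a failed rollout can be reverted with kubectl rollout undo.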