This document recommends Kubernetes container workload strategies to make your applications more fault tolerant in a Google Distributed Cloud (GDC) air-gapped multi-zone universe. GDC supports Kubernetes-native container applications that are widely consumed and supported on Google Kubernetes Engine (GKE).
The strategies in this document are intended for application developers who are responsible for creating resilient workloads.
For more information, see Audiences for GDC air-gapped documentation.
Kubernetes considerations for HA apps
Achieving high availability (HA) in Kubernetes goes beyond just the control plane. You must also design and deploy container workloads in your Google Distributed Cloud (GDC) air-gapped universe resiliently. Kubernetes offers several powerful mechanisms to minimize downtime and provide highly available services even when facing infrastructure issues or during routine maintenance. The following topics are key strategies to consider for HA:
- Maintain availability with replicas and autoscaling: ensure your application has enough running instances to handle its traffic demand and survive individual pod failures.
  - Pod replicas: maintain a stable set of identical pod replicas. If a pod fails, the ReplicaSet controller automatically creates a new pod to replace it. For more information, see the Kubernetes ReplicaSet documentation.
  - Horizontal Pod Autoscaler (HPA): automatically adjust the number of replicas based on metrics like CPU utilization, letting your application handle traffic spikes. For more information, see the Kubernetes HPA documentation.
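As an illustrative sketch of these two mechanisms working together, the following manifests define a Deployment with a fixed replica count and an HPA that scales it on CPU utilization. The name my-app, the image, and the thresholds are placeholders, not values prescribed by GDC:

```yaml
# Deployment that keeps three identical replicas of a hypothetical app.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.example/my-app:1.0
---
# HPA that scales the Deployment between 3 and 10 replicas,
# targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

When an HPA manages a Deployment, the HPA's minReplicas and maxReplicas bounds take precedence over the Deployment's static replicas field after the first scaling event.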
- Minimize downtime with PodDisruptionBudget (PDB): use a PDB to limit the number of pods that can be down simultaneously during voluntary disruptions, such as node maintenance. This strategy ensures a minimum level of availability. For more information, see the Kubernetes PDB documentation.
- Spread risk with anti-affinity rules: use pod anti-affinity to ensure that multiple replicas of the same application aren't scheduled on the same node. This strategy prevents a single infrastructure failure from taking down all replicas. For more information, see the Kubernetes affinity documentation.
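A minimal sketch of both strategies follows, assuming pods labeled app: my-app (the name, label, and minAvailable value are illustrative only):

```yaml
# PDB: keep at least two matching pods running during voluntary
# disruptions such as node drains for maintenance.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
---
# Pod template fragment (placed under spec.template.spec in a
# Deployment): required anti-affinity prevents two replicas from
# being scheduled on the same node.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app
      topologyKey: kubernetes.io/hostname
```

Using requiredDuringSchedulingIgnoredDuringExecution makes the spread mandatory; if the cluster cannot always guarantee enough nodes, the softer preferredDuringSchedulingIgnoredDuringExecution variant avoids unschedulable pods.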
- Health checks with probes: implement liveness, readiness, and startup probes to help Kubernetes detect and recover from unhealthy pods. Probes ensure that traffic only reaches pods capable of handling it. For more information, see the Kubernetes probe documentation.
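The following container fragment (placed under spec.containers in a pod template) sketches all three probe types; the /healthz and /ready endpoints and port 8080 are placeholder assumptions about the application:

```yaml
# Startup probe: gives a slow-starting app up to 300s (30 x 10s)
# before liveness checks begin.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
# Liveness probe: restarts the container after three consecutive failures.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
# Readiness probe: removes the pod from Service endpoints while failing,
# so traffic only reaches pods that can handle it.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```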
- Stable endpoints with services: use Kubernetes Service objects to provide a stable IP address and DNS name for your pods, decoupling clients from individual pod lifecycles and providing internal load balancing. For more information, see the Kubernetes Services documentation.
- Graceful updates and rollbacks with deployments: use Kubernetes Deployment objects to provide graceful updates and rollbacks for your container workloads. For more information, see the Kubernetes Deployment documentation.
- Set resource requests and limits: define the CPU and memory resources required for your containers. These specifications help the Kubernetes scheduler effectively place pods to prevent resource contention from affecting availability. For more information, see the Kubernetes resource management documentation.
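These last three strategies can be sketched together in one place, again assuming a hypothetical app labeled app: my-app; the ports, image tag, and resource values are illustrative assumptions, not GDC requirements:

```yaml
# Service: a stable virtual IP and DNS name that load-balances
# across all pods matching the selector.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
---
# Deployment fragment: rolling update strategy plus per-container
# resource requests and limits.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one replica down during an update
      maxSurge: 1         # at most one extra replica created during an update
  template:
    spec:
      containers:
      - name: my-app
        image: registry.example/my-app:1.1
        resources:
          requests:          # used by the scheduler for pod placement
            cpu: "250m"
            memory: "256Mi"
          limits:            # hard caps enforced at runtime
            cpu: "500m"
            memory: "512Mi"
```

Because the readiness probe gates Service endpoints and the rolling update strategy bounds how many replicas can be unavailable, clients keep a working endpoint throughout an update, and a failed rollout can be reverted with kubectl rollout undo.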