Set up Multi-Cluster Mesh Failover
This page shows you how to design and implement a high-availability traffic routing strategy using Cloud Service Mesh in a multi-cluster environment. The following table describes the expected behavior:
| Cluster State | Traffic Behavior |
|---|---|
| Both clusters healthy | 50% of traffic to Cluster A, 50% to Cluster B |
| Cluster A becomes unavailable | 100% of traffic to Cluster B |
| Cluster A recovers | The 50/50 split is restored automatically |
Prerequisites
As a starting point, this guide assumes that you have already:
- Created two GKE clusters in two different regions, registered them to the same fleet host project, and configured them for Cloud Service Mesh.
- Set up a multi-cluster mesh on Cloud Service Mesh.
- Installed and configured the Istio control plane in both clusters.
- Deployed and exposed istio-ingressgateway in at least one cluster (Cluster A).
- Deployed the hello-world application in both clusters with sidecar injection enabled.
This lab uses the following regions:
- Cluster A: europe-west1
- Cluster B: us-central1
Set up multi-cluster mesh failover
Deploy and apply the public ingress gateway using the sample manifest from the Cloud Service Mesh repository:

```shell
cat <<EOF > istio-ingressgateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: default
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - '*'
EOF
kubectl apply -f istio-ingressgateway.yaml
```

This gateway exposes the hello-world service externally.

Create and apply a VirtualService in Cluster A to route traffic to the hello-world service:

```shell
cat <<EOF > virtual-service.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: hello-world
  namespace: default
spec:
  hosts:
  - '*'
  gateways:
  - public-gateway
  http:
  - route:
    - destination:
        host: hello-world.default.svc.cluster.local
EOF
kubectl apply -f virtual-service.yaml
```

This configuration forwards HTTP requests from the gateway to the service.

Configure and apply a DestinationRule for locality-based failover:

```shell
cat <<EOF > destination-rule.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: hello-world
  namespace: default
spec:
  host: hello-world.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        http2MaxRequests: 100
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 30s
      maxEjectionPercent: 100
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
        - from: europe-west1
          to:
            europe-west1: 50
            us-central1: 50
        - from: us-central1
          to:
            us-central1: 50
            europe-west1: 50
EOF
kubectl apply -f destination-rule.yaml
```
Note the following:
- localityLbSetting in the DestinationRule enables the even traffic split and automatic failover.
- maxEjectionPercent: 100 lets outlier detection eject every endpoint in a locality, so Istio can fail over all traffic when every endpoint in that locality is unhealthy.
- distribute sets an even 50/50 split between the two regions, based on the region of the source cluster.
- Failover is handled implicitly: this configuration has no explicit failover block. When one locality becomes unavailable, Istio routes 100% of traffic to the healthy region.
- outlierDetection ejects a failing endpoint after a single 5xx error (consecutive5xxErrors: 1) and keeps it out for at least 30 seconds (baseEjectionTime: 30s).
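
The arithmetic behind the notes above can be sketched in a few lines of shell. This is a deliberately simplified model, not Istio's or Envoy's implementation: it assumes a locality is either fully healthy or fully unavailable, drops the weight of an unavailable locality to zero, and renormalizes the remaining weights. The function name effective_split and its arguments are hypothetical and exist only for illustration.

```shell
# Simplified model of the distribute weights from the DestinationRule.
# Assumption: a locality is either fully healthy (1) or fully unavailable (0).
# The real load balancer shifts traffic gradually as endpoints are ejected;
# this sketch covers only the two extreme cases from the behavior table.
#
# usage: effective_split <weight_a> <healthy_a:0|1> <weight_b> <healthy_b:0|1>
effective_split() {
  local wa="$1" ha="$2" wb="$3" hb="$4"
  # An unavailable locality contributes no weight.
  local ea=$((wa * ha))
  local eb=$((wb * hb))
  local total=$((ea + eb))
  if [ "$total" -eq 0 ]; then
    echo "no healthy locality"
    return 1
  fi
  # Renormalize the remaining weights to percentages.
  echo "europe-west1=$((100 * ea / total))% us-central1=$((100 * eb / total))%"
}

effective_split 50 1 50 1   # both clusters healthy
effective_split 50 0 50 1   # Cluster A (europe-west1) unavailable
```

With both localities healthy the model returns the configured 50/50 split. With europe-west1 marked unavailable, all of the weight lands on us-central1, which matches the second row of the behavior table.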
Validate
You can now validate this behavior by:
- Sending requests through the Ingress Gateway in Cluster A.
- Scaling down hello-world pods in europe-west1 to 0.
- Observing traffic failover to us-central1.
- Scaling pods back up in europe-west1 and verifying that the traffic split resumes.
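
The validation steps above could be scripted roughly as follows. This is a sketch under several assumptions that the guide does not state: GATEWAY_IP holds the external IP address of the ingress gateway in Cluster A, CTX_A is the kubectl context for Cluster A, the workload is a Deployment named hello-world in the default namespace, and each response body is a single line that identifies the cluster or region that served it. The helper names are hypothetical.

```shell
# Send <count> requests to the gateway and print one response body per line.
# Assumption: the service answers on "/" over plain HTTP on port 80.
send_requests() {
  local gateway_ip="$1" count="${2:-20}" i=0
  while [ "$i" -lt "$count" ]; do
    curl -s --max-time 5 "http://${gateway_ip}/" || echo "request-failed"
    echo
    i=$((i + 1))
  done
}

# Count identical non-empty lines read from stdin.
# Prints "<count> <line>", most frequent first.
tally_responses() {
  grep -v '^[[:space:]]*$' | sort | uniq -c | sort -rn |
    awk '{ c = $1; $1 = ""; sub(/^ /, ""); print c, $0 }'
}

# Scale the hello-world Deployment in the given kubectl context.
# Assumption: the Deployment is named "hello-world" in the default namespace.
scale_hello_world() {
  local context="$1" replicas="$2"
  kubectl --context="${context}" -n default scale deployment/hello-world \
    --replicas="${replicas}"
}

# Example session (not run automatically):
#   send_requests "$GATEWAY_IP" 20 | tally_responses   # expect a roughly even split
#   scale_hello_world "$CTX_A" 0                       # take europe-west1 offline
#   send_requests "$GATEWAY_IP" 20 | tally_responses   # expect us-central1 only
#   scale_hello_world "$CTX_A" 1                       # bring europe-west1 back
#   send_requests "$GATEWAY_IP" 20 | tally_responses   # expect the split to resume
```

Because requests are distributed by weight, a small sample rarely shows exactly 50/50. Sending more requests gives a clearer picture. After scaling back up, allow for the 30-second base ejection time before expecting the split to resume.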