Set up a multi-cluster mesh on GKE
This guide explains how to join two clusters into a single Cloud Service Mesh using Mesh CA or Istio CA, and enable cross-cluster load balancing. You can easily extend this process to incorporate any number of clusters into your mesh.
A multi-cluster Cloud Service Mesh configuration can solve several crucial enterprise scenarios, such as scale, location, and isolation. For more information, see Multi-cluster use cases.
Prerequisites
This guide assumes that you have two or more Google Cloud GKE clusters that meet the following requirements:
- Cloud Service Mesh version 1.11 or later installed on the clusters using `asmcli install`. You need `asmcli`, the `istioctl` tool, and the samples that `asmcli` downloads to the directory that you specified in `--output_dir` when you ran `asmcli install`. If you need to get set up, follow the steps in Install dependent tools and validate cluster.
- Clusters in your mesh must have connectivity between all pods before you configure Cloud Service Mesh. Additionally, if you join clusters that are not in the same project, they must be registered to the same fleet host project (see the fleet membership check after this list), and the clusters must be in a Shared VPC configuration together on the same network. We also recommend that you have one project to host the Shared VPC, and two service projects for creating clusters. For more information, see Setting up clusters with Shared VPC.
- If you use Istio CA, use the same custom root certificate for both clusters.
- If your Cloud Service Mesh is built on private clusters, we recommend creating a single subnet in the same VPC. Otherwise, you must ensure that:
  - The control planes can reach the remote private cluster control planes via the cluster private IPs.
  - You can add the calling control planes' IP ranges to the remote private clusters' authorized networks. For more information, see Configure endpoint discovery between private clusters.
- The API server must be reachable by the other instances of the Cloud Service Mesh control plane in the multi-cluster mesh.
  - Ensure the clusters have global access enabled.
  - Ensure the Cloud Service Mesh control plane IP addresses are allowed in the Master Authorized Networks allowlist.
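For example, to confirm that clusters joined from different projects are registered to the same fleet host project, you can list the fleet memberships (an optional check; FLEET_PROJECT_ID is the fleet host project ID):

```bash
# Optional check: each cluster you plan to join should appear as a membership
# in the fleet host project.
gcloud container fleet memberships list --project FLEET_PROJECT_ID
```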
Set project and cluster variables
Create the following environment variables for the project ID, cluster zone or region, cluster name, and context.
```bash
export PROJECT_1=PROJECT_ID_1
export LOCATION_1=CLUSTER_LOCATION_1
export CLUSTER_1=CLUSTER_NAME_1
export CTX_1="gke_${PROJECT_1}_${LOCATION_1}_${CLUSTER_1}"

export PROJECT_2=PROJECT_ID_2
export LOCATION_2=CLUSTER_LOCATION_2
export CLUSTER_2=CLUSTER_NAME_2
export CTX_2="gke_${PROJECT_2}_${LOCATION_2}_${CLUSTER_2}"
```

If these are newly created clusters, make sure to fetch credentials for each cluster with the following `gcloud` commands; otherwise, their associated `context` will not be available for use in the next steps of this guide. The commands depend on your cluster type, either regional or zonal:
Regional
```bash
gcloud container clusters get-credentials ${CLUSTER_1} --region ${LOCATION_1}
gcloud container clusters get-credentials ${CLUSTER_2} --region ${LOCATION_2}
```

Zonal

```bash
gcloud container clusters get-credentials ${CLUSTER_1} --zone ${LOCATION_1}
gcloud container clusters get-credentials ${CLUSTER_2} --zone ${LOCATION_2}
```
Create firewall rule
In some cases, you need to create a firewall rule to allow cross-cluster traffic. For example, you need to create a firewall rule if:
- You use different subnets for the clusters in your mesh.
- Your Pods open ports other than 443 and 15002.
GKE automatically adds firewall rules to each node to allow traffic within the same subnet. If your mesh contains multiple subnets, you must explicitly set up the firewall rules to allow cross-subnet traffic. You must add a new firewall rule for each subnet to allow the source IP CIDR blocks and target ports of all the incoming traffic.
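Before creating new rules, you can optionally review the firewall rules that GKE added automatically; the `gke-` name filter below is an assumption about the default naming and may need adjusting for your environment:

```bash
# Optional: list the GKE-managed firewall rules in the project (name filter is an assumption).
gcloud compute firewall-rules list --project ${PROJECT_1} --filter="name~^gke-"
```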
The following instructions allow communication between all clusters in your
project or only between $CLUSTER_1 and $CLUSTER_2.
Gather information about your clusters' network.
All project clusters
If the clusters are in the same project, you can use the following command to allow communication between all clusters in your project. If there are clusters in your project that you don't want to expose, use the command in the Specific clusters tab.
```bash
function join_by { local IFS="$1"; shift; echo "$*"; }

ALL_CLUSTER_CIDRS=$(gcloud container clusters list --project $PROJECT_1 --format='value(clusterIpv4Cidr)' | sort | uniq)
ALL_CLUSTER_CIDRS=$(join_by , $(echo "${ALL_CLUSTER_CIDRS}"))

ALL_CLUSTER_NETTAGS=$(gcloud compute instances list --project $PROJECT_1 --format='value(tags.items.[0])' | sort | uniq)
ALL_CLUSTER_NETTAGS=$(join_by , $(echo "${ALL_CLUSTER_NETTAGS}"))
```

Specific clusters
The following command allows communication between
`$CLUSTER_1` and `$CLUSTER_2` and doesn't expose other clusters in your project.

```bash
function join_by { local IFS="$1"; shift; echo "$*"; }

ALL_CLUSTER_CIDRS=$(for P in $PROJECT_1 $PROJECT_2; do gcloud --project $P container clusters list --filter="name:($CLUSTER_1,$CLUSTER_2)" --format='value(clusterIpv4Cidr)'; done | sort | uniq)
ALL_CLUSTER_CIDRS=$(join_by , $(echo "${ALL_CLUSTER_CIDRS}"))

ALL_CLUSTER_NETTAGS=$(for P in $PROJECT_1 $PROJECT_2; do gcloud --project $P compute instances list --filter="name:($CLUSTER_1,$CLUSTER_2)" --format='value(tags.items.[0])' ; done | sort | uniq)
ALL_CLUSTER_NETTAGS=$(join_by , $(echo "${ALL_CLUSTER_NETTAGS}"))
```

Create the firewall rule.
GKE
```bash
gcloud compute firewall-rules create istio-multicluster-pods \
    --allow=tcp,udp,icmp,esp,ah,sctp \
    --direction=INGRESS \
    --priority=900 \
    --source-ranges="${ALL_CLUSTER_CIDRS}" \
    --target-tags="${ALL_CLUSTER_NETTAGS}" --quiet \
    --network=YOUR_NETWORK
```

Autopilot
```bash
TAGS=""
for CLUSTER in ${CLUSTER_1} ${CLUSTER_2}
do
    TAGS+=$(gcloud compute firewall-rules list --filter="Name:$CLUSTER*" --format="value(targetTags)" | uniq) && TAGS+=","
done
TAGS=${TAGS::-1}
echo "Network tags for pod ranges are $TAGS"

gcloud compute firewall-rules create asm-multicluster-pods \
    --allow=tcp,udp,icmp,esp,ah,sctp \
    --network=VPC_NAME \
    --direction=INGRESS \
    --priority=900 \
    --source-ranges="${ALL_CLUSTER_CIDRS}" \
    --target-tags=$TAGS
```
Configure endpoint discovery
The steps required to configure endpoint discovery depend on whether you prefer to use the declarative API across clusters in a fleet, or enable it manually on public clusters or private clusters.
Enable endpoint discovery between clusters with declarative API (preview)
You can enable endpoint discovery across clusters in a fleet by applying the
config "multicluster_mode":"connected" in the asm-options configmap.
Clusters with this config enabled in the same fleet will have cross-cluster
service discovery automatically enabled between each other.
This method is available for managed Cloud Service Mesh installations on all release channels. Before proceeding, you must have created a firewall rule.
For multiple projects, you must manually add
FLEET_PROJECT_ID.svc.id.goog to trustDomainAliases in the
revision's meshConfig if it's not already present.
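The following is a sketch only of what that could look like, assuming a managed control plane whose per-revision configuration lives in a ConfigMap named `istio-REVISION` in `istio-system` (the name `istio-asm-managed` below is an assumption; use the ConfigMap for your revision, and merge the alias into any existing `mesh` data rather than overwriting it):

```bash
# Sketch only: the ConfigMap name and meshConfig layout are assumptions for illustration.
cat <<EOF | kubectl apply --context=${CTX_1} -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio-asm-managed
  namespace: istio-system
data:
  mesh: |
    trustDomainAliases:
    - "FLEET_PROJECT_ID.svc.id.goog"
EOF
```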
Enable
If the asm-options configmap already exists in your cluster, then enable
endpoint discovery for the cluster:
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"connected"}}'
If the asm-options configmap does not yet exist in your cluster, then
create it with the associated data and enable endpoint discovery for the
cluster:
kubectl --context ${CTX_1} create configmap asm-options -n istio-system --from-file <(echo '{"data":{"multicluster_mode":"connected"}}')
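Optionally, you can read the value back to confirm the setting (this check is not part of the official steps):

```bash
# Optional check: print the current multicluster_mode value from the asm-options ConfigMap.
kubectl --context ${CTX_1} get configmap asm-options -n istio-system \
    -o jsonpath='{.data.multicluster_mode}'
```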
Disable
Disable endpoint discovery for a cluster:
kubectl patch configmap/asm-options -n istio-system --type merge -p '{"data":{"multicluster_mode":"manual"}}'
If you unregister a cluster from the fleet without disabling endpoint discovery, secrets could remain in the cluster. You must manually clean up any remaining secrets.
Run the following command to find secrets requiring cleanup:
```bash
kubectl get secrets -n istio-system -l istio.io/owned-by=mesh.googleapis.com,istio/multiCluster=true
```

Delete each secret:

```bash
kubectl delete secret SECRET_NAME
```

Repeat this step for each remaining secret.
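Alternatively, assuming you want to remove every matching secret at once, the same label selector can be used with a single delete command:

```bash
# Optional shortcut: delete all remaining mesh-owned remote secrets in one pass.
kubectl delete secrets -n istio-system \
    -l istio.io/owned-by=mesh.googleapis.com,istio/multiCluster=true
```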
Configure endpoint discovery between public clusters
To configure endpoint discovery between GKE clusters, you run
asmcli create-mesh. This command:
- Registers all clusters to the same fleet.
- Configures the mesh to trust the fleet workload identity.
- Creates remote secrets.
You can either specify the URI for each cluster or the path to the kubeconfig file.
Cluster URI
In the following command, replace FLEET_PROJECT_ID with
the project ID of the fleet host project and the cluster URI with the
cluster name, zone or region, and project ID for each cluster.
This example only shows two clusters, but you can run the command to enable
endpoint discovery on additional clusters, subject to the
maximum permitted number of clusters that you can add to your fleet.
./asmcli create-mesh \
FLEET_PROJECT_ID \
${PROJECT_1}/${LOCATION_1}/${CLUSTER_1} \
${PROJECT_2}/${LOCATION_2}/${CLUSTER_2}
kubeconfig file
In the following command, replace FLEET_PROJECT_ID with
the project ID of the
fleet host project
and PATH_TO_KUBECONFIG with the path to each
kubeconfig file. This example only shows two clusters, but you can run the
command to enable endpoint discovery on additional clusters, subject to the
maximum permitted number of clusters that you can add to your fleet.
./asmcli create-mesh \
FLEET_PROJECT_ID \
PATH_TO_KUBECONFIG_1 \
PATH_TO_KUBECONFIG_2
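Whichever method you use, you can spot-check the result by listing the remote secrets created in each cluster (an optional check; `istio/multiCluster=true` is the label used for remote secrets elsewhere in this guide):

```bash
# Optional check: each cluster should hold a remote secret for its peer.
kubectl get secrets -n istio-system -l istio/multiCluster=true --context=${CTX_1}
kubectl get secrets -n istio-system -l istio/multiCluster=true --context=${CTX_2}
```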
Configure endpoint discovery between private clusters
Configure remote secrets to allow the Cloud Service Mesh control plane in each cluster to access the API server of the other cluster. The commands depend on your Cloud Service Mesh type (either in-cluster or managed):
A. For in-cluster Cloud Service Mesh, you must configure the private IPs instead of public IPs because the public IPs are not accessible:
```bash
PRIV_IP=`gcloud container clusters describe "${CLUSTER_1}" --project "${PROJECT_1}" \
    --zone "${LOCATION_1}" --format "value(privateClusterConfig.privateEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_1} --name=${CLUSTER_1} --server=https://${PRIV_IP} > ${CTX_1}.secret
```

```bash
PRIV_IP=`gcloud container clusters describe "${CLUSTER_2}" --project "${PROJECT_2}" \
    --zone "${LOCATION_2}" --format "value(privateClusterConfig.privateEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_2} --name=${CLUSTER_2} --server=https://${PRIV_IP} > ${CTX_2}.secret
```

B. For Managed Cloud Service Mesh:
```bash
PUBLIC_IP=`gcloud container clusters describe "${CLUSTER_1}" --project "${PROJECT_1}" \
    --zone "${LOCATION_1}" --format "value(privateClusterConfig.publicEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_1} --name=${CLUSTER_1} --server=https://${PUBLIC_IP} > ${CTX_1}.secret
```

```bash
PUBLIC_IP=`gcloud container clusters describe "${CLUSTER_2}" --project "${PROJECT_2}" \
    --zone "${LOCATION_2}" --format "value(privateClusterConfig.publicEndpoint)"`

./istioctl x create-remote-secret --context=${CTX_2} --name=${CLUSTER_2} --server=https://${PUBLIC_IP} > ${CTX_2}.secret
```

Apply the new secrets into the clusters:
```bash
kubectl apply -f ${CTX_1}.secret --context=${CTX_2}
kubectl apply -f ${CTX_2}.secret --context=${CTX_1}
```
Configure authorized networks for private clusters
Follow this section only if all of the following conditions apply to your mesh:
- You are using private clusters.
- The clusters do not belong to the same subnet.
- The clusters have enabled authorized networks.
When deploying multiple private clusters, the Cloud Service Mesh control plane in each cluster needs to call the GKE control plane of the remote clusters. To allow traffic, you need to add the Pod address range in the calling cluster to the authorized networks of the remote clusters.
Get the Pod IP CIDR block for each cluster:
```bash
POD_IP_CIDR_1=`gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(ipAllocationPolicy.clusterIpv4CidrBlock)"`

POD_IP_CIDR_2=`gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(ipAllocationPolicy.clusterIpv4CidrBlock)"`
```

Add the Kubernetes cluster Pod IP CIDR blocks to the remote clusters:
```bash
EXISTING_CIDR_1=`gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"`
gcloud container clusters update ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --enable-master-authorized-networks \
    --master-authorized-networks ${POD_IP_CIDR_2},${EXISTING_CIDR_1//;/,}

EXISTING_CIDR_2=`gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"`
gcloud container clusters update ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --enable-master-authorized-networks \
    --master-authorized-networks ${POD_IP_CIDR_1},${EXISTING_CIDR_2//;/,}
```

For more information, see Creating a cluster with authorized networks.
Verify that the authorized networks are updated:
```bash
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"

gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(masterAuthorizedNetworksConfig.cidrBlocks.cidrBlock)"
```
Enable control plane global access
Follow this section only if all of the following conditions apply to your mesh:
- You are using private clusters.
- You use different regions for the clusters in your mesh.
You must enable control plane global access to allow the Cloud Service Mesh control plane in each cluster to call the GKE control plane of the remote clusters.
Enable control plane global access:
```bash
gcloud container clusters update ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --enable-master-global-access

gcloud container clusters update ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --enable-master-global-access
```

Verify that control plane global access is enabled:

```bash
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1}

gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2}
```

The `privateClusterConfig` section in the output displays the status of `masterGlobalAccessConfig`.
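If you prefer a narrower check, a format string can print just that field (the field path below is an assumption about the `describe` output):

```bash
# Optional: print only the control plane global access status (field path is an assumption).
gcloud container clusters describe ${CLUSTER_1} --project ${PROJECT_1} --zone ${LOCATION_1} \
    --format "value(privateClusterConfig.masterGlobalAccessConfig.enabled)"

gcloud container clusters describe ${CLUSTER_2} --project ${PROJECT_2} --zone ${LOCATION_2} \
    --format "value(privateClusterConfig.masterGlobalAccessConfig.enabled)"
```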
Verify multicluster connectivity
This section explains how to deploy the sample HelloWorld and Sleep services
to your multi-cluster environment to verify that cross-cluster load
balancing works.
Set variable for samples directory
Navigate to where `asmcli` was downloaded, and run the following command to set `ASM_VERSION`:

```bash
export ASM_VERSION="$(./asmcli --version)"
```

Set a working folder to the samples that you use to verify that cross-cluster load balancing works. The samples are located in a subdirectory in the `--output_dir` directory that you specified in the `asmcli install` command. In the following command, change OUTPUT_DIR to the directory that you specified in `--output_dir`.

```bash
export SAMPLES_DIR=OUTPUT_DIR/istio-${ASM_VERSION%+*}
```
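As an optional sanity check, confirm that the samples path resolves to the HelloWorld manifest used in the next steps:

```bash
# Optional: verify that SAMPLES_DIR points at the downloaded samples.
ls ${SAMPLES_DIR}/samples/helloworld/helloworld.yaml
```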
Enable sidecar injection
Create the sample namespace in each cluster.
```bash
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl create --context=${CTX} namespace sample
done
```

Enable the namespace for injection. The steps depend on your control plane implementation.
Managed (TD)
- Apply the default injection label to the namespace:
```bash
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl label --context=${CTX} namespace sample \
        istio.io/rev- istio-injection=enabled --overwrite
done
```

Managed (Istiod)
Recommended: Run the following command to apply the default injection label to the namespace:
```bash
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl label --context=${CTX} namespace sample \
        istio.io/rev- istio-injection=enabled --overwrite
done
```

If you are an existing user with the Managed Istiod control plane: We recommend that you use default injection, but revision-based injection is supported. Use the following instructions:
Run the following command to locate the available release channels:
```bash
kubectl -n istio-system get controlplanerevision
```

The output is similar to the following:
```
NAME                AGE
asm-managed-rapid   6d7h
```

In the output, the value under the `NAME` column is the revision label that corresponds to the available release channel for the Cloud Service Mesh version.

Apply the revision label to the namespace:
```bash
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl label --context=${CTX} namespace sample \
        istio-injection- istio.io/rev=REVISION_LABEL --overwrite
done
```
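Optionally, verify the injection label on the namespace in both clusters before deploying the samples:

```bash
# Optional check: display the labels on the sample namespace in each cluster.
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl get namespace sample --context=${CTX} --show-labels
done
```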
Install the HelloWorld service
Create the HelloWorld service in both clusters:
```bash
kubectl create --context=${CTX_1} \
    -f ${SAMPLES_DIR}/samples/helloworld/helloworld.yaml \
    -l service=helloworld -n sample

kubectl create --context=${CTX_2} \
    -f ${SAMPLES_DIR}/samples/helloworld/helloworld.yaml \
    -l service=helloworld -n sample
```
Deploy HelloWorld v1 and v2 to each cluster
Deploy `HelloWorld v1` to `CLUSTER_1` and `v2` to `CLUSTER_2`, which helps later to verify cross-cluster load balancing:

```bash
kubectl create --context=${CTX_1} \
    -f ${SAMPLES_DIR}/samples/helloworld/helloworld.yaml \
    -l version=v1 -n sample

kubectl create --context=${CTX_2} \
    -f ${SAMPLES_DIR}/samples/helloworld/helloworld.yaml \
    -l version=v2 -n sample
```

Confirm that `HelloWorld v1` and `v2` are running using the following commands. Verify that the output is similar to that shown:

```bash
kubectl get pod --context=${CTX_1} -n sample
```

```
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v1-86f77cd7bd-cpxhv   2/2     Running   0          40s
```

```bash
kubectl get pod --context=${CTX_2} -n sample
```

```
NAME                             READY   STATUS    RESTARTS   AGE
helloworld-v2-758dd55874-6x4t8   2/2     Running   0          40s
```
Deploy the Sleep service
Deploy the `Sleep` service to both clusters. This pod generates artificial network traffic for demonstration purposes:

```bash
for CTX in ${CTX_1} ${CTX_2}
do
    kubectl apply --context=${CTX} \
        -f ${SAMPLES_DIR}/samples/sleep/sleep.yaml -n sample
done
```

Wait for the `Sleep` service to start in each cluster. Verify that the output is similar to that shown:

```bash
kubectl get pod --context=${CTX_1} -n sample -l app=sleep
```

```
NAME                     READY   STATUS    RESTARTS   AGE
sleep-754684654f-n6bzf   2/2     Running   0          5s
```

```bash
kubectl get pod --context=${CTX_2} -n sample -l app=sleep
```

```
NAME                     READY   STATUS    RESTARTS   AGE
sleep-754684654f-dzl9j   2/2     Running   0          5s
```
Verify cross-cluster load balancing
Call the HelloWorld service several times and check the output to verify
alternating replies from v1 and v2:
Call the `HelloWorld` service:

```bash
kubectl exec --context="${CTX_1}" -n sample -c sleep \
    "$(kubectl get pod --context="${CTX_1}" -n sample -l \
    app=sleep -o jsonpath='{.items[0].metadata.name}')" \
    -- /bin/sh -c 'for i in $(seq 1 20); do curl -sS helloworld.sample:5000/hello; done'
```

The output is similar to that shown:

```
Hello version: v2, instance: helloworld-v2-758dd55874-6x4t8
Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv
...
```
Call the `HelloWorld` service again:

```bash
kubectl exec --context="${CTX_2}" -n sample -c sleep \
    "$(kubectl get pod --context="${CTX_2}" -n sample -l \
    app=sleep -o jsonpath='{.items[0].metadata.name}')" \
    -- /bin/sh -c 'for i in $(seq 1 20); do curl -sS helloworld.sample:5000/hello; done'
```

The output is similar to that shown:

```
Hello version: v2, instance: helloworld-v2-758dd55874-6x4t8
Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv
...
```
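Optionally, to see the split between versions at a glance, you can tally the responses; this variation (not part of the original procedure) pipes the same loop through `sort | uniq -c`:

```bash
# Optional variation: count responses per HelloWorld version to see the cross-cluster split.
kubectl exec --context="${CTX_1}" -n sample -c sleep \
    "$(kubectl get pod --context="${CTX_1}" -n sample -l \
    app=sleep -o jsonpath='{.items[0].metadata.name}')" \
    -- /bin/sh -c 'for i in $(seq 1 20); do curl -sS helloworld.sample:5000/hello; done' \
    | cut -d, -f1 | sort | uniq -c
```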
Congratulations, you've verified your load-balanced, multi-cluster Cloud Service Mesh!
Clean up HelloWorld service
When you finish verifying load balancing, remove the HelloWorld and Sleep
services from your clusters.
kubectl delete ns sample --context ${CTX_1}
kubectl delete ns sample --context ${CTX_2}