Limitations
Standby nodes can't be used as readable replicas.
Even if the data plane is healthy, an automatic failover occurs if the AlloyDB Omni cluster manager is down for more than 90 seconds by default. This duration can be configured in Configure high availability specification using the
HEALTHCHECK_PERIOD and AUTOFAILOVER_TRIGGER_THRESHOLD variables; the 90-second default comes from the default health check period of 30 seconds multiplied by the default failover trigger threshold of 3.
Configure high availability specification
To configure high availability, fill out the following information in your
DBCluster specification:
DBCluster:
metadata:
...
spec:
...
availability:
numberOfStandbys: NUMBER_OF_STANDBYS
enableAutoFailover: true
enableAutoHeal: true
replayReplicationSlotsOnStandbys: false
healthcheckPeriodSeconds: HEALTHCHECK_PERIOD
autoFailoverTriggerThreshold: AUTOFAILOVER_TRIGGER_THRESHOLD
autoHealTriggerThreshold: AUTOHEAL_TRIGGER_THRESHOLD
Replace the following variables:
NUMBER_OF_STANDBYS: the number of standby nodes to set up. Setting this value to 0 disables high availability. The maximum value is 5. If you're not sure how many standby nodes you need, start with 2 for high resiliency.
(Optional) HEALTHCHECK_PERIOD: the number of seconds to wait between each health check. The default value is 30. The minimum value is 1. The maximum value is 86400 (one day).
(Optional) AUTOFAILOVER_TRIGGER_THRESHOLD: the number of times the health check can fail before a failover occurs. The default value is 3. The minimum value is 0; if the value is set to 0, AlloyDB Omni uses the default value. An automatic failover occurs if the health check fails AUTOFAILOVER_TRIGGER_THRESHOLD times, that is, after HEALTHCHECK_PERIOD * AUTOFAILOVER_TRIGGER_THRESHOLD seconds.
(Optional) AUTOHEAL_TRIGGER_THRESHOLD: the number of times the health check can fail before auto-heal begins. The default value is 3. The minimum value is 0; if the value is set to 0, AlloyDB Omni uses the default value. An automatic recovery occurs if the health check fails AUTOHEAL_TRIGGER_THRESHOLD times, that is, after HEALTHCHECK_PERIOD * AUTOHEAL_TRIGGER_THRESHOLD seconds.
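As an illustration, the following availability section configures two standbys with the default timing, so a failover triggers after roughly 3 × 30 = 90 seconds of failed health checks. The cluster name is a placeholder, not a required value:

```yaml
DBCluster:
  metadata:
    name: my-db-cluster                # hypothetical cluster name
  spec:
    availability:
      numberOfStandbys: 2              # suggested starting point for high resiliency
      enableAutoFailover: true
      enableAutoHeal: true
      replayReplicationSlotsOnStandbys: false
      healthcheckPeriodSeconds: 30     # default: check every 30 seconds
      autoFailoverTriggerThreshold: 3  # default: fail over after 3 failed checks (~90 s)
      autoHealTriggerThreshold: 3      # default: auto-heal after 3 failed checks
```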
Apply your DBCluster specification
To apply your configured DBCluster specification, run one of the following
commands:
alloydbctl
alloydbctl apply -d "DEPLOYMENT_SPEC" -r "DBCLUSTER_SPECIFICATION"
Replace the following variables:
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
DBCLUSTER_SPECIFICATION: path to the DBCluster specification you created in Create a cluster.
Ansible
ansible-playbook DBCLUSTER_PLAYBOOK -i "DEPLOYMENT_SPEC" \
    -e resource_spec="DBCLUSTER_SPECIFICATION"
Replace the following variables:
DBCLUSTER_PLAYBOOK: path to the playbook that you created for your DBCluster CRD.
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
DBCLUSTER_SPECIFICATION: path to the DBCluster specification you created in Create a cluster.
Switchover to a standby instance
You can perform a switchover to test your high availability setup, or for any other planned maintenance activity that requires exchanging the roles of the primary and standby replica. After the switchover, the direction of replication and the roles of the primary and standby are reversed.
Switchovers perform the following actions:
AlloyDB Omni orchestrator takes the primary offline.
AlloyDB Omni orchestrator promotes the standby to be the new primary.
AlloyDB Omni orchestrator converts the former primary into a standby.
AlloyDB Omni starts the newly converted standby.
Perform a switchover
To perform a switchover, complete the following steps:
Verify that your primary and standby instances are healthy.
Verify that the high availability status.phase is Ready.
alloydbctl
alloydbctl get -d "DEPLOYMENT_SPEC" -t DBCluster -n DBCLUSTER_SPECIFICATION -o yaml
Replace the following variables:
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
DBCLUSTER_SPECIFICATION: name of your DBCluster specification that you defined in Create a cluster.
Ansible
ansible-playbook status.yaml -i DEPLOYMENT_SPEC -e resource_type=DBCluster \
    -e resource_name=DBCLUSTER_SPECIFICATION
Replace the following variables:
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
DBCLUSTER_SPECIFICATION: name of your DBCluster specification that you defined in Create a cluster.
Create a Switchover specification using the following format:
Switchover:
  metadata:
    name: SWITCHOVER_NAME
  spec:
    dbClusterRef: DBCLUSTER_NAME
    newPrimary: NEW_PRIMARY_NAME
Replace the following variables:
SWITCHOVER_NAME: name for this Switchover specification. For example, my-switchover-1. This name must be unique every time a switchover is performed.
DBCLUSTER_NAME: name of your database cluster that you defined in Create a cluster.
(Optional) NEW_PRIMARY_NAME: name of the standby DBCluster specification that should become the new primary.
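For example, a filled-in Switchover specification that promotes a standby might look like the following. The cluster and standby names here are illustrative placeholders for names from your own deployment:

```yaml
Switchover:
  metadata:
    name: my-switchover-1              # must be unique for each switchover
  spec:
    dbClusterRef: my-db-cluster        # hypothetical name from Create a cluster
    newPrimary: my-db-cluster-standby-1  # hypothetical standby to promote; optional
```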
If you're using Ansible, create a playbook for your Switchover specification:
- name: SWITCHOVER_PLAYBOOK_NAME
  hosts: localhost
  vars:
    ansible_become: true
    ansible_user: ANSIBLE_USER
    ansible_ssh_private_key_file: ANSIBLE_SSH_PRIVATE_KEY_FILE
  roles:
    - role: google.alloydbomni_orchestrator.switchover
Replace the following variables:
SWITCHOVER_PLAYBOOK_NAME: name of your Ansible playbook. For example, My Switchover.
ANSIBLE_USER: OS user that Ansible uses to log in to your AlloyDB Omni nodes.
ANSIBLE_SSH_PRIVATE_KEY_FILE: private key that Ansible uses to connect to your AlloyDB Omni nodes using SSH.
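A filled-in version of the playbook might look like the following; the user name and key path are placeholders for values from your environment:

```yaml
- name: My Switchover
  hosts: localhost
  vars:
    ansible_become: true
    ansible_user: alloydb-admin                      # example OS user
    ansible_ssh_private_key_file: ~/.ssh/id_ed25519  # example SSH key path
  roles:
    - role: google.alloydbomni_orchestrator.switchover
```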
Apply your Switchover specification.
alloydbctl
alloydbctl apply -d "DEPLOYMENT_SPEC" -r "SWITCHOVER_SPECIFICATION"
Replace the following variables:
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
SWITCHOVER_SPECIFICATION: path to the Switchover specification you created in step three.
Ansible
ansible-playbook SWITCHOVER_PLAYBOOK -i "DEPLOYMENT_SPEC" \
    -e resource_spec="SWITCHOVER_SPECIFICATION"
Replace the following variables:
SWITCHOVER_PLAYBOOK: path to the playbook that you created for your Switchover CRD in step four.
DEPLOYMENT_SPEC: path to the deployment specification you created in Install AlloyDB Omni components.
SWITCHOVER_SPECIFICATION: path to the Switchover specification you created in step three.
Load balancer for high availability
The load balancer (HAProxy) achieves high availability by pairing its nodes with Keepalived and a virtual IP. Keepalived utilizes the Virtual Router Redundancy Protocol (VRRP) to control a floating, virtual IP. Database client applications connect to this virtual IP instead of the database node's IP address.
In configurations where a dedicated load balancer isn't used, Keepalived is installed directly on the database nodes. In this scenario, high availability is achieved by dynamically assigning the virtual IP to the current primary node, ensuring seamless failover if the primary becomes unavailable.
To establish a stable election, Keepalived assigns VRRP priorities to the
database cluster nodes. The first load balancer node assumes the primary role
with a higher Keepalived priority, for example, 110. Subsequent nodes
act as secondaries with a lower priority, for example, 100.
To ensure that the virtual IP points to a healthy node, Keepalived runs
continuous health checks every two seconds. Each check verifies the state of the
systemd HAProxy process. If the HAProxy service on the primary fails,
Keepalived migrates the virtual IP to a healthy secondary node.
If database node membership changes, HAProxy and Keepalived automatically point to the new active database nodes. The underlying routing configuration updates without dropping live client connections.
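The behavior described above corresponds to a Keepalived VRRP configuration along the following lines. This is a sketch for orientation only: the interface name, virtual router ID, IP address, and check command are illustrative assumptions, not the exact configuration that AlloyDB Omni generates:

```
# /etc/keepalived/keepalived.conf (illustrative sketch)
vrrp_script chk_haproxy {
    script "/usr/bin/systemctl is-active --quiet haproxy"  # verify the systemd HAProxy unit
    interval 2      # run the health check every two seconds
    fall 2          # mark the node faulty after two consecutive failures
}

vrrp_instance VI_1 {
    state MASTER            # first load balancer node
    interface eth0          # interface that carries the virtual IP
    virtual_router_id 51    # must match on all peers
    priority 110            # secondaries use a lower value, for example 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.100          # the floating virtual IP clients connect to
    }
    track_script {
        chk_haproxy         # migrate the virtual IP if HAProxy fails
    }
}
```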
Configure the load balancer
To configure the virtual IP for the load balancer nodes, add the following
dbLoadBalancerOptions field to the primarySpec field in your DBCluster
specification:
DBCluster:
spec:
primarySpec:
...
dbLoadBalancerOptions:
onprem:
loadBalancerIP: "VIRTUAL_IP"
loadBalancerType: "internal"
loadBalancerInterface: "VIRTUAL_IP_INTERFACE"
Replace the following variables:
VIRTUAL_IP: static IP address used for the floating virtual IP. Database client applications use the IP address defined here. To ensure that Keepalived can broadcast gratuitous ARP messages successfully, this IP address must be available, must not be a loopback address, and, for on-premises deployments, must belong to the same subnet as your primary node interfaces.
VIRTUAL_IP_INTERFACE: network interface where VIRTUAL_IP is configured. The default value is eth0.
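For example, a filled-in dbLoadBalancerOptions block might look like the following. The IP address and interface are placeholders; choose a free address in the same subnet as your primary node interfaces:

```yaml
DBCluster:
  spec:
    primarySpec:
      dbLoadBalancerOptions:
        onprem:
          loadBalancerIP: "10.0.0.100"   # illustrative: an unused IP in the primary nodes' subnet
          loadBalancerType: "internal"
          loadBalancerInterface: "eth0"  # the default interface
```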