This document explains how you can maintain a Spanner Omni deployment. Maintaining a Spanner Omni deployment includes routine health checks, decommissioning unhealthy elements, and replacing server nodes or faulty disks to ensure cluster stability and consistency. Managing database health protects your databases from unplanned outages and maintains the redundancy of the underlying Paxos consensus group.
Performing maintenance helps you:
Ensure high availability: Reprovision unhealthy virtual machines (VMs) or pods to maintain database server redundancy. This helps you keep your applications running if hardware fails.
Protect data safety and integrity: Decommission nodes with faulty disks to prevent storage failures from spreading to other parts of the database. This also ensures that disconnected servers don't record conflicting updates.
Before you begin
Before you perform maintenance on your Spanner Omni deployment, you must do the following:
Create a Spanner Omni deployment. For more information, see Create a deployment on Kubernetes or Create a deployment on VMs.
Download and install the Spanner Omni CLI.
For deployments on Kubernetes, install Helm and create a Helm chart configuration.
Replace a server
You might need to replace a server in your deployment to resolve system or storage faults.
Replace a root server
To replace a root server, select the tab for your environment:
Kubernetes
Adding more root servers to an existing Kubernetes deployment isn't supported, but you can replace an existing root server to resolve unrecoverable errors. To replace a root server in a Kubernetes deployment, perform the following steps:
Delete the server that you want to replace:
spanner deployment servers delete SERVER_ENDPOINT --zone=ZONE
Replace the following:
SERVER_ENDPOINT: The server pod endpoint to delete, in the format SERVER.pod.NAMESPACE:PORT, for example, spanner-a-1.pod.spanner-ns:15000. To find the server endpoints in your deployment, list the deployment servers by running spanner deployment servers list --zone=ZONE.
ZONE: The zone containing the server, for example, us-central1-a.
This step might take a few minutes depending on the volume of data on the server. Spanner Omni relocates the data from this server to other servers in the deployment. Ensure that the server deletion is complete before proceeding to the next step.
To track deletion progress, check the status of the server:
spanner deployment servers describe SERVER_ENDPOINT --zone=ZONE
Wait until the command returns a NOT_FOUND error or indicates that the server is no longer registered.
Delete the persistent volume claim (PVC) for the pod hosting the server:
kubectl delete pvc DATA_VOLUME_NAME -n NAMESPACE
Replace the following:
DATA_VOLUME_NAME: The data volume name, for example, data-volume-spanner-a-1. To find the data volume name, list the PVCs in your namespace by running kubectl get pvc -n NAMESPACE.
NAMESPACE: The namespace of the deployment, for example, spanner-ns.
Delete the pod:
kubectl delete pod POD_NAME -n NAMESPACE
Replace POD_NAME with the name of the server pod to delete, for example, spanner-a-1.
Kubernetes automatically starts a new server in a replacement pod and attaches a new PVC.
Add the new server to the deployment:
spanner deployment servers create SERVER_ENDPOINT --zone=ZONE
Verify that the arguments match the new pod endpoint and zone, for example, spanner-a-1.pod.spanner-ns:15000 as the endpoint and us-central1-a as the zone.
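The "wait until describe returns NOT_FOUND" step can be scripted instead of checked by hand. The following is a minimal sketch, not an official tool: the function name wait_for_server_removal, the default timeout, and the polling interval are hypothetical, and it assumes that the text NOT_FOUND appears in the describe output once the server is gone. Any other failure is reported rather than treated as a successful removal.

```shell
#!/usr/bin/env bash
# wait_for_server_removal ENDPOINT ZONE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls `spanner deployment servers describe` until its output contains
# NOT_FOUND. Assumption: the CLI prints NOT_FOUND (on stdout or stderr)
# once the server is deleted; adjust the pattern if your output differs.
wait_for_server_removal() {
  local endpoint="$1" zone="$2" timeout="${3:-600}" interval="${4:-10}"
  local elapsed=0 status output
  while [ "$elapsed" -le "$timeout" ]; do
    status=0
    # Capture both streams because the error text may arrive on either one.
    output=$(spanner deployment servers describe "$endpoint" --zone="$zone" 2>&1) || status=$?
    case "$output" in
      *NOT_FOUND*)
        echo "removed: $endpoint"
        return 0
        ;;
    esac
    if [ "$status" -ne 0 ]; then
      # A failure without NOT_FOUND (for example, a connection error) is
      # not proof of removal, so stop and let the operator investigate.
      echo "describe failed unexpectedly: $output" >&2
      return 2
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out after ${timeout}s waiting for $endpoint" >&2
  return 1
}
```

For example, wait_for_server_removal spanner-a-1.pod.spanner-ns:15000 us-central1-a 900 15 polls every 15 seconds for up to 15 minutes. Run the PVC and pod deletion steps only after the function returns 0.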
VM
To replace a root server in a VM deployment, perform the following steps:
Delete the server that you want to replace:
spanner deployment servers delete SERVER_ENDPOINT --zone=ZONE
Replace the following:
SERVER_ENDPOINT: The server IP address or hostname and port, for example, spanner-vm-1.example.com:15000. To find the server endpoints, list the deployment servers by running spanner deployment servers list --zone=ZONE.
ZONE: The zone containing the server, for example, us-central1-a.
This step might take a few minutes depending on the volume of data on the server. Spanner Omni relocates the data from this server to other servers in the deployment. Ensure that the server deletion is complete before proceeding to the next step.
To track deletion progress, check the status of the server:
spanner deployment servers describe SERVER_ENDPOINT --zone=ZONE
Wait until the command returns a NOT_FOUND error or indicates that the server is no longer registered.
Reprovision the server with clean storage as explained in Create a deployment for Spanner Omni on VMs:
spanner start \
  --root \
  --server-address=HOSTNAME \
  --zone=ZONE \
  --base-dir=BASE_DIR
Replace the following:
HOSTNAME: The resolvable FQDN or hostname of the new VM, for example, spanner-vm-1.example.com.
ZONE: The target zone, for example, us-central1-a.
BASE_DIR: The path where data is stored, for example, ./span-dir.
Add the new server to the deployment:
spanner deployment servers create SERVER_ENDPOINT --zone=ZONE
Verify that the arguments match the parameters of the new server, for example, spanner-vm-1.example.com:15000 as the endpoint and us-central1-a as the zone.
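Because the VM procedure spans two machines (the host where you run the CLI and the replacement VM), it can help to print the full command sequence with your values filled in before you run anything. The sketch below only prints text and executes no Spanner Omni commands. The function name print_vm_root_replacement_plan is hypothetical, and the commands and flags are the ones shown in the steps above.

```shell
#!/usr/bin/env bash
# print_vm_root_replacement_plan ENDPOINT ZONE HOSTNAME BASE_DIR
# Prints the commands from the VM root-server procedure in order, with
# the placeholders substituted. Nothing is executed.
print_vm_root_replacement_plan() {
  local endpoint="$1" zone="$2" hostname="$3" base_dir="$4"
  cat <<EOF
# 1. Delete the old server, then repeat describe until it reports NOT_FOUND
spanner deployment servers delete $endpoint --zone=$zone
spanner deployment servers describe $endpoint --zone=$zone
# 2. On the replacement VM, start a root server with clean storage
spanner start --root --server-address=$hostname --zone=$zone --base-dir=$base_dir
# 3. Add the new server to the deployment
spanner deployment servers create $endpoint --zone=$zone
EOF
}
```

Calling print_vm_root_replacement_plan spanner-vm-1.example.com:15000 us-central1-a spanner-vm-1.example.com ./span-dir produces a reviewable checklist in which the endpoint passed to create matches the one that was deleted.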
Replace a non-root server
To replace a non-root server, select the tab for your environment:
Kubernetes
To replace a non-root server in a Kubernetes deployment, perform the following steps:
Delete the server that you want to replace:
spanner deployment servers delete SERVER_ENDPOINT --zone=ZONE
Replace the following:
SERVER_ENDPOINT: The server pod endpoint to delete, in the format SERVER.pod.NAMESPACE:PORT, for example, spanner-a-4.pod.spanner-ns:15000. To find the server endpoints in your deployment, list the deployment servers by running spanner deployment servers list --zone=ZONE.
ZONE: The zone containing the server, for example, us-central1-a.
This step might take a few minutes depending on the volume of data on the server. Spanner Omni relocates the data from this server to other servers in the deployment. Ensure that the server deletion is complete before proceeding to the next step.
To track deletion progress, check the status of the server:
spanner deployment servers describe SERVER_ENDPOINT --zone=ZONE
Wait until the command returns a NOT_FOUND error or indicates that the server is no longer registered.
Delete the persistent volume claim (PVC) for the pod hosting the server:
kubectl delete pvc DATA_VOLUME_NAME -n NAMESPACE
Replace the following:
DATA_VOLUME_NAME: The data volume name, for example, data-volume-spanner-a-4. To find the data volume name, list the PVCs in your namespace by running kubectl get pvc -n NAMESPACE.
NAMESPACE: The namespace of the deployment, for example, spanner-ns.
Delete the pod:
kubectl delete pod POD_NAME -n NAMESPACE
Replace POD_NAME with the name of the server pod to delete, for example, spanner-a-4.
Kubernetes automatically starts a new server in a replacement pod and attaches a new PVC. Spanner Omni automatically registers the new non-root server into the deployment.
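The Kubernetes steps use three related names for the same server: the pod name, the server endpoint, and the data volume name. The helpers below sketch how the examples in this section relate to each other. The function names are hypothetical. The default port 15000 and the data-volume-POD_NAME pattern are inferred from the examples only, so confirm the real values with spanner deployment servers list --zone=ZONE and kubectl get pvc -n NAMESPACE before deleting anything.

```shell
#!/usr/bin/env bash
# pod_endpoint POD_NAME NAMESPACE [PORT]
# Builds an endpoint in the documented SERVER.pod.NAMESPACE:PORT format.
# Assumption: port 15000 is only the value used in the examples.
pod_endpoint() {
  printf '%s.pod.%s:%s\n' "$1" "$2" "${3:-15000}"
}

# pod_data_pvc POD_NAME
# Guesses the PVC name for a pod. Assumption: the data-volume-POD_NAME
# pattern is inferred from the examples and may differ in your chart.
pod_data_pvc() {
  printf 'data-volume-%s\n' "$1"
}
```

For the pod spanner-a-4 in the namespace spanner-ns, these helpers yield spanner-a-4.pod.spanner-ns:15000 and data-volume-spanner-a-4, which match the example values in the steps above.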
VM
To replace a non-root server in a VM deployment, perform the following steps:
Delete the server that you want to replace:
spanner deployment servers delete SERVER_ENDPOINT --zone=ZONE
Replace the following:
SERVER_ENDPOINT: The server IP address or hostname and port, for example, spanner-vm-4.example.com:15000. To find the server endpoints, list the deployment servers by running spanner deployment servers list --zone=ZONE.
ZONE: The zone containing the server, for example, us-central1-a.
This step might take a few minutes depending on the volume of data on the server. Spanner Omni relocates the data from this server to other servers in the deployment. Ensure that the server deletion is complete before proceeding to the next step.
To track deletion progress, check the status of the server:
spanner deployment servers describe SERVER_ENDPOINT --zone=ZONE
Wait until the command returns a NOT_FOUND error or indicates that the server is no longer registered.
Reprovision the non-root server with clean storage as explained in Add non-root servers. The server is automatically added to the deployment.
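Because a non-root server is added automatically, you may want to confirm that it actually rejoined before you consider the replacement finished. The following sketch polls spanner deployment servers list for the endpoint. It assumes that the list output prints each server endpoint verbatim, which you should verify against your own output. The function names, attempt count, and interval are hypothetical.

```shell
#!/usr/bin/env bash
# server_is_listed ENDPOINT ZONE
# Succeeds when the endpoint string appears in the list output.
# Assumption: `servers list` prints endpoints as plain text.
server_is_listed() {
  local endpoint="$1" zone="$2"
  spanner deployment servers list --zone="$zone" | grep -qF -- "$endpoint"
}

# wait_for_registration ENDPOINT ZONE [ATTEMPTS] [INTERVAL_SECONDS]
# Retries server_is_listed until it succeeds or the attempts run out.
wait_for_registration() {
  local endpoint="$1" zone="$2" attempts="${3:-30}" interval="${4:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if server_is_listed "$endpoint" "$zone"; then
      echo "registered: $endpoint"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "not registered after $attempts attempts: $endpoint" >&2
  return 1
}
```

For example, wait_for_registration spanner-vm-4.example.com:15000 us-central1-a checks every 10 seconds for up to 30 attempts and returns a non-zero status if the server never appears.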