This page shows you how to investigate issues with cluster creation, upgrade, and resizing in Google Distributed Cloud.
Default logging behavior for gkectl and gkeadm
For gkectl and gkeadm, it is sufficient to use the default logging settings:
- For gkectl, the default log file is /home/ubuntu/.config/gke-on-prem/logs/gkectl-$(date).log, and the file is symlinked with the logs/gkectl-$(date).log file in the local directory where you run gkectl.
- For gkeadm, the default log file is logs/gkeadm-$(date).log in the local directory where you run gkeadm.
- The default -v5 verbosity level covers all the log entries needed by the support team.
- The log file includes the command executed and the failure message. 
We recommend that you send the log file to the support team when you need help.
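As a small convenience, the following sketch prints the newest log file in a directory so that you can attach it to a support case. The function name latest_log is hypothetical, and the sketch assumes that the log files follow the gkectl-*.log and gkeadm-*.log naming described above.

```shell
# Print the most recently modified PREFIX-*.log file in DIR.
# Usage: latest_log DIR PREFIX
latest_log() {
  local dir="$1" prefix="$2"
  # ls -t sorts by modification time, newest first. Errors are suppressed
  # so that a directory with no matching files produces no output.
  ls -t "$dir"/"$prefix"-*.log 2>/dev/null | head -n 1
}

# Examples (commented out so that the block has no side effects):
# latest_log logs gkectl
# latest_log logs gkeadm
```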
Specifying a non-default location for log files
To specify a non-default location for the gkectl log file, use the
--log_file flag. The log file that you specify will not be symlinked with the
local directory.
To specify a non-default location for the gkeadm log file, use the
--log_file flag.
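One way to keep logs from separate runs apart is to generate a timestamped path and pass it to --log_file. The helper name log_file_path and the directory /var/tmp/gke-logs are assumptions made for illustration only.

```shell
# Build a timestamped log path such as DIR/gkectl-20240101-120000.log.
# Usage: log_file_path DIR TOOL
log_file_path() {
  local dir="$1" tool="$2"
  printf '%s/%s-%s.log\n' "$dir" "$tool" "$(date +%Y%m%d-%H%M%S)"
}

# Example (commented out; requires gkectl and a real kubeconfig):
# gkectl diagnose cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
#   --log_file "$(log_file_path /var/tmp/gke-logs gkectl)"
```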
Locating Cluster API logs in the admin cluster
If a VM fails to start after the admin control plane has started, you can investigate the issue by inspecting the logs from the Cluster API controllers Pod in the admin cluster.
- Find the name of the Cluster API controllers Pod:
  kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG --namespace kube-system \
    get pods | grep clusterapi-controllers
- View logs from the vsphere-controller-manager. Start by specifying the Pod, but no container:
  kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG --namespace kube-system \
    logs POD_NAME
  The output tells you that you must specify a container, and it gives you the names of the containers in the Pod. For example:
  ... a container name must be specified ..., choose one of: [clusterapi-controller-manager vsphere-controller-manager rbac-proxy]
  Choose a container, and view its logs:
  kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG --namespace kube-system \
    logs POD_NAME --container CONTAINER_NAME
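If you would rather collect the logs of every container in one pass, the steps above can be sketched as a single function. The function name dump_clusterapi_logs and the output directory are hypothetical; the kubectl subcommands and flags are standard.

```shell
# Save the logs of every container in the Cluster API controllers Pod.
# Usage: dump_clusterapi_logs ADMIN_CLUSTER_KUBECONFIG OUT_DIR
dump_clusterapi_logs() {
  local kubeconfig="$1" out_dir="$2" pod container
  mkdir -p "$out_dir"
  # Take the first Pod whose name contains "clusterapi-controllers".
  pod=$(kubectl --kubeconfig "$kubeconfig" --namespace kube-system \
    get pods -o name | grep clusterapi-controllers | head -n 1)
  [ -n "$pod" ] || { echo "no clusterapi-controllers Pod found" >&2; return 1; }
  # Read the container names from the Pod spec, then fetch each log.
  for container in $(kubectl --kubeconfig "$kubeconfig" --namespace kube-system \
      get "$pod" -o jsonpath='{.spec.containers[*].name}'); do
    kubectl --kubeconfig "$kubeconfig" --namespace kube-system \
      logs "$pod" --container "$container" > "$out_dir/$container.log"
  done
}

# Example (commented out; requires access to the admin cluster):
# dump_clusterapi_logs ADMIN_CLUSTER_KUBECONFIG ./clusterapi-logs
```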
Using govc to resolve issues with vSphere
You can use govc to investigate issues with vSphere. For example, you can
confirm permissions and access for your vCenter user accounts, and you can
collect vSphere logs.
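As a rough starting point, the following sketch runs three govc subcommands in sequence. It assumes that the GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD environment variables are already exported for your vCenter user account; the function name vsphere_quick_check is hypothetical.

```shell
# Run a few basic govc checks against vCenter.
# Assumes GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD are exported.
vsphere_quick_check() {
  govc about || return 1   # fails early if the connection or login does not work
  govc permissions.ls      # lists permissions visible to the account
  govc logs -n 50          # prints the last 50 lines of the default log
}

# Example (commented out; requires govc and vCenter access):
# vsphere_quick_check
```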
Debugging using the bootstrap cluster's logs
During installation, Google Distributed Cloud creates a temporary bootstrap cluster. After a successful installation, Google Distributed Cloud deletes the bootstrap cluster, leaving you with your admin cluster and user cluster. Generally, you should have no reason to interact with the bootstrap cluster.
If you pass --cleanup-external-cluster=false to gkectl create cluster,
then the bootstrap cluster does not get deleted, and you can use the bootstrap
cluster's logs to debug installation issues.
- Find the names of Pods running in the kube-system namespace:
  kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl get pods -n kube-system
- View the logs for a Pod:
  kubectl --kubeconfig /home/ubuntu/.kube/kind-config-gkectl -n kube-system logs POD_NAME
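Because the bootstrap cluster is temporary, it can be useful to save the logs of all its kube-system Pods to files before the cluster goes away. The following is a minimal sketch; the function name dump_bootstrap_logs and the default output directory bootstrap-logs are assumptions.

```shell
# Save the logs of every Pod in kube-system on the bootstrap cluster.
# Usage: dump_bootstrap_logs [KUBECONFIG] [OUT_DIR]
dump_bootstrap_logs() {
  local kubeconfig="${1:-/home/ubuntu/.kube/kind-config-gkectl}"
  local out_dir="${2:-bootstrap-logs}" pod
  mkdir -p "$out_dir"
  kubectl --kubeconfig "$kubeconfig" -n kube-system get pods -o name |
  while read -r pod; do
    # "-o name" prints "pod/NAME"; strip the prefix for the file name.
    kubectl --kubeconfig "$kubeconfig" -n kube-system \
      logs "$pod" --all-containers > "$out_dir/${pod#pod/}.log"
  done
}

# Example (commented out; requires a running bootstrap cluster):
# dump_bootstrap_logs
```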
Debugging F5 BIG-IP issues using the internal kubeconfig file
After an installation, Google Distributed Cloud generates a kubeconfig file
named internal-cluster-kubeconfig-debug in the home directory of your admin
workstation. This kubeconfig file is identical to your admin cluster's
kubeconfig file, except that it points directly to the admin cluster's control
plane node, where the Kubernetes API server runs. You can use the
internal-cluster-kubeconfig-debug file to debug F5 BIG-IP issues.
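To see how the two kubeconfig files differ, you can print the API server address that each one targets. This sketch assumes that the difference is visible in the server field of the first cluster entry; the function name api_server_of is hypothetical.

```shell
# Print the API server URL of the first cluster entry in a kubeconfig file.
# Usage: api_server_of KUBECONFIG_PATH
api_server_of() {
  kubectl config view --kubeconfig "$1" \
    -o jsonpath='{.clusters[0].cluster.server}'
}

# Examples (commented out; require the files on your admin workstation):
# api_server_of ADMIN_CLUSTER_KUBECONFIG
# api_server_of "$HOME/internal-cluster-kubeconfig-debug"
```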
Resizing a user cluster fails
If a resizing of a user cluster fails:
- Find the names of the MachineDeployments and the Machines:
  kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get machinedeployments --all-namespaces
  kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get machines --all-namespaces
- Describe a MachineDeployment to view its status and events:
  kubectl --kubeconfig USER_CLUSTER_KUBECONFIG describe machinedeployment MACHINE_DEPLOYMENT_NAME
- Check for errors on newly created Machines:
  kubectl --kubeconfig USER_CLUSTER_KUBECONFIG describe machine MACHINE_NAME
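When there are many Machines, describing each one is slow. As a complement to the steps above, the following sketch lists only Warning events across all namespaces, oldest first. The function name warning_events is hypothetical; the flags are standard kubectl flags.

```shell
# List Warning events in all namespaces, sorted by last occurrence.
# Usage: warning_events USER_CLUSTER_KUBECONFIG
warning_events() {
  kubectl --kubeconfig "$1" get events --all-namespaces \
    --field-selector type=Warning --sort-by=.lastTimestamp
}

# Example (commented out; requires access to the user cluster):
# warning_events USER_CLUSTER_KUBECONFIG
```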
No addresses can be allocated for cluster resize
This issue occurs if there are not enough IP addresses available to resize a user cluster.
kubectl describe machine displays the following error:
  Events:
    Type     Reason  Age                From                    Message
    ----     ------  ----               ----                    -------
    Warning  Failed  9s (x13 over 56s)  machineipam-controller  ipam: no addresses can be allocated
To resolve this issue, allocate more IP addresses for the cluster. Then, delete the affected Machine:
kubectl --kubeconfig USER_CLUSTER_KUBECONFIG delete machine MACHINE_NAME
Google Distributed Cloud creates a new Machine and assigns it one of the newly available IP addresses.
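Before resizing, you can check how many addresses the cluster has to work with. The sketch below counts the entries in an IP block file. It assumes that the file is YAML in which each address appears on its own line as an ip: key; adjust the pattern if your file is laid out differently. The function name count_block_ips is hypothetical.

```shell
# Count the "ip:" entries in an IP block file.
# Assumption: each address is on its own line, as "- ip: ADDRESS" or "ip: ADDRESS".
# Usage: count_block_ips IP_BLOCK_FILE
count_block_ips() {
  grep -cE '^[[:space:]]*(-[[:space:]]*)?ip:' "$1"
}

# Example (commented out; IP_BLOCK_FILE is a placeholder):
# count_block_ips IP_BLOCK_FILE
```

Compare the result with the number of nodes that the cluster needs after the resize.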
Sufficient number of IP addresses allocated, but Machine fails to register with cluster
This issue can occur if there is an IP address conflict. For example, an IP address you specified for a machine is being used for a load balancer.
To resolve this issue, update your cluster IP block file so that the machine addresses do not conflict with addresses specified in your cluster configuration file or your Seesaw IP block file.
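To spot a conflict quickly, you can extract every IPv4 address from two files and print the ones that appear in both. This is a rough sketch: the function names are hypothetical, and values such as a shared netmask or gateway will also be reported and can be ignored.

```shell
# Print the unique IPv4-looking strings found in a file, sorted.
extract_ips() {
  grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort -u
}

# Print addresses that appear in both files.
# Usage: find_ip_conflicts FILE_A FILE_B
find_ip_conflicts() {
  local a b
  a=$(mktemp) && b=$(mktemp) || return 1
  extract_ips "$1" > "$a"
  extract_ips "$2" > "$b"
  comm -12 "$a" "$b"   # -12 keeps only the lines common to both sorted inputs
  rm -f "$a" "$b"
}

# Example (commented out; both paths are placeholders):
# find_ip_conflicts IP_BLOCK_FILE CLUSTER_CONFIG_FILE
```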
Snapshot is created automatically when admin cluster creation or upgrade fails
If an attempt to create or upgrade an admin cluster fails, Google Distributed Cloud takes an external snapshot of the bootstrap cluster, which is a transient cluster that is used to create or upgrade the admin cluster. This snapshot is similar to the one taken by running the gkectl diagnose snapshot command on the admin cluster, but it is triggered automatically. It contains important debugging information for the admin cluster creation and upgrade process. You can provide this snapshot to Google Cloud Support if needed.
Health checks are run automatically when cluster upgrade fails
If you attempt to upgrade an admin or user cluster, and that operation fails, Google Distributed Cloud automatically runs the gkectl diagnose cluster command on the cluster.
To skip the automatic diagnosis, pass the --skip-diagnose-cluster flag to gkectl upgrade.
Upgrade process becomes stuck
Google Distributed Cloud, behind the scenes, uses the Kubernetes drain command during an upgrade. This drain procedure can be blocked by a Deployment with only one replica that has a PodDisruptionBudget (PDB) created for it with minAvailable: 1.
In that case, save the PDB, and remove it from the cluster before attempting the upgrade. You can then add the PDB back after the upgrade is complete.
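The save-and-remove step can be sketched as follows. The function names blocking_pdbs and save_and_remove_pdb are hypothetical. The first function lists PDBs that currently allow zero disruptions, which are the likely candidates for blocking a drain; the second saves one PDB to a file and deletes it.

```shell
# List PDBs, as NAMESPACE/NAME, that currently allow zero disruptions.
# Usage: blocking_pdbs KUBECONFIG
blocking_pdbs() {
  kubectl --kubeconfig "$1" get pdb --all-namespaces --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed |
    awk '$3 == 0 { print $1 "/" $2 }'
}

# Save a PDB to FILE, then delete it from the cluster.
# Usage: save_and_remove_pdb KUBECONFIG NAMESPACE NAME FILE
save_and_remove_pdb() {
  local kubeconfig="$1" ns="$2" name="$3" file="$4"
  # Drop server-populated fields that are not needed to re-create the PDB.
  kubectl --kubeconfig "$kubeconfig" -n "$ns" get pdb "$name" -o yaml |
    grep -vE '^[[:space:]]+(resourceVersion|uid):' > "$file" || return 1
  kubectl --kubeconfig "$kubeconfig" -n "$ns" delete pdb "$name"
}

# After the upgrade completes, restore the PDB (commented out):
# kubectl --kubeconfig KUBECONFIG apply -f FILE
```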
Re-create missing user cluster kubeconfig file
You might want to re-create a user cluster kubeconfig file in a couple of situations:
- If you attempt to create a user cluster, the creation operation fails, and you still want a kubeconfig file for that user cluster.
- If the user cluster kubeconfig file is missing, such as after being deleted.
Run these commands to re-create the user cluster kubeconfig file:
KUBECONFIG_SECRET_NAME=$(kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get secrets -n USER_CLUSTER_NAME | grep admin-kubeconfig | cut -d' ' -f1)
kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get secrets -n USER_CLUSTER_NAME $KUBECONFIG_SECRET_NAME \
  -o jsonpath='{.data.kubeconfig\.conf}' | base64 -d | sed -r "s/kube-apiserver.*local\./USER_CLUSTER_VIP/" > new_user_kubeconfig
Replace the following:
- USER_CLUSTER_VIP: the user master VIP value.
- USER_CLUSTER_NAME: the user cluster name.
- ADMIN_CLUSTER_KUBECONFIG: the path of the kubeconfig file for your admin cluster.
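To illustrate what the sed substitution in the preceding commands does, the following sketch applies the same pattern to a sample line. The sample host name and the VIP 203.0.113.10 are made up for illustration; the server line in your Secret may look different. The sketch uses sed -E, which GNU sed treats the same as -r.

```shell
# Hypothetical server line, for illustration only.
sample='    server: https://kube-apiserver.my-user-cluster.svc.cluster.local.:443'

# Replace the in-cluster API server host name with the VIP given as $1.
# Reads kubeconfig text on stdin and writes the result to stdout.
rewrite_server() {
  sed -E "s/kube-apiserver.*local\./$1/"
}

printf '%s\n' "$sample" | rewrite_server 203.0.113.10
# prints "    server: https://203.0.113.10:443"
```

After you create new_user_kubeconfig, a quick check such as kubectl --kubeconfig new_user_kubeconfig get nodes confirms that the file works.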