This document provides guidance on troubleshooting issues with Cloud DNS in Google Distributed Cloud (GDC) air-gapped. It covers potential errors and problems that you might encounter when performing management operations on DNS configurations, and provides tips and suggestions for debugging issues with DNS resolution. The intended audience for this document is platform administrators and application operators responsible for managing DNS records within a project.
Troubleshoot Cloud DNS management operations
This section provides troubleshooting information for common issues encountered when performing Create, Read, Update, and Delete (CRUD) operations on Cloud DNS zones and record sets.
Basic troubleshooting
- Ensure that you have the correct IAM roles, as described in Prepare IAM permissions.
- If using gdcloud, you can refresh your authentication token by running gdcloud auth login.
Resource naming and validation (RFC 1123)
Issue: Creation or update fails with invalid argument or validation errors.
Details:
- Resource Names: Names for DNS zones and record sets must conform to RFC 1123.
  - They must contain only lowercase alphanumeric characters or hyphens (-), and must start and end with an alphanumeric character.
- DNS Names: Values for domain names and Fully Qualified Domain Names (FQDNs) must be valid domain names.
- Enum Values: Values like visibility (PUBLIC or PRIVATE) and record types (e.g., A, CNAME) are case-sensitive and must match the exact expected string.
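The character rules for resource names can be checked locally before you submit a resource. The following is a minimal sketch, assuming a POSIX shell with grep. The function name is_valid_resource_name is hypothetical, and it checks only the character rules listed above, not any length limit the API might also enforce.

```shell
# Return 0 if the argument contains only lowercase alphanumerics or '-',
# and starts and ends with an alphanumeric character; return 1 otherwise.
# LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
is_valid_resource_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}
```

For example, is_valid_resource_name "my-zone-1" succeeds, while "My-Zone", "-zone", and "my_zone" are all rejected.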
Unable to delete non-empty zones
Issue: Cannot delete a managed zone; the operation is rejected.
Details: A Managed DNS Zone must be empty of resource record sets before it can be deleted. You must delete all records within the zone first.
Naming and domain conflicts
Issue: Resource creation fails with an error indicating that the name or domain is already in use.
Details:
- Kubernetes Resource Name: The name of the object (e.g., the name in metadata.name) must be unique within the specific project namespace.
- DNS Name (FQDN): The actual domain name (e.g., dnsName for zones or records) must be unique within a given visibility (PUBLIC or PRIVATE). You cannot create two zones or two records with the identical domain name in the same visibility tier.
Troubleshoot DNS resolution issues
This section helps platform administrators and application operators troubleshoot DNS resolution issues when accessing services hosted on Google Distributed Cloud.
Wait for propagation and clear DNS cache
When DNS records are created or updated, the changes might not be visible instantly. This delay is usually caused by:
- Reconciliation Time: The system needs time to process the request and update the DNS server configuration. This usually takes a few minutes.
- DNS Propagation Delay (Caching): DNS resolvers and clients cache records to speed up lookups. Clients might continue to use the old record until the Time to live (TTL) period has passed.
Recommendations:
- Wait: Allow sufficient time for reconciliation and cache expiration.
- Flush Cache: Clear your local DNS cache to force a fresh lookup.
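Instead of re-running a lookup by hand while you wait, you can poll until the expected value appears. The following is a minimal sketch, assuming dig is installed. The function name wait_for_record, its argument order, and the default of 30 attempts at 10-second intervals are illustrative choices, not documented values.

```shell
# wait_for_record NAME EXPECTED [SERVER] [ATTEMPTS] [INTERVAL_SECONDS]
# Polls until one of the answers for NAME equals EXPECTED exactly.
# Returns 0 on success and 1 if every attempt fails.
wait_for_record() {
    name=$1
    expected=$2
    server=${3:-}
    attempts=${4:-30}
    interval=${5:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if [ -n "$server" ]; then
            answers=$(dig +short "@$server" "$name")
        else
            answers=$(dig +short "$name")
        fi
        # grep -Fx compares whole lines literally, so 10.0.0.5 does not match 10.0.0.50.
        if printf '%s\n' "$answers" | grep -Fxq -- "$expected"; then
            echo "attempt $i: $name resolves to $expected"
            return 0
        fi
        echo "attempt $i: got '${answers:-no answer}', retrying in ${interval}s"
        sleep "$interval"
        i=$((i + 1))
    done
    return 1
}
```

A call such as wait_for_record <domain_name> <expected_ip> <dns_server_ip> queries the given server directly, which separates reconciliation delay from local caching.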
Verify local resolver configuration
Ensure that your client is using the correct DNS servers. If you don't know the correct IP address, contact your Infrastructure Operations (IO) team to find the correct DNS server IP.
- Linux/macOS: Check the contents of /etc/resolv.conf.
- Windows: Run ipconfig /all and check the DNS Servers listed.
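On Linux or macOS, the configured servers can be extracted from the file rather than read by eye. The following is a minimal sketch, assuming awk is available. The function name list_nameservers is hypothetical.

```shell
# Print the nameserver addresses from a resolv.conf-style file, one per line.
# Defaults to /etc/resolv.conf when no file is given.
# Comment lines and other directives (search, options) are skipped.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

If the output is a loopback address such as 127.0.0.53, a local stub resolver is probably in use. On systemd-resolved hosts, resolvectl status shows the upstream servers, and on macOS scutil --dns shows the resolver configuration the system actually uses.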
Test resolution using standard tools
Use standard command-line tools to test resolution and identify where the failure occurs.
- Basic Lookup:
  dig <domain_name>
  or
  nslookup <domain_name>
- Query a Specific DNS Server: If you know the IP of the public ManagedDNS server, query it directly to bypass local resolvers:
  dig @<public_dns_server_ip> <domain_name>
- Query a Public Nameserver: To rule out local resolver issues, try querying a public resolver like Google DNS:
  dig @8.8.8.8 <domain_name>
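To see where answers diverge, it can help to send the same query to several servers and print the results side by side. The following is a minimal sketch, assuming dig, sort, and paste are available. The function name compare_answers is hypothetical.

```shell
# compare_answers NAME SERVER...
# Queries each server for NAME and prints "server<TAB>answers" per line.
compare_answers() {
    name=$1
    shift
    for server in "$@"; do
        # +short prints only the answer data; sort keeps the order stable
        # and paste joins multiple answers onto one line.
        answer=$(dig +short "@$server" "$name" | sort | paste -sd ' ' -)
        printf '%s\t%s\n' "$server" "${answer:-(no answer)}"
    done
}
```

For example, compare_answers <domain_name> <public_dns_server_ip> <local_resolver_ip> shows at a glance whether the local resolver returns something different from the authoritative server.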
Analyze DNS response codes
When using dig or nslookup, check the status code in the response:
- NXDOMAIN: The domain name was not found. Check for typos or verify if the record exists.
- SERVFAIL: The server failed to process the query. This often indicates an issue on the server side or communication failure between DNS servers.
- NOERROR: The query was successful, but there might be no answers. Verify if the requested record type (e.g., A, CNAME) exists.
- REFUSED: The server refused to answer the query, likely due to policy or access control settings.
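When scripting checks, the status can be pulled out of the dig header line and mapped to the hints above. The following is a minimal sketch that assumes the usual dig header format (";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: ..."). The function names dig_status and explain_status are hypothetical.

```shell
# Read dig output on stdin and print the status from the first header line.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Print a short hint for a status code, following the list above.
explain_status() {
    case "$1" in
        NXDOMAIN) echo "name not found: check for typos or a missing record" ;;
        SERVFAIL) echo "server-side failure or communication failure between DNS servers" ;;
        NOERROR)  echo "query succeeded: if the answer is empty, check the record type" ;;
        REFUSED)  echo "server refused the query: check policy or access control" ;;
        *)        echo "unrecognized or missing status: $1" ;;
    esac
}
```

A typical use is explain_status "$(dig <domain_name> | dig_status)".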
Check network connectivity
Ensure that network traffic to the DNS server is not blocked.
- DNS uses port 53 for both UDP and TCP.
- Test connectivity using nc (netcat):
  nc -zv <dns_server_ip> 53
- Verify that firewall rules allow traffic on port 53.
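By default, nc -zv checks TCP only, so a UDP-only block can go unnoticed. One way to exercise both transports is to send a real query over each. The following is a minimal sketch, assuming dig is installed. The function name check_dns_transport and the 2-second timeout are illustrative choices.

```shell
# check_dns_transport SERVER NAME
# Sends one query over UDP and one over TCP and reports which got a response.
# Returns 1 if either transport fails.
check_dns_transport() {
    server=$1
    name=$2
    rc=0
    for mode in udp tcp; do
        case "$mode" in
            udp) flag=+notcp ;;
            tcp) flag=+tcp ;;
        esac
        # +time and +tries keep a blocked port from hanging for long.
        # dig exits non-zero when no reply is received from the server.
        if dig "$flag" +time=2 +tries=1 "@$server" "$name" >/dev/null 2>&1; then
            echo "$mode: got a response from $server"
        else
            echo "$mode: no response from $server"
            rc=1
        fi
    done
    return "$rc"
}
```

Any DNS response counts as success here, including NXDOMAIN, because the goal is only to confirm that port 53 is reachable.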
Flush DNS cache
Flushing the DNS cache removes all entries from your local DNS cache. This forces your operating system to query the DNS servers again for the domain name, ensuring you get the most up-to-date record rather than a stale cached one.
- Windows:
  ipconfig /flushdns
  - What it does: Clears the DNS resolver cache maintained by the DNS Client service.
- macOS:
  sudo killall -HUP mDNSResponder
  - What it does: Sends a hangup signal (HUP) to the mDNSResponder process, forcing it to reload and clear its cache.
- Linux (systemd-resolved):
  sudo systemd-resolve --flush-caches or sudo resolvectl flush-caches
  - What it does: Tells the systemd-resolved service to discard its internal DNS cache.
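For runbooks shared across operating systems, the right command can be selected from the output of uname -s. The following is a minimal sketch that only prints the command instead of running it. The function name flush_command_for is hypothetical, and the Linux branch assumes that systemd-resolved is in use.

```shell
# Print the cache-flush command for an OS name as reported by `uname -s`.
# Returns 1 for names it does not recognize.
flush_command_for() {
    case "$1" in
        Linux)  echo "sudo resolvectl flush-caches" ;;   # assumes systemd-resolved
        Darwin) echo "sudo killall -HUP mDNSResponder" ;;
        MINGW*|MSYS*|CYGWIN*) echo "ipconfig /flushdns" ;;  # Windows POSIX-style shells
        *) echo "unknown OS: $1" >&2; return 1 ;;
    esac
}
```

Running flush_command_for "$(uname -s)" prints the matching command for review before you execute it.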
Check resource status with API or gdcloud
If you are a tenant or have access to the GDC API, you can check the status of the resources directly:
Check a DNS zone:
kubectl describe manageddnszone <zone_name> -n <project_namespace>
Check a Resource Record Set:
kubectl describe resourcerecordset <record_name> -n <project_namespace>
Look for conditions and events that might indicate failures in the reconciliation process.
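When the describe output is long, conditions and events can be printed on their own. The following is a minimal sketch that assumes the resources expose conditions under status.conditions with type, status, reason, and message fields, as is conventional for Kubernetes resources; this layout is not confirmed for these resource kinds. The function names show_conditions and show_events are hypothetical.

```shell
# show_conditions KIND NAME NAMESPACE
# Prints one line per condition, assuming the conventional status.conditions layout.
show_conditions() {
    kubectl get "$1" "$2" -n "$3" \
        -o jsonpath='{range .status.conditions[*]}{.type}={.status} ({.reason}): {.message}{"\n"}{end}'
}

# show_events NAME NAMESPACE
# Lists events whose involved object has the given name, oldest first.
show_events() {
    kubectl get events -n "$2" --field-selector "involvedObject.name=$1" \
        --sort-by=.lastTimestamp
}
```

For example, show_conditions manageddnszone <zone_name> <project_namespace> followed by show_events <zone_name> <project_namespace> gives a compact view of the reconciliation state.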
If you have access to the gdcloud command line, you can use the describe sub-command as documented in List DNS records or List DNS zones.