Troubleshoot managed workload identity authentication for GKE

This document describes how to resolve common managed workload identity errors.

If a Google Kubernetes Engine (GKE) workload Pod fails to deploy with the mounted certificates, use the following command to check the Pod status:

kubectl describe pod POD_NAME -n POD_NAMESPACE

The command output includes Pod events generated by the kubelet and the gke-spiffe-controller. Review these events for specific error messages, which might include one of the following:
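Because the sections that follow are grouped by RPC error code, it can help to pull that code out of the event text first. The following is a minimal sketch rather than official tooling: pod_warning_events and extract_rpc_codes are hypothetical helper names, and the pattern assumes that the events contain the "rpc error: code = ..." wording shown in the sample messages in this document.

```shell
# Hypothetical helpers; the names are illustrative, not part of kubectl or GKE.

# List only the Warning events recorded for one Pod.
# Usage: pod_warning_events POD_NAME POD_NAMESPACE
pod_warning_events() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.name=$1,type=Warning"
}

# Read text on stdin and print each distinct RPC error code found in it.
# Assumes the "rpc error: code = CODE" wording used in the sample messages.
extract_rpc_codes() {
  grep -o 'rpc error: code = [A-Za-z]*' | sed 's/.*code = //' | sort -u
}
```

For example, piping the output of kubectl describe pod POD_NAME -n POD_NAMESPACE into extract_rpc_codes prints one line per distinct code, which indicates the section of this document to read.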

PermissionDenied errors

This section describes errors that return a PermissionDenied RPC error code.

PRIVATE_CA_AUTHORIZATION_FAILURE

This error looks similar to the following:

Permission denied while issuing the certificate: failed to issue the certificate from the GKE Auth: rpc error: code = PermissionDenied desc = Permission 'privateca.caPools.get' denied on resource 'privateca.googleapis.com/projects/CA_POOL_PROJECT_NUMBER/locations/REGION/caPools/CA_POOL_ID' (or it may not exist). Ensure that the CaPool exists and you have authorized the Managed Workload Identity to request certificates from the CaPool.

This error occurs because the workload identity pool is missing the CA Service Workload Certificate Requester role (roles/privateca.workloadCertificateRequester) on the subordinate CA pool, or because the CA pool does not exist.

To resolve this error, confirm that the CA pool exists, and then grant the role to the workload identity pool on the subordinate CA pool:

gcloud privateca pools add-iam-policy-binding SUBORDINATE_CA_POOL_ID \
   --project=CA_POOL_PROJECT_ID \
   --location=REGION \
   --role=roles/privateca.workloadCertificateRequester \
   --member="principal://iam.googleapis.com/projects/WIP_PROJECT_NUMBER/name/locations/global/workloadIdentityPools/TRUST_DOMAIN_NAME"

Replace the following:

  • SUBORDINATE_CA_POOL_ID: the subordinate CA pool ID
  • CA_POOL_PROJECT_ID: the project ID of the CA pool
  • REGION: the subordinate CA region
  • WIP_PROJECT_NUMBER: the project number of the workload identity pool project
  • TRUST_DOMAIN_NAME: the name of the trust domain. Depending on your pool type, format the name as follows:
    • Google-managed pool: PROJECT_ID.svc.id.goog
    • Self-managed pool: POOL_NAME.global.POOL_HOST_PROJECT_NUMBER.workload.id.goog
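The member string is easy to mistype, so the following sketch assembles it from the placeholders above and lists who currently holds the role on the CA pool. The function names are hypothetical. The first three functions only format strings. The last one assumes the gcloud privateca pools get-iam-policy command with the common --flatten, --filter, and --format options.

```shell
# Hypothetical helpers; the names are illustrative only.

# Trust domain name for a Google-managed pool.
# Usage: trust_domain_google_managed PROJECT_ID
trust_domain_google_managed() {
  printf '%s.svc.id.goog\n' "$1"
}

# Trust domain name for a self-managed pool.
# Usage: trust_domain_self_managed POOL_NAME POOL_HOST_PROJECT_NUMBER
trust_domain_self_managed() {
  printf '%s.global.%s.workload.id.goog\n' "$1" "$2"
}

# Value for --member in the add-iam-policy-binding command.
# Usage: wip_member WIP_PROJECT_NUMBER TRUST_DOMAIN_NAME
wip_member() {
  printf 'principal://iam.googleapis.com/projects/%s/name/locations/global/workloadIdentityPools/%s\n' "$1" "$2"
}

# List the members that hold the requester role on the CA pool.
# Usage: ca_pool_requesters SUBORDINATE_CA_POOL_ID CA_POOL_PROJECT_ID REGION
ca_pool_requesters() {
  gcloud privateca pools get-iam-policy "$1" \
    --project="$2" --location="$3" \
    --flatten='bindings[].members' \
    --filter='bindings.role=roles/privateca.workloadCertificateRequester' \
    --format='value(bindings.members)'
}
```

If the output of ca_pool_requesters does not include the string printed by wip_member, the binding is likely missing.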

WORKLOAD_IDENTITY_NOT_FOUND, WORKLOAD_POOL_NOT_FOUND, or WORKLOAD_NAMESPACE_NOT_FOUND

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = PermissionDenied desc = Permission denied on resource 'projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_NAME/... (or it may not exist).

This error maps to WORKLOAD_IDENTITY_NOT_FOUND, WORKLOAD_POOL_NOT_FOUND, or WORKLOAD_NAMESPACE_NOT_FOUND. It occurs when the workload identity pool, workload identity pool namespace, or managed identity resource does not exist, or when it is disabled or deleted. The error returns PermissionDenied instead of NotFound so that the response does not reveal whether the resource exists.

To resolve this error, verify the existence and status of your workload identity pool and its sub-resources in IAM. Ensure they are correctly configured and not disabled or deleted.
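As a rough illustration of that check, the sketch below looks up the pool itself and summarizes its status. It covers only the pool, not namespaces or managed identities. The function names are hypothetical, and the sketch assumes that gcloud iam workload-identity-pools describe exposes the state and disabled fields of the pool resource.

```shell
# Hypothetical helper: turn the pool's "state" and "disabled" fields into a hint.
# Usage: interpret_pool_status STATE DISABLED
interpret_pool_status() {
  if [ -z "$1" ]; then
    echo "not found or not visible to the caller"
  elif [ "$1" = "DELETED" ]; then
    echo "deleted"
  elif [ "$2" = "True" ] || [ "$2" = "true" ]; then
    echo "disabled"
  else
    echo "active"
  fi
}

# Look up a workload identity pool and summarize its status.
# Usage: check_pool POOL_NAME PROJECT_ID
check_pool() {
  pool_status="$(gcloud iam workload-identity-pools describe "$1" \
    --location=global --project="$2" \
    --format='value(state,disabled)' 2>/dev/null)"
  # value() separates the fields with whitespace; split them into $1 and $2.
  set -- $pool_status
  interpret_pool_status "$1" "$2"
}
```

Any result other than "active" points to the resource that needs to be created, undeleted, or re-enabled.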

FailedPrecondition errors

This section describes errors that return a FailedPrecondition RPC error code, which typically indicate missing or incorrect configuration.

WORKLOAD_IDENTITY_INVALID_CONFIGURATION

The Pod description can include either of the following messages:

failed to issue the certificate from the GKE Auth: rpc error: code = FailedPrecondition desc = There are no CaPools configured for certificate issuance. Ensure you have added certificate issuance configuration to the workload identity pool 'projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/WI_POOL_NAME', which contains at least one CaPool.

or

failed to issue the certificate from the GKE Auth: rpc error: code = FailedPrecondition desc = Missing Certificate issuance configuration in the Trust Domain. Ensure you have added certificate issuance configuration to the workload identity pool 'projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/WI_POOL_NAME' which contains at least one CaPool.

This error (WORKLOAD_IDENTITY_INVALID_CONFIGURATION) occurred because the workload identity pool wasn't configured with a certificate issuance configuration (CIC) or because the configured CIC doesn't contain at least one CA pool.

To resolve this error, create a CIC that contains at least one CA pool and use it to update the workload identity pool.
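Before creating a new CIC, it can be useful to see what the pool currently has. The following sketch assumes that the pool resource reports its configuration under a field named inlineCertificateIssuanceConfig.caPools; treat that field path and the function names as assumptions to verify against your gcloud version.

```shell
# Hypothetical helpers; the field path below is an assumption, not confirmed
# by this document.

# Print the CA pools listed in the pool's CIC, if any.
# Usage: pool_ca_pools WI_POOL_NAME PROJECT_ID
pool_ca_pools() {
  gcloud iam workload-identity-pools describe "$1" \
    --location=global --project="$2" \
    --format='value(inlineCertificateIssuanceConfig.caPools)'
}

# Report whether the CIC lists at least one CA pool.
# Usage: check_cic WI_POOL_NAME PROJECT_ID
check_cic() {
  if [ -n "$(pool_ca_pools "$1" "$2")" ]; then
    echo "CIC lists at least one CA pool"
  else
    echo "no CA pools configured in the CIC"
  fi
}
```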

CERTIFICATE_AUTHORITY_NOT_FOUND

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = FailedPrecondition desc = Failed to request certificates from the CaPool... There are no enabled CAs in the CaPool. Ensure that there is at least one enabled Certificate Authority to issue a certificate.

This error (CERTIFICATE_AUTHORITY_NOT_FOUND) indicates that no enabled Certificate Authority was found for certificate issuance in the customer-configured CA pool.

To resolve this error, verify that a Certificate Authority exists in the configured CA pool and is enabled for certificate issuance.
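One way to carry out this check from the command line is sketched below. It assumes that the CAs in the pool are subordinate CAs, so it uses gcloud privateca subordinates list with a filter on the ENABLED state. The function names are hypothetical.

```shell
# Hypothetical helpers; assumes the pool holds subordinate CAs.

# Print the names of enabled CAs in a CA pool.
# Usage: enabled_cas CA_POOL_ID CA_POOL_PROJECT_ID REGION
enabled_cas() {
  gcloud privateca subordinates list \
    --pool="$1" --project="$2" --location="$3" \
    --filter='state=ENABLED' --format='value(name)'
}

# Report whether the CA pool has at least one enabled CA.
# Usage: check_enabled_cas CA_POOL_ID CA_POOL_PROJECT_ID REGION
check_enabled_cas() {
  cas="$(enabled_cas "$1" "$2" "$3")"
  if [ -n "$cas" ]; then
    printf 'enabled CAs:\n%s\n' "$cas"
  else
    echo "no enabled CA in pool $1"
  fi
}
```

If no CA is enabled, enabling an existing one (for example, with gcloud privateca subordinates enable) or creating a new CA in the pool should clear the error.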

CA_POOL_REGION_MISMATCH

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = FailedPrecondition desc = Unable to find a CaPool in the workload's region. Ensure you have setup a subordinate CaPool in 'WORKLOAD_REGION' and added it to the certificate issuance configuration of the Workload Identity Pool...

This error (CA_POOL_REGION_MISMATCH) indicates that no CA pool was configured for the workload's specific region.

To resolve this error, either configure an additional CA pool for the workload's region in the workload identity pool or deploy the workload in a region that already has a CA pool configured.
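To compare the workload's region with the regions that have a CA pool, a sketch along the following lines might help. It assumes that the cluster nodes carry the standard topology.kubernetes.io/region label. The function names are hypothetical, and the list of CA pool regions is something you supply from your own CIC.

```shell
# Hypothetical helpers; the names are illustrative only.

# Derive a region from a zone name, for example us-central1-a.
# Usage: zone_to_region ZONE
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

# Print the distinct regions of the cluster's nodes, read from the
# topology.kubernetes.io/region node label.
node_regions() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.labels.topology\.kubernetes\.io/region}{"\n"}{end}' \
    | sort -u
}

# Succeed if REGION appears in a space-separated list of CA pool regions.
# Usage: region_covered REGION "REGION_A REGION_B ..."
region_covered() {
  for r in $2; do
    [ "$r" = "$1" ] && return 0
  done
  return 1
}
```

For example, region_covered us-central1 "us-east1 europe-west1" fails, which matches the situation that this error describes.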

ResourceExhausted errors

This section describes errors that return a ResourceExhausted RPC error code.

PRIVATE_CA_QUOTA_EXCEEDED

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = ResourceExhausted desc = Quota exceeded for quota metric 'QUOTA_METRIC' and limit 'QUOTA_LIMIT' of service 'privateca.googleapis.com' for consumer 'project_number:PROJECT_NUMBER'.

This error (PRIVATE_CA_QUOTA_EXCEEDED) occurs when your workload identity pool is configured to use a custom CA and an attempt to issue a certificate exceeds a CA Service quota or limit. This error does not occur if you use the Google-managed default CA.

The error message contains the following values:

  • QUOTA_METRIC and QUOTA_LIMIT: the specific quota metric and limit that were exceeded. For example, privateca.googleapis.com/enterprise_certificate_issuance and CertsPerEnterpriseCaPerMinute.

To resolve this error, review your CA Service quotas and limits in the Google Cloud console, and request a quota increase if necessary.

InvalidArgument errors

This section describes errors that return an InvalidArgument RPC error code.

PRIVATE_CA_KEY_ALGORITHM_MISMATCH

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = InvalidArgument desc = Public key algorithm is not permitted by the CaPool's issuance policy. Ensure that the requested keyAlgorithm 'KEY_ALGO_IN_CSR' is permitted by the CAs in the CaPool 'privateca.googleapis.com/projects/PROJECT_NAME/locations/WORKLOAD_REGION/caPools/CA_POOL_NAME'.

This error (PRIVATE_CA_KEY_ALGORITHM_MISMATCH) indicates that the key algorithm specified in the certificate request is not permitted by the configured CA pool's issuance policy. This error occurs only when you configure a custom key algorithm in the certificate issuance configuration (CIC) and update the workload identity pool with it.

To resolve this error, verify that the algorithm configured in the CIC of the workload identity pool is compatible with the algorithms supported by the CA pool's issuance policy.
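The two sides of that comparison can be printed next to each other, as in the sketch below. It assumes that the CA pool exposes its restrictions under issuancePolicy.allowedKeyTypes and that the workload identity pool reports its algorithm under inlineCertificateIssuanceConfig.keyAlgorithm. Both field paths and the function names are assumptions to verify against your gcloud version.

```shell
# Hypothetical helpers; the field paths are assumptions.

# Print the key types that the CA pool's issuance policy allows.
# Usage: allowed_key_types CA_POOL_NAME PROJECT_ID WORKLOAD_REGION
allowed_key_types() {
  gcloud privateca pools describe "$1" \
    --project="$2" --location="$3" \
    --format='yaml(issuancePolicy.allowedKeyTypes)'
}

# Print the key algorithm configured in the workload identity pool's CIC.
# Usage: cic_key_algorithm WI_POOL_NAME PROJECT_ID
cic_key_algorithm() {
  gcloud iam workload-identity-pools describe "$1" \
    --location=global --project="$2" \
    --format='value(inlineCertificateIssuanceConfig.keyAlgorithm)'
}
```

If the algorithm printed by cic_key_algorithm has no matching entry in the output of allowed_key_types, either the CIC or the issuance policy needs to change.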

Unknown errors

This section describes errors that return an Unknown RPC error code.

UNKNOWN_PRIVATE_CA_CLIENT_ERROR

This error looks similar to the following:

failed to issue the certificate from the GKE Auth: rpc error: code = Unknown desc = Failed to get certificates using the CaPool... PRIVATE_CA_ERROR_MESSAGE

This error (UNKNOWN_PRIVATE_CA_CLIENT_ERROR) indicates that an unknown client error, such as an invalid argument provided to the API, has occurred when calling CA Service.

To resolve this error, review the specific error message returned from the CA Service API to identify the invalid argument or client-side configuration issue and correct it.

Trust bundle propagation latency

When you update or rotate a self-managed CA, there is a propagation latency of approximately five minutes before the updates take effect on the GKE clusters. This is because the gke-spiffe-controller checks membership and fetches trust bundles every five minutes.

This latency does not occur if you use the Google-managed CA.

If you update a self-managed CA, wait at least five minutes for the changes to take effect.
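In a script that updates the CA and then redeploys workloads, the wait can be computed rather than guessed. The following sketch assumes the five-minute interval stated above and uses a hypothetical function name. Both arguments are Unix epoch seconds.

```shell
# Hypothetical helper: seconds still to wait before the update is expected
# to have propagated, assuming a 300-second refresh interval.
# Usage: propagation_wait_remaining UPDATED_EPOCH NOW_EPOCH
propagation_wait_remaining() {
  remaining=$(( $1 + 300 - $2 ))
  # Never return a negative wait.
  [ "$remaining" -lt 0 ] && remaining=0
  printf '%s\n' "$remaining"
}
```

For example, if UPDATED holds the time of the CA update, sleep "$(propagation_wait_remaining "$UPDATED" "$(date +%s)")" pauses for whatever part of the five minutes is left. Because the latency is approximate, treat the result as a lower bound.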