This document describes how to set up an application-based health check to autoheal VMs in a managed instance group (MIG). It also describes how to do the following: use a health check without autohealing, remove a health check, view autohealing policy, and check the health state of each VM.
You can configure an application-based health check to verify that your application on a VM is responding as expected. If the health check that you configure detects that your application on a VM isn't responding, then the MIG marks that VM as unhealthy and repairs it by default. Repairing a VM based on an application-based health check is called autohealing.
You can also turn off autohealing in a MIG so that you can use a health check without triggering the repairs for unhealthy VMs.
To learn more about repairs in a MIG, see About repairing VMs for high availability.
Before you begin
- If you haven't already, set up authentication.
  Authentication verifies your identity for access to Google Cloud services and APIs. To run
  code or samples from a local development environment, you can authenticate to
  Compute Engine by selecting one of the following options:
  Select the tab for how you plan to use the samples on this page:
  Console
  When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
  gcloud
  - Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
    gcloud init
    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  - Set a default region and zone.
  Terraform
  To use the Terraform samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
  - Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  - If you're using a local shell, then create local authentication credentials for your user account:
    gcloud auth application-default login
    You don't need to do this if you're using Cloud Shell.
    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
  For more information, see Set up authentication for a local development environment.
  REST
  To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
  - Install the Google Cloud CLI. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Pricing
When you set up an application-based health check, Compute Engine by default writes a log entry in Cloud Logging whenever a VM's health state changes. Cloud Logging provides a free monthly allotment, after which logging is priced by data volume. To avoid costs, you can disable the health state change logs.
Set up an application-based health check and autohealing
To set up an application-based health check and autohealing in a MIG, you must do the following:
- Create a health check, if you haven't already.
- Configure an autohealing policy in the MIG to apply the health check.
Create a health check
You can apply a single health check to a maximum of 50 MIGs. If you have more than 50 groups, create multiple health checks.
The following example shows how to create a health check for autohealing. You can create either a regional or a global health check for autohealing in MIGs. In this example, you create a global health check that looks for a web server response on port 80. To enable the health check probes to reach the web server, configure a firewall rule.
Console
- Create a health check for autohealing that is more conservative than a load balancing health check.
  For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if the health check returns successfully once. The VM is marked as unhealthy if the health check returns unsuccessfully 3 consecutive times.
  - In the Google Cloud console, go to the Create a health check page.
  - Give the health check a name, such as example-check.
  - Select a Scope. You can select either Regional or Global. For this example, select Global.
  - For Protocol, make sure that HTTP is selected.
  - For Port, enter 80.
  - In the Health criteria section, provide the following values:
    - For Check interval, enter 5.
    - For Timeout, enter 5.
    - Set a Healthy threshold to determine how many consecutive successful health checks must be returned before an unhealthy VM is marked as healthy. Enter 1 for this example.
    - Set an Unhealthy threshold to determine how many consecutive unsuccessful health checks must be returned before a healthy VM is marked as unhealthy. Enter 3 for this example.
  - Click Create to create the health check.
- Create a firewall rule to allow health check probes to connect to your app.
  Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your network firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
  - In the Google Cloud console, go to the Firewall policies page.
  - Click Create firewall rule.
  - Enter a Name for the firewall rule. For example, allow-health-check.
  - For Network, select the default network.
  - For Targets, select All instances in the network.
  - For Source filter, select IPv4 ranges.
  - For Source IPv4 ranges, enter 130.211.0.0/22 and 35.191.0.0/16.
  - In Protocols and ports, select Specified protocols and ports and do the following:
    - Select TCP.
    - In the Ports field, enter 80.
  - Click Create.
gcloud
- Create a health check for autohealing that is more conservative than a load balancing health check.
  For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following command creates a global health check.
  gcloud compute health-checks create http example-check --port 80 \
      --check-interval 30s \
      --healthy-threshold 1 \
      --timeout 10s \
      --unhealthy-threshold 3 \
      --global
- Create a firewall rule to allow health check probes to connect to your app.
  Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network, and its VMs listen on port 80. If port 80 isn't already open on the default network, create a firewall rule.
  gcloud compute firewall-rules create allow-health-check \
      --allow tcp:80 \
      --source-ranges 130.211.0.0/22,35.191.0.0/16 \
      --network default
Terraform
- Create a health check using the google_compute_http_health_check resource.
  For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following request creates a global health check.
- Create a firewall using the google_compute_firewall resource.
  Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
REST
- Create a health check for autohealing that is more conservative than a load balancing health check.
  For example, create a health check that looks for a response on port 80 and that can tolerate some failure before it marks VMs as UNHEALTHY and causes them to be recreated. In this example, a VM is marked as healthy if it returns successfully once. The VM is marked as unhealthy if it returns unsuccessfully 3 consecutive times. The following request creates a global health check.
  POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/global/healthChecks
  {
    "name": "example-check",
    "type": "http",
    "port": 80,
    "checkIntervalSec": 30,
    "healthyThreshold": 1,
    "timeoutSec": 10,
    "unhealthyThreshold": 3
  }
- Create a firewall rule to allow health check probes to connect to your app.
  Health check probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16, so make sure your firewall rules allow the health check to connect. For this example, the MIG uses the default network and its VMs are listening on port 80. If port 80 is not already open on the default network, create a firewall rule.
  POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/global/firewalls
  {
    "name": "allow-health-check",
    "network": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/networks/default",
    "sourceRanges": [
      "130.211.0.0/22",
      "35.191.0.0/16"
    ],
    "allowed": [
      {
        "ports": [
          "80"
        ],
        "IPProtocol": "tcp"
      }
    ]
  }
  Replace PROJECT_ID with your project ID.
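When you review firewall logs or a packet capture, it can help to confirm whether a source address falls inside the probe ranges given in this section. The following is a minimal Python sketch that uses only the standard library. The function name is_health_check_probe and the sample addresses are hypothetical; the two ranges are the ones quoted above.

```python
import ipaddress

# Probe source ranges quoted in this section.
PROBE_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_health_check_probe(source_ip: str) -> bool:
    """Return True if source_ip is inside one of the probe ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in PROBE_RANGES)


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to exercise both outcomes.
    for ip in ("130.211.1.5", "35.191.200.9", "130.211.4.1", "10.0.0.1"):
        print(ip, is_health_check_probe(ip))
```

If traffic from these ranges never appears on port 80, the firewall rule is a likely place to look first.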
Configure an autohealing policy in a MIG
In a MIG, you can set up only one autohealing policy to apply a health check.
Before configuring an autohealing policy, if you don't have a health check already, then create one. You can use either a regional or a global health check for autohealing in MIGs. A regional health check reduces cross-region dependencies and helps to achieve data residency, whereas a global health check is convenient if you want to use the same health check for MIGs in multiple regions.
If you want to prevent inadvertently triggering autohealing while setting up a new health check or want to use a health check without autohealing, then see Configure health check without autohealing. You can also turn off autohealing after you configure a health check in the MIG.
To configure an autohealing policy, select one of the following options:
Console
- In the Google Cloud console, go to the Instance groups page. 
- Under the Name column of the list, click the name of the MIG in which you want to apply the health check. 
- Click Edit to modify this MIG. 
- Click Instance lifecycle and autohealing to expand the section.
- In the Autohealing section, for the Health check, select a global or a regional health check.
- For the Initial delay, use the default value or modify as needed.
  The initial delay is the number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. In the console, the default value is 300 seconds.
- Click Save to apply your changes. 
gcloud
To configure autohealing policy in an existing MIG, use the
update command. For example, use the following command to configure
autohealing policy in an
existing zonal MIG:
gcloud compute instance-groups managed update MIG_NAME \
    --health-check HEALTH_CHECK_URL \
    --initial-delay INITIAL_DELAY \
    --zone ZONE
To configure autohealing policy when creating a MIG, use the
create command.
For example, use the following command to configure autohealing policy when
creating a zonal MIG:
gcloud compute instance-groups managed create MIG_NAME \
    --size SIZE \
    --template INSTANCE_TEMPLATE_URL \
    --health-check HEALTH_CHECK_URL \
    --initial-delay INITIAL_DELAY \
    --zone ZONE
Replace the following:
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
  - For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
  - For a global instance template: INSTANCE_TEMPLATE_ID
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use the --region flag.
Terraform
To configure an autohealing policy in a MIG, use the auto_healing_policies
block.
The following sample configures autohealing policy in a zonal MIG. For more
information about the resource used in the sample, see google_compute_instance_group_manager. For a
regional MIG, use the google_compute_region_instance_group_manager resource.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
REST
To configure autohealing policy in an existing MIG, use the patch method as
follows:
- For a zonal MIG, use the instanceGroupManagers.patch method.
- For a regional MIG, use the regionInstanceGroupManagers.patch method.
For example, make the following call to set up autohealing in an existing zonal MIG:
  PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
  {
    "autoHealingPolicies": [
      {
        "healthCheck": "HEALTH_CHECK_URL",
        "initialDelaySec": INITIAL_DELAY
      }
    ]
  }
To configure autohealing policy when creating a MIG, use the insert
method as follows:
- For a zonal MIG, use the instanceGroupManagers.insert method.
- For a regional MIG, use the regionInstanceGroupManagers.insert method.
For example, make the following call to configure autohealing policy when creating a zonal MIG:
  POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
  {
    "name": "MIG_NAME",
    "targetSize": SIZE,
    "instanceTemplate": "INSTANCE_TEMPLATE_URL",
    "autoHealingPolicies": [
      {
        "healthCheck": "HEALTH_CHECK_URL",
        "initialDelaySec": INITIAL_DELAY
      }
    ]
  }
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
  - For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
  - For a global instance template: INSTANCE_TEMPLATE_ID
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION in the URL.
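If you assemble the request body in a script, a small helper can enforce the documented 0 to 3600 second range for the initial delay before the request is sent. The following Python sketch only builds the JSON body shown in the REST examples; it does not call the API. The function name autohealing_policy_body is hypothetical, and the health check URL is the example value used on this page.

```python
import json


def autohealing_policy_body(health_check_url: str, initial_delay_sec: int = 0) -> dict:
    """Build a request body that carries one autohealing policy."""
    # The initial delay must be between 0 and 3600 seconds.
    if not 0 <= initial_delay_sec <= 3600:
        raise ValueError("initial delay must be between 0 and 3600 seconds")
    return {
        "autoHealingPolicies": [
            {
                "healthCheck": health_check_url,
                "initialDelaySec": initial_delay_sec,
            }
        ]
    }


if __name__ == "__main__":
    body = autohealing_policy_body(
        "projects/example-project/global/healthChecks/example-health-check",
        initial_delay_sec=300,
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary matches the body of the patch call; for the insert call, you would also add the name, targetSize, and instanceTemplate fields.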
After the autohealing setup is complete, it can take 10 minutes before autohealing begins monitoring VMs in the group. After the monitoring begins, Compute Engine begins to mark VMs as healthy (or else recreates them) based on your autohealing configuration. For example, if you configure an initial delay of 5 minutes, a health check interval of 1 minute, and a healthy threshold of 1 check, the timeline looks like the following:
- 10 minute delay before autohealing begins monitoring VMs in the group
- + 5 minutes for the configured initial delay
- + 1 minute for the check interval * healthy threshold (60s * 1)
- = 16 minutes before the VM is either marked as healthy or is recreated
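The timeline arithmetic above can be sketched in a few lines of Python so that you can try other values. This is an illustration of the sum shown in the list, not an official formula; the function and parameter names are hypothetical.

```python
def minutes_until_first_verdict(
    monitoring_delay_min: float,
    initial_delay_min: float,
    check_interval_sec: float,
    healthy_threshold: int,
) -> float:
    """Add up the delays from the timeline, in minutes."""
    # Check interval * healthy threshold, converted from seconds to minutes.
    probing_min = check_interval_sec * healthy_threshold / 60
    return monitoring_delay_min + initial_delay_min + probing_min


if __name__ == "__main__":
    # Values from the example: 10 min + 5 min + (60s * 1).
    total = minutes_until_first_verdict(10, 5, 60, 1)
    print(f"{total:g} minutes")  # prints "16 minutes"
```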
Configure a health check without autohealing
You can turn off autohealing in a MIG and use the configured health check for monitoring your application health or you can implement your own repair logic. Turning off autohealing in a MIG doesn't affect the functioning of the health check. The health check continues to probe the application and provides the VM health states. However, the MIG will no longer repair unhealthy VMs.
To configure a health check without autohealing, select one of the following options.
Console
- In the Google Cloud console, go to the Instance groups page. 
- Under the Name column of the list, click the name of the MIG in which you want to apply the health check. 
- Click Edit to modify this MIG. 
- Click Instance lifecycle and autohealing to expand the section.
- In the Autohealing section, for the Health check, select a global or a regional health check.
- For the Initial delay, use the default value or modify as needed.
  The initial delay is the number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. In the console, the default value is 300 seconds.
- In the On failed health check list, select No action.
- Click Save to apply your changes. 
gcloud
To configure a health check without autohealing, when you specify the health
check configuration you must also set the
--action-on-vm-failed-health-check flag to do-nothing as follows:
- In an existing MIG, use the beta update command.
  For example, use the following command in an existing zonal MIG:
  gcloud beta compute instance-groups managed update MIG_NAME \
      --health-check HEALTH_CHECK_URL \
      --initial-delay INITIAL_DELAY \
      --action-on-vm-failed-health-check do-nothing \
      --zone ZONE
- When creating a MIG, use the beta create command.
  For example, use the following command when creating a zonal MIG:
  gcloud beta compute instance-groups managed create MIG_NAME \
      --size SIZE \
      --template INSTANCE_TEMPLATE_URL \
      --health-check HEALTH_CHECK_URL \
      --initial-delay INITIAL_DELAY \
      --action-on-vm-failed-health-check do-nothing \
      --zone ZONE
Replace the following:
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
  - For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
  - For a global instance template: INSTANCE_TEMPLATE_ID
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use the --region flag.
REST
To configure a health check without autohealing, when you specify the health
check configuration you must also set the
onFailedHealthCheck field to DO_NOTHING as follows:
- In an existing MIG, use the beta patch method as follows:
  - For a zonal MIG, use the beta instanceGroupManagers.patch method.
  - For a regional MIG, use the beta regionInstanceGroupManagers.patch method.
  For example, make the following call in an existing zonal MIG:
  PATCH https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
  {
    "autoHealingPolicies": [
      {
        "healthCheck": "HEALTH_CHECK_URL",
        "initialDelaySec": INITIAL_DELAY
      }
    ],
    "instanceLifecyclePolicy": {
      "onFailedHealthCheck": "DO_NOTHING"
    }
  }
- When creating a MIG, use the beta insert method as follows:
  - For a zonal MIG, use the beta instanceGroupManagers.insert method.
  - For a regional MIG, use the beta regionInstanceGroupManagers.insert method.
  For example, make the following call when creating a zonal MIG:
  POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers
  {
    "name": "MIG_NAME",
    "targetSize": SIZE,
    "instanceTemplate": "INSTANCE_TEMPLATE_URL",
    "autoHealingPolicies": [
      {
        "healthCheck": "HEALTH_CHECK_URL",
        "initialDelaySec": INITIAL_DELAY
      }
    ],
    "instanceLifecyclePolicy": {
      "onFailedHealthCheck": "DO_NOTHING"
    }
  }
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- SIZE: The number of VMs in the group.
- INSTANCE_TEMPLATE_URL: The URL of the instance template that you want to use to create VMs in the MIG. The URL can contain either the ID or name of the instance template. Specify one of the following values:
  - For a regional instance template: projects/PROJECT_ID/regions/REGION/instanceTemplates/INSTANCE_TEMPLATE_ID
  - For a global instance template: INSTANCE_TEMPLATE_ID
- HEALTH_CHECK_URL: The partial URL of the health check that you want to set up for autohealing. For example:
  - Regional health check: projects/example-project/regions/us-central1/healthChecks/example-health-check.
  - Global health check: projects/example-project/global/healthChecks/example-health-check.
- INITIAL_DELAY: The number of seconds that a new VM takes to initialize and run its startup script. During a VM's initial delay period, the MIG ignores unsuccessful health checks because the VM might be in the startup process. This prevents the MIG from prematurely recreating a VM. If the health check receives a healthy response during the initial delay, it indicates that the startup process is complete and the VM is ready. The initial delay timer starts when the VM's currentAction field changes to VERIFYING. The value of initial delay must be between 0 and 3600 seconds. The default value is 0.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION in the URL.
After configuring the health check, you can monitor the VM health states to confirm that the health check is working as expected. If you want the MIG to repair unhealthy VMs, you can turn on autohealing.
Remove a health check
You can remove a health check configured in an autohealing policy as follows:
Console
- In the Google Cloud console, go to the Instance groups page. 
- Click the name of the MIG from which you want to remove the health check. 
- Click Edit to modify this MIG. 
- Click Instance lifecycle and autohealing to expand the section. 
- In the Autohealing section, for Health check, select No health check. 
- Click Save to apply the changes. 
gcloud
To remove the health check configuration in an autohealing policy, use the --clear-autohealing flag in the update command as follows:
gcloud compute instance-groups managed update MIG_NAME \
    --clear-autohealing
Replace MIG_NAME with the name of a MIG.
REST
To remove the health check configuration in an autohealing policy, set the autohealing policy to an empty value.
- For a zonal MIG, use the instanceGroupManagers.patch method.
- For a regional MIG, use the regionInstanceGroupManagers.patch method.
For example, to remove the health check in a zonal MIG, make the following request:
PATCH https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
{
  "autoHealingPolicies": [
    {}
  ]
}
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION.
View autohealing policy in a MIG
You can view the autohealing policy of a MIG as follows:
Console
- In the Google Cloud console, go to the Instance groups page. 
- Click the name of the MIG whose autohealing policy you want to view.
- Go to the Details tab.
  The VM instance lifecycle section displays the health check and the initial delay configured in the autohealing policy.
gcloud
To view the autohealing policy in a MIG, use the following command:
gcloud compute instance-groups managed describe MIG_NAME \
    --format="(autoHealingPolicies)"
Replace MIG_NAME with the name of a MIG.
The following is a sample output:
autoHealingPolicies:
  healthCheck: https://www.googleapis.com/compute/v1/projects/example-project/global/healthChecks/example-health-check
  initialDelaySec: 300
REST
To view the autohealing policy in a MIG, use the REST methods as follows:
- For a zonal MIG, use the instanceGroupManagers.get method.
- For a regional MIG, use the regionInstanceGroupManagers.get method.
For example, make the following request to view the autohealing policy in a zonal MIG:
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME
In the response body, check for the autoHealingPolicies[] object.
The following is a sample response:
{
  ...
  "autoHealingPolicies": [
    {
      "healthCheck": "https://www.googleapis.com/compute/v1/projects/example-project/global/healthChecks/example-health-check",
      "initialDelaySec": 300
    }
  ],
  ...
}
Replace the following:
- PROJECT_ID: Your project ID.
- MIG_NAME: The name of the MIG in which you want to set up autohealing.
- ZONE: The zone where the MIG is located. For a regional MIG, use regions/REGION.
Check the status
After you set up an application-based health check in a MIG, you can verify that a VM is running and that its application is responding in the following ways:
Check whether VMs are healthy
If you have configured an application-based health check in your MIG, you can review the health state of each managed instance.
Inspect your managed instance health states to:
- Identify unhealthy VMs that are not being repaired. A VM might not be repaired immediately even if it has been diagnosed as unhealthy in the following situations:
  - The VM is still booting, and its initial delay has not passed.
  - A significant share of unhealthy instances is being repaired. The MIG delays further autohealing to ensure that the group keeps running a subset of instances.
- Detect health check configuration errors. For example, you can detect misconfigured firewall rules or an invalid application health checking endpoint if the instance reports a health state of TIMEOUT.
- Determine the initial delay value to configure by measuring the amount of time between when the VM transitions to a RUNNING status and when the VM transitions to a HEALTHY health state. You can measure this gap by polling the list-instances method or by observing the time between the instances.insert operation and the first healthy signal received.
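The initial delay measurement described in the last item can be sketched as follows, assuming you have already recorded the two timestamps yourself, for example by polling. The function name suggest_initial_delay, the sample timestamps, and the 1.5x safety margin are all hypothetical choices for illustration, not documented recommendations. The only rule taken from this page is the 3600 second upper limit.

```python
import math
from datetime import datetime, timezone


def suggest_initial_delay(running_at: datetime, healthy_at: datetime, margin: float = 1.5) -> int:
    """Suggest an initial delay in seconds from one observed startup.

    The margin multiplier is an assumption, not a documented value.
    """
    observed = (healthy_at - running_at).total_seconds()
    if observed < 0:
        raise ValueError("healthy_at must not be earlier than running_at")
    # The initial delay cannot exceed 3600 seconds.
    return min(3600, math.ceil(observed * margin))


if __name__ == "__main__":
    # Hypothetical timestamps: the VM became HEALTHY 200 seconds after RUNNING.
    running = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    healthy = datetime(2024, 1, 1, 12, 3, 20, tzinfo=timezone.utc)
    print(suggest_initial_delay(running, healthy))  # 200 s observed, prints 300
```

Measuring several VMs and using the slowest startup as the input is likely to give a safer value than a single observation.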
Use the console, the gcloud command-line tool, or REST to view health states.
Console
- In the Google Cloud console, go to the Instance groups page. 
- Under the Name column of the list, click the name of the MIG that you want to examine. A page opens with the instance group properties and a list of VMs that are included in the group. 
- If a VM is unhealthy, you can see its health state in the Health check status column. 
gcloud
Use the list-instances sub-command.
gcloud compute instance-groups managed list-instances MIG_NAME \
    --zone ZONE
The command gives an output similar to the following. The HEALTH_STATE
field shows each VM's health state.
NAME: igm-with-hc-fvz6
ZONE: europe-west1-b
STATUS: RUNNING
HEALTH_STATE: HEALTHY
ACTION: NONE
INSTANCE_TEMPLATE: my-template
VERSION_NAME:
LAST_ERROR:

NAME: igm-with-hc-gtz3
ZONE: europe-west1-b
STATUS: RUNNING
HEALTH_STATE: HEALTHY
ACTION: NONE
INSTANCE_TEMPLATE: my-template
VERSION_NAME:
LAST_ERROR:
Replace the following:
- MIG_NAME: The name of the MIG.
- ZONE: The zone where the MIG is located. For a regional MIG, use --region REGION.
REST
For a regional MIG, construct a POST request to the listManagedInstances method:
POST https://compute.googleapis.com/compute/v1/projects/project-id/regions/region/instanceGroupManagers/MIG_NAME/listManagedInstances
For a zonal MIG, use the zonal MIG listManagedInstances method:
POST https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/instanceGroupManagers/MIG_NAME/listManagedInstances
The request returns a response similar to the following, which
includes an instanceHealth field for each managed instance.
{
  "managedInstances": [
    {
      "instance": "https://www.googleapis.com/compute/v1/projects/project-id/zones/zone/instances/igm-with-hc-fvz6",
      "instanceStatus": "RUNNING",
      "currentAction": "NONE",
      "id": "6159431761228150698",
      "version": {
        "instanceTemplate": "https://www.googleapis.com/compute/v1/projects/project-id/global/instanceTemplates/my-template"
      },
      "instanceHealth": [
        {
          "healthCheck": "https://www.googleapis.com/compute/v1/projects/project-id/global/healthChecks/example-check-01",
          "detailedHealthState": "HEALTHY"
        }
      ],
      "name": "igm-with-hc-fvz6"
    },
    {
      "instance": "https://www.googleapis.com/compute/v1/projects/project-id/zones/zone/instances/igm-with-hc-gtz3",
      "instanceStatus": "RUNNING",
      "currentAction": "NONE",
      "id": "6622324799312181783",
      "version": {
        "instanceTemplate": "https://www.googleapis.com/compute/v1/projects/project-id/global/instanceTemplates/my-template"
      },
      "instanceHealth": [
        {
          "healthCheck": "https://www.googleapis.com/compute/v1/projects/project-id/global/healthChecks/example-check-01",
          "detailedHealthState": "HEALTHY"
        }
      ],
      "name": "igm-with-hc-gtz3"
    }
  ]
}
Health states
The following VM health states are available:
- HEALTHY: The VM is reachable, a connection to the application health checking endpoint can be established, and the response conforms to the requirements defined by the health check.
- DRAINING: The VM is being drained. Existing connections to the VM have time to complete, but new connections are being refused.
- UNHEALTHY: The VM is reachable, but does not conform to the requirements defined by the health check.
- TIMEOUT: The VM is unreachable, a connection to the application health checking endpoint cannot be established, or the server on a VM does not respond within the specified timeout. For example, this may be caused by misconfigured firewall rules or an overloaded server application on a VM.
- UNKNOWN: The health checking system is not aware of the VM or its health is not known at the moment. It can take 10 minutes for monitoring to begin on new VMs in a MIG.
New VMs return an UNHEALTHY state until they are verified by the
health checking system.
Whether a VM is repaired depends on its health state:
- If a VM has a health state of UNHEALTHY or TIMEOUT, and it has passed its initialization period, then the MIG immediately attempts to repair it.
- If a VM has a health state of UNKNOWN, then the MIG doesn't repair it immediately. This is to prevent an unnecessary repair of a VM for which the health checking signal is temporarily unavailable.
Autohealing attempts can be delayed if:
- A VM remains unhealthy after multiple consecutive repairs.
- A significant overall share of unhealthy VMs exists in the group.
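The repair rules above can be summarized in a short Python sketch. It is illustrative only: the names `should_repair`, `health_states`, and `past_initialization_period` are hypothetical, and the code mirrors just the rules listed here rather than the MIG's actual implementation. It assumes that HEALTHY and DRAINING don't trigger a repair, and it doesn't model the cases in which autohealing attempts are delayed.

```python
import json

# Health states that, per the rules above, make a VM a repair candidate.
REPAIRABLE_STATES = {"UNHEALTHY", "TIMEOUT"}


def should_repair(health_state: str, past_initialization_period: bool) -> bool:
    """Return True if the rules above say the MIG attempts a repair.

    UNKNOWN is deliberately excluded: the health signal might only be
    temporarily unavailable.
    """
    return health_state in REPAIRABLE_STATES and past_initialization_period


def health_states(response_text: str) -> dict:
    """Map each VM name to the detailedHealthState values found in a
    listManagedInstances response body."""
    response = json.loads(response_text)
    return {
        vm["name"]: [h["detailedHealthState"] for h in vm.get("instanceHealth", [])]
        for vm in response.get("managedInstances", [])
    }


sample = (
    '{"managedInstances": [{"name": "igm-with-hc-fvz6", '
    '"instanceHealth": [{"detailedHealthState": "HEALTHY"}]}]}'
)
print(health_states(sample))           # {'igm-with-hc-fvz6': ['HEALTHY']}
print(should_repair("TIMEOUT", True))  # True
print(should_repair("UNKNOWN", True))  # False
```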
We want to learn about your use cases, challenges, or feedback about VM health state values. You can share your feedback with our team at mig-discuss@google.com.
Check current actions on VMs
When a MIG is in the process of creating a VM instance, the MIG sets
that instance's read-only currentAction field to CREATING. If an autohealing
policy is attached to the group, after the VM is created and running, the MIG
sets the instance's current action to VERIFYING and the health checker
begins to probe the VM's application. If the application passes this initial
health check within the time that it takes for the application to start, then
the VM is verified and the MIG changes the VM's currentAction field to NONE.
To check the current actions on VMs, see View current actions on VMs.
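As a rough illustration of the lifecycle described above, the following Python sketch lists the VMs whose currentAction is still something other than NONE, for example CREATING or VERIFYING. The helper name `pending_vms` is hypothetical, and the input is assumed to be the `managedInstances` array from a listManagedInstances response like the one shown earlier.

```python
def pending_vms(managed_instances: list) -> dict:
    """Return {vm_name: currentAction} for VMs that are not yet settled.

    A VM is treated as settled when its currentAction is NONE, which is the
    value the MIG sets after the VM is verified.
    """
    return {
        vm["name"]: vm["currentAction"]
        for vm in managed_instances
        if vm["currentAction"] != "NONE"
    }


instances = [
    {"name": "igm-with-hc-fvz6", "currentAction": "NONE"},
    {"name": "igm-with-hc-gtz3", "currentAction": "VERIFYING"},
]
print(pending_vms(instances))  # {'igm-with-hc-gtz3': 'VERIFYING'}
```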
Check whether the MIG is stable
At the group level, Compute Engine populates a read-only field called
status 
that contains an isStable flag.
If all VMs in the group are running and healthy (that is, the
currentAction 
field for each managed instance is set to NONE), then the MIG sets the
status.isStable field to true. Remember that the stability of a MIG depends
on group configurations beyond the autohealing policy; for example, if your
group is autoscaled, and if it is being scaled in or out, then the MIG sets
the status.isStable field to false due to the autoscaler operation.
To check the values of your MIG's status.isStable field, see
Check whether a MIG is stable.
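If you want to wait for a group to settle in a script, one possible approach is to poll the status.isStable field. The following Python sketch assumes a caller-supplied `fetch_mig` function (hypothetical) that returns the MIG resource as a dictionary, for example the parsed JSON body of a GET request for the instance group manager. The function names, retry count, and delay are illustrative choices, not recommended values.

```python
import time
from typing import Callable


def is_stable(mig: dict) -> bool:
    """Read status.isStable from a MIG resource; a missing flag counts as not stable."""
    return bool(mig.get("status", {}).get("isStable", False))


def wait_until_stable(
    fetch_mig: Callable[[], dict],
    attempts: int = 10,
    delay_seconds: float = 5.0,
) -> bool:
    """Poll fetch_mig until the MIG reports isStable, or give up after `attempts` tries."""
    for attempt in range(attempts):
        if is_stable(fetch_mig()):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # Wait before the next poll.
    return False
```

Because an autoscaler operation can also set status.isStable to false, a timeout from a loop like this doesn't necessarily indicate a problem with autohealing.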
View historical autohealing operations
You can use the gcloud CLI or REST to view past autohealing events.
gcloud
Use the gcloud compute operations list 
command with a
filter 
to see only the autohealing repair events in your project.
gcloud compute operations list --filter='operationType~compute.instances.repair.*'
For more information about a specific repair operation, use the
describe 
command. For example:
gcloud compute operations describe repair-1539070348818-577c6bd6cf650-9752b3f3-1d6945e5 --zone us-east1-b
REST
For regional MIGs, submit a GET request to the
regionOperations 
resource and include a filter to scope the output list to
compute.instances.repair.* events.
GET https://compute.googleapis.com/compute/v1/projects/project-id/regions/region/operations?filter=operationType+%3D+%22compute.instances.repair.*%22
For zonal MIGs, use the
zoneOperations 
resource.
GET https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/operations?filter=operationType+%3D+%22compute.instances.repair.*%22
For more information about a specific repair operation, submit a GET
request for that specific operation. For example:
GET https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/operations/repair-1539070348818-577c6bd6cf650-9752b3f3-1d6945e5
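The filter in these URLs is the expression operationType = "compute.instances.repair.*" in URL-encoded form. As a minimal sketch, the following Python code builds the same list URLs with the standard library. The function name `repair_operations_url` and its parameters are hypothetical, and sending the request with valid credentials is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://compute.googleapis.com/compute/v1"
REPAIR_FILTER = 'operationType = "compute.instances.repair.*"'


def repair_operations_url(project: str, location: str, regional: bool = False) -> str:
    """Build the operations list URL, scoped to repair events.

    `location` is a region name when regional=True, otherwise a zone name.
    """
    scope = "regions" if regional else "zones"
    # safe="*" keeps the wildcard unescaped, matching the URLs shown above.
    query = urlencode({"filter": REPAIR_FILTER}, safe="*")
    return f"{BASE_URL}/projects/{project}/{scope}/{location}/operations?{query}"


print(repair_operations_url("project-id", "zone"))
# https://compute.googleapis.com/compute/v1/projects/project-id/zones/zone/operations?filter=operationType+%3D+%22compute.instances.repair.*%22
```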
What makes a good autohealing health check
Health checks used for autohealing should be conservative so they don't preemptively delete and recreate your instances. When an autohealer health check is too aggressive, the autohealer might mistake busy instances for failed instances and unnecessarily restart them, reducing availability.
- unhealthy-threshold. Should be more than 1. Ideally, set this value to 3 or more. This protects against rare failures like a network packet loss.
- healthy-threshold. A value of 2 is sufficient for most apps.
- timeout. Set this time value to a generous amount (five times or more than the expected response time). This protects against unexpected delays like busy instances or a slow network connection.
- check-interval. This value should be between 1 second and two times the timeout (not too long nor too short). When a value is too long, a failed instance is not caught soon enough. When a value is too short, the instances and the network can become measurably busy, given the high number of health check probes being sent every second.
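The guidelines above can be expressed as a small checker. The following Python sketch is only an illustration: the `HealthCheckSettings` class, the `review` function, and the `expected_response_sec` parameter are hypothetical names, and the thresholds are taken directly from the list above.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckSettings:
    unhealthy_threshold: int
    healthy_threshold: int
    timeout_sec: float
    check_interval_sec: float


def review(settings: HealthCheckSettings, expected_response_sec: float) -> list:
    """Return warnings for settings that fall outside the guidelines above."""
    warnings = []
    if settings.unhealthy_threshold <= 1:
        warnings.append("unhealthy-threshold should be more than 1, ideally 3 or more")
    if settings.timeout_sec < 5 * expected_response_sec:
        warnings.append("timeout should be at least five times the expected response time")
    if not 1 <= settings.check_interval_sec <= 2 * settings.timeout_sec:
        warnings.append("check-interval should be between 1 second and two times the timeout")
    return warnings


# Example: an app that usually responds in about 1 second.
conservative = HealthCheckSettings(
    unhealthy_threshold=3, healthy_threshold=2, timeout_sec=5, check_interval_sec=10
)
print(review(conservative, expected_response_sec=1.0))  # []
```

The sketch doesn't warn about healthy-threshold, because the guidance above only notes that a value of 2 is sufficient for most apps.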
What's next
- Try the tutorial, Using autohealing for highly available apps.
- Monitor VM health state changes.
- Apply configuration updates during repairs.
- Turn on repairs or autohealing, if you've turned off autohealing.