Configure alerts for scheduled snapshots

Linux Windows

You can create a custom metric to raise alerts or provide information to troubleshoot problems with scheduled snapshots.

For example, to set up an alert for scheduled snapshot failures, use the following procedure:

1. Create a custom query to capture scheduled snapshot events.
2. Create a metric based on the query that counts scheduled snapshot failures.
3. Create an alert policy to send an alert when there is a scheduled snapshot failure.

Before you begin

If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, authenticate to Compute Engine using the option that matches how you plan to use the samples on this page:
Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  
  Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
2. Set a default region and zone.
REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles and permissions

To get the permissions that you need to create a snapshot schedule, ask your administrator to grant you the following IAM roles on the project:

Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1)
To connect to a VM that can run as a service account: Service Account User (roles/iam.serviceAccountUser)

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a custom query

To capture scheduled snapshot events, create a custom query in Logs Explorer.

In the Google Cloud console, go to the Logging > Logs Explorer page.

If the query editor isn't visible at the top of the page, click the Show query toggle.

Enter the following text in the query editor, replacing PROJECT_ID with your project ID:

resource.type="gce_disk"
logName="projects/PROJECT_ID/logs/cloudaudit.googleapis.com%2Fsystem_event"
protoPayload.methodName="ScheduledSnapshots"
severity>"INFO"

Click Run query.
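
If you prefer to generate the query text instead of typing it, the following minimal Python sketch assembles the same four clauses for a given project ID. The function name scheduled_snapshot_failure_filter and the example project ID my-project are hypothetical, chosen for illustration only. In the Logging query language, clauses on separate lines are joined with an implicit AND, so the sketch joins them with newlines.

```python
def scheduled_snapshot_failure_filter(project_id: str) -> str:
    """Return the Logs Explorer query from this section for one project.

    Hypothetical helper: it only builds the query string and does not
    call any Google Cloud API.
    """
    if not project_id:
        raise ValueError("project_id must be a non-empty string")
    clauses = [
        'resource.type="gce_disk"',
        f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ]
    # One clause per line; the query language treats them as ANDed together.
    return "\n".join(clauses)


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(scheduled_snapshot_failure_filter("my-project"))
```

You can paste the printed text into the query editor in place of the query shown above.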

Create a metric

After you create the custom query, create a metric that counts scheduled snapshot failures.

At the top of the results table on the Logs Explorer page, click the Actions drop-down.
Select Create metric.
In the Create log-based metric window, provide the following details:
- Metric type: Counter
- Log-based metric name: scheduled_snapshot_failure_count
- Description: count of scheduled snapshot failures
The Filter selection section is automatically populated with the query from the previous step.
Under Labels, click Add label and enter the following:
- Label name: status
- Description: status of scheduled snapshot request
- Label type: STRING
- Field name: protoPayload.response.status
Click Done.
Click Create Metric.
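
The console settings above correspond roughly to a LogMetric resource in the Cloud Logging API. The following Python sketch builds such a request body as a plain dictionary and prints it as JSON; it does not send any request. It assumes that the console's Counter metric type maps to a DELTA metric with INT64 values and that the label's field name maps to an EXTRACT label extractor. The function name build_log_metric_body and the project ID my-project are hypothetical.

```python
import json


def build_log_metric_body(project_id: str) -> dict:
    """Build a LogMetric-style request body mirroring the console settings.

    Hypothetical helper: the mapping from console fields to API fields is
    an assumption, and nothing is sent to Google Cloud.
    """
    log_filter = "\n".join([
        'resource.type="gce_disk"',
        f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ])
    return {
        "name": "scheduled_snapshot_failure_count",
        "description": "count of scheduled snapshot failures",
        "filter": log_filter,
        "metricDescriptor": {
            # Assumed equivalent of "Metric type: Counter" in the console.
            "metricKind": "DELTA",
            "valueType": "INT64",
            "labels": [
                {
                    "key": "status",
                    "valueType": "STRING",
                    "description": "status of scheduled snapshot request",
                }
            ],
        },
        # Assumed equivalent of "Field name" for the status label.
        "labelExtractors": {"status": "EXTRACT(protoPayload.response.status)"},
    }


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(json.dumps(build_log_metric_body("my-project"), indent=2))
```

Keeping the body as data makes it easy to review the metric definition or store it in version control before you create the metric.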

Create an alert policy

After you create the metric, create an alert policy to send an alert when there's a scheduled snapshot failure.

In the Google Cloud console, go to the Logging > Log-based metrics page.

In the User-defined Metrics section, find your new metric named scheduled_snapshot_failure_count.
Click the More menu button in this row and select Create alert from metric.

The Create alerting policy page opens.
In the New condition tab, configure your alert signal:
- Set the Rolling window to 5 minutes or your preferred interval.
- For Rolling window function, select Sum.

Click Next.
In the Configure trigger tab, enter the following:
1. Condition type: Threshold
2. Alert trigger: Any time series violates
3. Threshold position: Above threshold
4. Threshold value: 0
  
  Setting Threshold value to 0 triggers an alert if any snapshot failure occurs. You can modify this value as your workload requires.
5. Condition name: Snapshot failure threshold exceeded
Click Next.
In the Notifications and name tab, set your Alert policy name. Optionally, you can add notification channels and documentation for this policy.

Click Next.
Review your alert.
Click Create Policy.

To learn more about creating alert policies, see Create metric-threshold alerting policies.
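
To show how the console choices in this procedure fit together, the following Python sketch builds an AlertPolicy-style body for the Cloud Monitoring API as a dictionary and prints it as JSON. It does not create the policy. The mapping is an assumption: Rolling window becomes alignmentPeriod, Sum becomes ALIGN_SUM, Above threshold becomes COMPARISON_GT, and Any time series violates becomes a trigger count of 1. The resource.type="gce_disk" clause is also an assumption, based on the resource type used in the log query. The function name build_alert_policy_body and the policy name in the usage example are hypothetical.

```python
import json

# Name of the log-based metric created earlier on this page.
METRIC_NAME = "scheduled_snapshot_failure_count"


def build_alert_policy_body(
    policy_name: str,
    window_seconds: int = 300,
    threshold: float = 0,
) -> dict:
    """Build an AlertPolicy-style body mirroring the console settings.

    Hypothetical helper: nothing is sent to Google Cloud, and notification
    channels are left out because they are optional in the procedure.
    """
    condition_filter = (
        f'metric.type="logging.googleapis.com/user/{METRIC_NAME}" '
        'AND resource.type="gce_disk"'  # assumed monitored resource type
    )
    return {
        "displayName": policy_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Snapshot failure threshold exceeded",
                "conditionThreshold": {
                    "filter": condition_filter,
                    "aggregations": [
                        {
                            # Rolling window of 5 minutes, summed per series.
                            "alignmentPeriod": f"{window_seconds}s",
                            "perSeriesAligner": "ALIGN_SUM",
                        }
                    ],
                    # Above threshold: fire when the sum is greater than 0.
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    "duration": "0s",
                    # Any single violating time series triggers the alert.
                    "trigger": {"count": 1},
                },
            }
        ],
    }


if __name__ == "__main__":
    # The policy name below is a placeholder.
    print(json.dumps(build_alert_policy_body("Scheduled snapshot failures"), indent=2))
```

Raising the threshold argument in this sketch has the same effect as raising Threshold value in the console: the alert fires only when more failures than that number occur within one window.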

What's next

Learn about snapshot schedule frequencies, retention policies, and naming rules in About snapshot schedules for disks.
Learn about disk snapshots.
Learn how to create scheduled snapshots for disks.
Learn how to view logs.
Learn more about alerting.
