You can create a custom metric to raise alerts or provide information to troubleshoot problems with scheduled snapshots.
For example, to set up an alert for scheduled snapshot failures, use the following procedure:
- Create a custom query to capture scheduled snapshot events.
- Create a metric based on the query that counts scheduled snapshot failures.
- Create an alert policy to send an alert when there is a scheduled snapshot failure.
Before you begin
- If you haven't already, set up authentication.
Authentication verifies your identity for access to Google Cloud services and APIs. To run
code or samples from a local development environment, you can authenticate to
Compute Engine by selecting one of the following options:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Required roles and permissions
To get the permissions that you need to create a snapshot schedule, ask your administrator to grant you the following IAM roles on the project:
- Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1)
- To connect to a VM that can run as a service account: Service Account User (v1) (roles/iam.serviceAccountUser)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Create a custom query
To capture scheduled snapshot events, create a custom query in Logs Explorer.
In the Google Cloud console, go to the Logging > Logs Explorer page.
If the query editor isn't visible at the top of the page, click the Show query toggle.
Enter the following text in the query editor, replacing PROJECT_ID with your project ID:
resource.type="gce_disk"
logName="projects/PROJECT_ID/logs/cloudaudit.googleapis.com%2Fsystem_event"
protoPayload.methodName="ScheduledSnapshots"
severity>"INFO"
Click Run query.
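If you need the same query for several projects, you can generate it instead of editing PROJECT_ID by hand. The following is a minimal sketch that only builds the query text; the function name and the `my-project` project ID are placeholders chosen for illustration, not part of any Google Cloud library.

```python
def build_snapshot_failure_filter(project_id: str) -> str:
    """Return the scheduled snapshot query from this section for one project."""
    clauses = [
        'resource.type="gce_disk"',
        f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ]
    # Comparisons with no Boolean operator between them are combined with an implicit AND.
    return "\n".join(clauses)


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(build_snapshot_failure_filter("my-project"))
```

You can paste the printed text into the query editor, or pass it as the filter argument to `gcloud logging read` if you prefer to check the results from a terminal.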
Create a metric
After you create the custom query, create a metric that counts scheduled snapshot failures.
- At the top of the results table on the Logs Explorer page, click the Actions drop-down.
- Select Create metric.
In the Create log-based metric window, provide the following details:
- Metric type: Counter
- Log-based metric name: scheduled_snapshot_failure_count
- Description: count of scheduled snapshot failures
The Filter selection section is automatically populated with the query from the previous step.
Under Labels, click Add label and enter the following:
- Label name: status
- Description: status of scheduled snapshot request
- Label type: STRING
- Field name: protoPayload.response.status
Click Done.
Click Create Metric.
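The console settings above correspond roughly to a LogMetric resource in the Cloud Logging API. The following sketch builds that resource as a plain dictionary so you can review it or keep it in version control. It assumes that a Counter metric maps to a DELTA metric with INT64 values and that the label is extracted with an EXTRACT expression; the function name and the `my-project` project ID are placeholders.

```python
import json


def build_failure_metric(project_id: str) -> dict:
    """Build a LogMetric-style body mirroring the console settings in this section."""
    log_filter = " AND ".join([
        'resource.type="gce_disk"',
        f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2Fsystem_event"',
        'protoPayload.methodName="ScheduledSnapshots"',
        'severity>"INFO"',
    ])
    return {
        "name": "scheduled_snapshot_failure_count",
        "description": "count of scheduled snapshot failures",
        "filter": log_filter,
        # Assumption: a Counter metric is a DELTA metric with INT64 values.
        "metricDescriptor": {
            "metricKind": "DELTA",
            "valueType": "INT64",
            "labels": [{
                "key": "status",
                "valueType": "STRING",
                "description": "status of scheduled snapshot request",
            }],
        },
        # Assumption: the label value is copied from the log entry field with EXTRACT.
        "labelExtractors": {"status": "EXTRACT(protoPayload.response.status)"},
    }


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(json.dumps(build_failure_metric("my-project"), indent=2))
```

A body like this could be sent to the `projects.metrics.create` method of the Cloud Logging API. Compare the field names against the current API reference before you rely on them.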
Create an alert policy
After you create the metric, create an alert policy to send an alert when there's a scheduled snapshot failure.
In the Google Cloud console, go to the Cloud Logging > Log-based metrics page.
In the User-defined Metrics section, find your new metric named scheduled_snapshot_failure_count.
Click the More menu button in this row and select Create alert from metric.
The Create alerting policy page opens.
In the New condition tab, configure your alert signal:
Set the Rolling window to 5 minutes or your preferred interval.
For Rolling window function, select Sum.
Click Next.
In the Configure trigger tab, enter the following:
- Condition type: Threshold
- Alert trigger: Any time series violates
- Threshold position: Above threshold
- Threshold value: 0
  Setting Threshold value to 0 triggers an alert if any snapshot failure occurs. You can modify this value as your workload requires.
- Condition name: Snapshot failure threshold exceeded
Click Next.
In the Notifications and name tab, set your Alert policy name. Optionally, you can add notification channels and documentation for this policy.
Click Next.
Review your alert.
Click Create Policy.
To learn more about creating alert policies, see Create metric-threshold alerting policies.
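For reference, the alert settings described in this section can also be written as an AlertPolicy-style body for the Cloud Monitoring API. The following sketch shows one plausible mapping: a 5-minute rolling window with Sum becomes a 300-second alignment period with ALIGN_SUM, and "Above threshold" with a value of 0 becomes COMPARISON_GT. The policy display name, the function name, and the `gce_disk` resource type in the filter are assumptions made for illustration.

```python
import json

METRIC_NAME = "scheduled_snapshot_failure_count"


def build_alert_policy(window_seconds: int = 300, threshold: float = 0) -> dict:
    """Build an AlertPolicy-style body mirroring the console settings in this section."""
    metric_filter = (
        f'metric.type="logging.googleapis.com/user/{METRIC_NAME}" '
        'AND resource.type="gce_disk"'  # Assumption: the metric is written against gce_disk.
    )
    return {
        "displayName": "Scheduled snapshot failures",  # Hypothetical policy name.
        "combiner": "OR",
        "conditions": [{
            "displayName": "Snapshot failure threshold exceeded",
            "conditionThreshold": {
                "filter": metric_filter,
                # Rolling window and rolling window function (Sum).
                "aggregations": [{
                    "alignmentPeriod": f"{window_seconds}s",
                    "perSeriesAligner": "ALIGN_SUM",
                }],
                # Above threshold, with the threshold value from the console step.
                "comparison": "COMPARISON_GT",
                "thresholdValue": threshold,
                "duration": "0s",
                # Any time series violates.
                "trigger": {"count": 1},
            },
        }],
    }


if __name__ == "__main__":
    print(json.dumps(build_alert_policy(), indent=2))
```

Notification channels are omitted here because their identifiers are specific to your project. Add them to the body before you create the policy if you want the alert to notify anyone.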
What's next
- Learn about snapshot schedule frequencies, retention policies, and naming rules in About snapshot schedules for disks.
- Learn about disk snapshots.
- Learn how to create scheduled snapshots for disks.
- Learn how to view logs.
- Learn more about alerting.