Google Cloud Observability storage overview

This document describes how Google Cloud Observability stores your telemetry data. It includes information about how Cloud Logging, Cloud Monitoring, and Cloud Trace store data. This document also provides a conceptual overview of observability buckets.

Log data

Log data resides in log buckets, which are the containers that Logging uses to store your log data. Every Google Cloud project, billing account, folder, and organization contains log buckets named _Required and _Default.

By default, log data resides in the Google Cloud project, billing account, folder, or organization where the data originates. However, you can configure Logging to route log data from the resource where it originates to another location, like another project or a centralized log bucket. To learn more, see Store log entries and Route log entries.

Cloud Logging lets you regionalize your log data:

  • Organization policies can restrict the locations of new log buckets and require that log buckets use customer-managed encryption keys (CMEK).
  • For organizations and folders, default resource settings for Cloud Logging let you configure the following:

    • The location of new _Required and _Default log buckets.
    • The Cloud KMS key used to encrypt your log data.
    • The configuration of the default sink.

    Descendants in the resource hierarchy automatically inherit these settings, unless they also configure default resource settings. For example, suppose that you configure default resource settings for Cloud Logging for an organization. Then all folders and projects in the organization's resource hierarchy automatically inherit those settings. However, if you set the default resource settings for Cloud Logging for a folder in that organization, then the folder-level settings are used.

    The default resource settings for Cloud Logging apply only to new resources, not existing resources. To learn more, see Configure default resource settings for Cloud Logging.
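The inheritance rule described above can be sketched as a walk up the resource hierarchy. This is a minimal illustration, not the Cloud Logging API: the `DefaultSettings` and `ResourceNode` classes, the `effective_settings` function, and all resource names and locations are hypothetical. It also assumes that the nearest configured ancestor supplies the settings as a whole, rather than field by field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefaultSettings:
    """Hypothetical stand-in for Cloud Logging default resource settings."""
    storage_location: str
    kms_key_name: Optional[str] = None


@dataclass
class ResourceNode:
    """A node in the resource hierarchy: organization, folder, or project."""
    name: str
    parent: Optional["ResourceNode"] = None
    settings: Optional[DefaultSettings] = None  # None means "not configured here"


def effective_settings(node: ResourceNode) -> Optional[DefaultSettings]:
    """Return the settings of the nearest node, starting at `node`, that has them."""
    current: Optional[ResourceNode] = node
    while current is not None:
        if current.settings is not None:
            return current.settings
        current = current.parent
    return None  # nothing configured anywhere in the chain


# Example hierarchy (all names and locations are placeholders).
org = ResourceNode("organizations/example-org",
                   settings=DefaultSettings("europe-west1"))
folder = ResourceNode("folders/example-folder", parent=org,
                      settings=DefaultSettings("us-central1"))
project_in_folder = ResourceNode("projects/example-a", parent=folder)
project_in_org = ResourceNode("projects/example-b", parent=org)

print(effective_settings(project_in_folder).storage_location)  # us-central1
print(effective_settings(project_in_org).storage_location)     # europe-west1
```

In this sketch, the project under the folder picks up the folder-level settings, while the project directly under the organization picks up the organization-level settings.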

Metric data

Metric data resides in the Google Cloud project where the data originates.

For information about the storage policies for data that is used by Monitoring, see Data residency for Monitoring.

Trace data

Trace data resides in an observability bucket named _Trace, which is in the same Google Cloud project where the data originates. For a description of observability buckets, see the Observability storage model section of this page. To learn about the storage format of individual spans, see Trace schema.

You can control where your trace data is stored and who manages encryption keys by configuring default settings for observability buckets. For example, you can use these settings for an organization to require that new observability buckets in the organization use CMEK. To learn more, see the Data residency for observability buckets section of this page.

Observability storage model

The Observability API storage model relies on the following architecture:

Observability buckets
An observability bucket is the management entity for datasets, which store data. An observability bucket is in a specific location and has a data retention policy. When a Google Cloud service uses the Observability API to store its data, the system creates an observability bucket based on the name of the service. For example, for the Cloud Trace service, the system-created bucket is named _Trace. To learn about the structure of an observability bucket, see Bucket.
Datasets
A dataset is a storage entity. Each dataset is a child of an observability bucket. When the system creates an observability bucket for a Google Cloud service, it also creates one dataset. For example, after the system creates the _Trace bucket, it creates the dataset named Spans. That dataset stores your trace data. To learn about the structure of a dataset, see Dataset.
Views on datasets
Each dataset hosts one or more views. A view provides read access to a subset of entries in the dataset. When a dataset is created, the system automatically creates one view. That view includes all data in the dataset. The name of the view depends on the service. For example, for the Cloud Trace service, the system creates a view named _AllSpans on the Spans dataset. To learn about the structure of a view, see View.
Links on datasets
Each dataset can contain at most one link. When you create a link for a dataset, the system creates a linked BigQuery dataset. You can then query the data in your dataset by using BigQuery or other services that use the BigQuery API. To learn about the structure of a link, see Link.

The system doesn't automatically create links on datasets.

For example, your trace data resides in a dataset named Spans. This dataset is a child of the observability bucket named _Trace. On the Spans dataset, the system creates the view named _AllSpans. This view includes all data in the dataset.
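The relationships among buckets, datasets, views, and links can be sketched as a small data model. This is an illustration only and doesn't reflect the Observability API schema: the class names, field names, and the `system_created_trace_bucket` helper are hypothetical, and the location and retention values are placeholders. Only the names _Trace, Spans, and _AllSpans, and the limit of one link per dataset, come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class View:
    name: str


@dataclass
class Link:
    name: str  # hypothetical identifier for the link


@dataclass
class Dataset:
    name: str
    views: List[View] = field(default_factory=list)
    link: Optional[Link] = None  # at most one link per dataset

    def create_link(self, link: Link) -> None:
        """Attach a link, rejecting a second link on the same dataset."""
        if self.link is not None:
            raise ValueError(f"dataset {self.name!r} already has a link")
        self.link = link


@dataclass
class ObservabilityBucket:
    name: str
    location: str        # placeholder value in this sketch
    retention_days: int  # placeholder value in this sketch
    datasets: List[Dataset] = field(default_factory=list)


def system_created_trace_bucket(location: str, retention_days: int) -> ObservabilityBucket:
    """Model what the system creates for Cloud Trace: bucket, dataset, and one view."""
    spans = Dataset(name="Spans", views=[View(name="_AllSpans")])
    return ObservabilityBucket(name="_Trace", location=location,
                               retention_days=retention_days, datasets=[spans])


bucket = system_created_trace_bucket(location="us", retention_days=30)
for dataset in bucket.datasets:
    view_names = [view.name for view in dataset.views]
    print(f"{bucket.name}/{dataset.name}: views={view_names}, "
          f"linked={dataset.link is not None}")
# prints: _Trace/Spans: views=['_AllSpans'], linked=False
```

The dataset starts without a link because the system doesn't create links automatically. In this model, calling `create_link` a second time on the same dataset raises an error.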

Data residency for observability buckets

If you have compliance or regulatory requirements to store your data in specific locations or to use CMEK, then we recommend that you configure default settings for observability buckets:

  • For organizations, folders, and projects, default settings for observability buckets let you configure the following:

    • A default storage location.
    • For each location, a default Cloud Key Management Service key.

    Descendants in the resource hierarchy automatically use these settings, except for descendants that have their own default settings configured.

    The default settings for observability buckets apply only to new resources, not to existing resources. To learn more, see Set defaults for observability buckets.

The default settings for observability buckets don't apply to log buckets, which store log data. To learn how to set the default location or require CMEK for log buckets, see Configure default resource settings for Cloud Logging.
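As a rough illustration of how a default location and per-location keys might combine, the following sketch picks the placement for a new observability bucket. The `BucketDefaults` class and `placement_for_new_bucket` function are hypothetical, the locations are placeholders, and the key name only follows the general Cloud KMS resource-name pattern with placeholder segments. The sketch assumes that a new bucket is created in the default location and encrypted with the key configured for that location, if one exists.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class BucketDefaults:
    """Hypothetical model of default settings for observability buckets."""
    default_location: str
    # Maps a location to the default Cloud KMS key for that location.
    kms_key_by_location: Dict[str, str] = field(default_factory=dict)


def placement_for_new_bucket(defaults: BucketDefaults) -> Tuple[str, Optional[str]]:
    """Return (location, kms_key) for a new bucket; kms_key is None if unset."""
    location = defaults.default_location
    return location, defaults.kms_key_by_location.get(location)


# Placeholder key name; every path segment is an example value.
EXAMPLE_KEY = "projects/example-project/locations/us/keyRings/example-ring/cryptoKeys/example-key"

with_key = BucketDefaults(default_location="us",
                          kms_key_by_location={"us": EXAMPLE_KEY})
without_key = BucketDefaults(default_location="eu",
                             kms_key_by_location={"us": EXAMPLE_KEY})

print(placement_for_new_bucket(with_key))     # location "us" with the example key
print(placement_for_new_bucket(without_key))  # location "eu" with no key
```

Because the defaults apply only to new resources, a model like this would be consulted when a bucket is created and would leave existing buckets unchanged.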

Limitations

You can't do the following:

  • Modify or delete observability buckets.
  • Create, delete, or modify datasets.
  • Create, delete, or modify views.
  • Use the Google Cloud console to list buckets, datasets, views, or links.

What's next