Manage your AlloyDB resources using Dataplex Universal Catalog

You can search for and manage your AlloyDB for PostgreSQL resources using Dataplex Universal Catalog, which is a platform for storing, managing, and accessing your metadata. You can use Dataplex Universal Catalog to discover and understand your AlloyDB metadata. Use Dataplex Universal Catalog to help you with the following:

  • Analysis, including dependencies and suitability for a use case
  • Change management
  • Schema evolution

Dataplex Universal Catalog automatically retrieves the following metadata from AlloyDB:

Resource type Metadata retrieved
Clusters, databases, and instances
  • Name
  • Location (region)
Schema, columns, tables, views
  • Name
  • Location (region)
  • Description
  • Schema

Types of metadata discovery

Metadata discovery, in the context of Dataplex Universal Catalog, is the automated process where Dataplex Universal Catalog scans connected data sources—like AlloyDB—to identify data assets (like tables, views, and databases) and extract their technical metadata (like schemas, descriptions, and locations) into its catalog.

This process runs periodically to keep the catalog synchronized with the source systems. For AlloyDB, cluster and instance metadata is discovered by default, while database, schema, table, and view metadata is discovered only when you enable the integration on the cluster.

Dataplex Universal Catalog supports keyword and natural language searches.

  • Keyword search lets you find resources using specific keywords, filters, and a defined syntax. For example, you might enter system=AlloyDB AND type=Database to view all AlloyDB databases.
  • Natural language search (Preview) uses AI to understand semantic queries. It lets you find resources using everyday language, eliminating the need for complex syntax. For example, you can enter queries like List all AlloyDB tables related to sales.

For more information, see Search syntax for Dataplex Universal Catalog.

Example: Discovering an AlloyDB database schema

To understand the metadata discovery process, consider the example of an AlloyDB cluster named user-data-cluster with Dataplex Universal Catalog integration enabled. Inside this cluster's primary instance, you have a database named profiles which contains a table customers (id, email, signup_date) and a view active_users (id, email).

AlloyDB is synced periodically to Dataplex Universal Catalog, the following resources are available:

  • Clusters
  • Columns
  • Databases
  • Instances
  • Tables
  • Views

After discovery is complete, you can search for these assets—customers table and active_users—in the Dataplex Universal Catalog page of the Google Cloud console or using the API, view their schemas, and enrich them with additional business or operational metadata.

Enrich metadata using aspects

Aspect types are reusable resources that you can use as templates for aspects. Aspect types help you avoid duplication of work and incomplete aspects. You can use Dataplex Universal Catalog to create the aspect types that you need.

After you create custom aspect types, you can attach aspects to your AlloyDB assets. Attaching aspects to AlloyDB assets lets you do the following:

  • Add business metadata to the assets
  • Search for assets by business metadata and other custom metadata

To learn more about creating aspect types and attaching aspects to AlloyDB, see Manage aspects and enrich metadata.

Metadata discovery workflow

By default, cluster and instance resources are ingested and discoverable. You only need to enable the Dataplex Universal Catalog integration for your AlloyDB cluster if you want to ingest and discover database, schema, table and view resources.

  1. To discover AlloyDB database, schema, table and view resources, you must enable the Dataplex Universal Catalog integration for your AlloyDB cluster using the AlloyDB API when you create a new cluster or when you update an existing cluster. For more information, see Create a cluster and its primary instance.

  2. Search for your AlloyDB assets using the Dataplex Universal Catalog search page in the Google Cloud console. For more information, see Search for resources in Dataplex Universal Catalog.

  3. To add more context to your AlloyDB assets, you can attach aspects to your assets. Aspects are custom metadata entries that help with data governance and analysis. For example, you can use an aspect to label a column that contains personally identifiable information (PII). For more information, see Manage aspects and enrich metadata.

  4. You can now navigate through your AlloyDB resources following this hierarchy: Cluster > Database > Schema > Table or View > Column. This provides a detailed view of your data landscape. For more information, see Search for resources in Dataplex Universal Catalog.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  5. Verify that billing is enabled for your Google Cloud project.

  6. Enable the Dataplex API for the project.

    Enable the Dataplex API

  7. Check permissions.

    You need certain Identity and Access Management (IAM) roles and permissions to search for and attach entries to AlloyDB assets. For details, see Required IAM roles and permissions for Dataplex Universal Catalog.

Required IAM roles

This section describes the roles and permissions required to search for resources and to access the search results.

Required roles for accessing search results

To search for and view AlloyDB metadata in Dataplex Universal Catalog, principals must have permissions to view AlloyDB resources, including the dataplex.projects.search permission.

To grant principals—such as users, groups, or service accounts—these permissions, assign them the AlloyDB Viewer (roles/alloydb.viewer) IAM role on the project that contains the AlloyDB resources.

Dataplex Universal Catalog operation AlloyDB resource Roles or permissions required
Search for AlloyDB resources Cluster alloydb.clusters.get
Columns alloydb.databases.get
Instance alloydb.instances.get
Database alloydb.databases.get
Tables alloydb.databases.get
Views alloydb.databases.get

If you don't want to use the predefined AlloyDB Viewer role, then you can assign a custom role that includes alloydb.clusters.get, alloydb.instances.get, and alloydb.databases.get permissions.

For more information about granting roles, see Manage access. For more information about AlloyDB IAM roles, see IAM roles and permissions for AlloyDB.

Required roles for searching entries

To search for entries, you need at least one of the following IAM roles on the project that is used for search:

Permissions on search results are checked independently of the selected project. For more information, see Search for data assets with Dataplex Universal Catalog.

Enable the integration of Dataplex Universal Catalog on your AlloyDB cluster

By default, cluster and instance resources are ingested and discoverable. You only need to enable the Dataplex Universal Catalog integration for your AlloyDB cluster if you want to ingest and discover database, schema, table and view resources.

To ingest and make your database, schema, table and view resources discoverable, create or update the cluster using the AlloyDB API. Turning on this integration enables the syncing of data plane metadata database, schema, table and view resources.

To integrate with Dataplex Universal Catalog, you use the cluster-level dataplexConfig flag. To view data for secondary clusters, you must enable this flag separately. To activate the export of AlloyDB metadata—such as databases, schemas, and tables—to Cloud Storage buckets, set enabled to true. This data is then available for use by Dataplex Universal Catalog. This feature is only applicable to primary instances.

Enable Dataplex Universal Catalog integration when you create a cluster

To create an AlloyDB cluster with Dataplex Universal Catalog integration enabled, use the POST method and the following URL. For a complete list of parameters for this call, see Method: projects.locations.clusters.create. For information about cluster settings, see View cluster and instance settings.

Don't include sensitive or personally identifiable information in your cluster ID because it's externally visible. You don't need to include the project ID in the cluster name. This is done automatically where appropriate, for example, in the log files.

To send your request, save the request body in a file named cluster_request.json:

{
  "databaseVersion": "DATABASE_VERSION",
  "initialUser": {
    "user": "INITIAL_USERNAME",
    "password": "INITIAL_USER_PASSWORD"
  },
  "labels": {
    "test": "alloydb-cluster-full"
  },
  "dataplexConfig" : {
    "enabled" : true
  }
}

Make the following replacements:

  • DATABASE_VERSION: enum string of the database version—for example, POSTGRES_16.
  • INITIAL_USERNAME: the username that you use for the default postgres user.
  • INITIAL_USER_PASSWORD: the password that you use for the default postgres user.

Use the following HTTP method and URL:

POST https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters?cluster_id=CLUSTER_ID

Make the following replacements:

  • PROJECT_ID: the ID or project number of the Google Cloud project that contains the cluster.
  • LOCATION_ID: the ID of the cluster's region.
  • CLUSTER_ID: the ID of the cluster that you create. The cluster ID must begin with a lowercase letter and can contain lowercase letters, numbers, and hyphens.

To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json and execute the following command:

    curl -X POST \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         -H "Content-Type: application/json; charset=utf-8" \
         -d @request.json \
        "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters?cluster_id=CLUSTER_ID"
      

PowerShell (Windows)

Save the request body in a file named request.json and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/clusters?cluster_id=CLUSTER_ID"| Select-Object -Expand Content

You receive a JSON response similar to the following:

Response

{
  "kind": "alloydb#operation",
  "targetLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID",
  "status": "PENDING",
  "user": "user@example.com",
  "insertTime": "2024-09-25T22:19:33.735Z",
  "operationType": "CREATE",
  "name": "OPERATION_ID",
  "targetId": "CLUSTER_ID",
  "selfLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/operations/OPERATION_ID",
  "targetProject": "PROJECT_ID"
}

Enable the Dataplex Universal Catalog integration for your AlloyDB cluster

To update specific fields of an existing cluster, use the PATCH method on the projects.locations.clusters resource.

Use the following HTTP method and URL:

PATCH https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID

Make the following replacements:

  • PROJECT_ID: the ID or project number of the Google Cloud project that contains the cluster.
  • CLUSTER_ID: the name of the cluster.

Request JSON body:

{
  "dataplexConfig": {
    "enabled": true
  }
}

To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json and execute the following command:

    curl -X PATCH \
         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
         -H "Content-Type: application/json; charset=utf-8" \
         -d @request.json \
        "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"
      

PowerShell (Windows)

Save the request body in a file named request.json and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method PATCH `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"| Select-Object -Expand Content

You receive a JSON response similar to the following:

Response

{
  "kind": "alloydb#operation",
  "targetLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID",
  "status": "PENDING",
  "user": "user@example.com",
  "insertTime": "2024-09-25T22:19:33.735Z",
  "operationType": "UPDATE",
  "name": "OPERATION_ID",
  "targetId": "CLUSTER_ID",
  "selfLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/operations/OPERATION_ID",
  "targetProject": "PROJECT_ID"
}

Verify Dataplex Universal Catalog integration on your AlloyDB cluster

To verify that the Dataplex Universal Catalog integration is enabled on an existing cluster, use the GET method.

Use the following HTTP method and URL:

GET https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID

Make the following replacements:

  • PROJECT_ID: the ID or project number of the Google Cloud project that contains the cluster.
  • CLUSTER_ID: the cluster ID.

To send your request, use one of the following options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json and execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"
      

PowerShell (Windows)

Save the request body in a file named request.json and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"| Select-Object -Expand Content

You receive a JSON response for the entire Cluster proto. Look for the dataplexConfig message inside the proto:

{
"dataplexConfig":
  {
  "enabled": true
  }
}

Deactivate the integration of Dataplex Universal Catalog on your AlloyDB cluster

Use the following example to deactivate the integration. When you disable this feature, only the cluster and instance metadata is available.

Use the following HTTP method and URL:

PATCH https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID

Make the following replacements:

  • PROJECT_ID: the ID or project number of the Google Cloud project that contains the cluster.
  • CLUSTER_ID: the cluster ID.

Request JSON body:

{
  "dataplexConfig": {
    "enabled": false
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json and execute the following command:

curl -X PATCH \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"
      

PowerShell (Windows)

Save the request body in a file named request.json and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method PATCH `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID"| Select-Object -Expand Content

You receive a JSON response similar to the following:

Response

{
  "kind": "alloydb#operation",
  "targetLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/clusters/CLUSTER_ID",
  "status": "PENDING",
  "user": "user@example.com",
  "insertTime": "2024-01-16T02:32:12.281Z",
  "operationType": "UPDATE",
  "name": "OPERATION_ID",
  "targetId": "CLUSTER_ID",
  "selfLink": "https://alloydb.googleapis.com/v1/projects/PROJECT_ID/operations/OPERATION_ID",
  "targetProject": "PROJECT_ID"
}

Enrich AlloyDB assets with aspects

Aspect types are reusable resources that you can use as templates for aspects. Aspect types help you avoid duplication of work and incomplete aspects. You can use Dataplex Universal Catalog to create the aspect types that you need.

After you create custom aspect types, you can attach aspects to your AlloyDB assets. Attaching aspects to AlloyDB assets lets you do the following:

  • Add business metadata to the assets
  • Search for assets by business metadata and other custom metadata

To learn more about creating aspect types and attaching aspects to AlloyDB, see Manage aspects and enrich metadata.

Search for AlloyDB assets

Use the Dataplex Universal Catalog search page in the Google Cloud console to search for AlloyDB assets.

  1. Go to the Dataplex Universal Catalog Search page.

    Go to Search

  2. In Choose search platform, select Dataplex Universal Catalog.

  3. In the Filters panel, click Systems, and then select AlloyDB.

  4. Optional. In Type aliases, you can filter the search results to a specific type of AlloyDB asset by the selecting one or more of the following type alias:

    • Database
    • Database schema
    • Cluster
    • Instance
    • Service
    • Table
    • View

You can use the search field in Dataplex Universal Catalog to perform keyword search queries. For example, you might enter system=AlloyDB AND type=Database to view all AlloyDB databases.

For more information, see Search syntax for Dataplex Universal Catalog.

To view all AlloyDB assets, enter system=AlloyDB.

Then you can enter specific keywords. For example, to view all AlloyDB databases:

system=AlloyDB AND type=Database

To view all AlloyDB tables, enter the following query:

system=AlloyDB AND type=Table

You can also use parentheses and the logical operators AND and OR for complex expressions. To learn more about the expressions that you can use in the search field, see Search syntax for Dataplex Universal Catalog.

You can directly enter search queries for specific AlloyDB assets into the search field. The format of the query string is as follows:

type="projects/dataplex-types/locations/global/entryTypes/QUERY_STRING"

Replace the following:

  • QUERY_STRING: use the following list to identify a query string based on the type of AlloyDB asset that you want to query:

    • alloydb-database
    • alloydb-cluster
    • alloydb-instance
    • alloydb-table
    • alloydb-schema
    • alloydb-view

An example query might look like the following:

type="projects/1234567890/locations/global/entryTypes/alloydb-cluster"

Search by aspect type

Dataplex Universal Catalog includes a few built-in aspect types that you can use to perform searches.

To search by aspect type, follow these steps:

  1. In the Aspects panel, click the Add more aspect types menu.
  2. Optional. Select AlloyDB Access to filter the results by AlloyDB dialect.
  3. Select one or more of the following aspect types to limit the search results to that type.
    • AlloyDB Database
    • AlloyDB Cluster
    • AlloyDB Instance
    • AlloyDB View
    • AlloyDB Schema
    • AlloyDB Table
  4. Click OK.
  5. In the results table, click the name of the asset to view the metadata for that asset.
  6. Optional: Enhance or view your assets. You can do any of the following:
    • To add a rich text description of the asset, in Overview, click Add .
    • To attach an aspect to the asset, in Aspects, click Add .
    • To view member databases for an instance, click the Entry List tab, and then click Show all children entries in search. If the Entry List tab doesn't appear, then the cluster has no databases.
    • In Entry details, view the full details of the asset. Click the entry name to drill down to additional entries.

Natural Language search in AlloyDB

Natural language search (Preview) uses AI to understand semantic queries. It lets you find resources using everyday language, eliminating the need for complex syntax. For example, you can enter queries like List all AlloyDB tables related to sales.

For more information, see Search syntax for Dataplex Universal Catalog.

Syntax search in AlloyDB

Keyword search lets you find resources using specific keywords, filters, and a defined syntax. For example, you might enter system=AlloyDB AND type=Database to view all AlloyDB databases.

For more information, see Search syntax for Dataplex Universal Catalog.

Example workflow - Drill down from cluster to columns

To drill down from cluster to columns, follow these steps:

  1. Go to the Dataplex Universal Catalog Search page.

    Go to Search

  2. For the Choose search platform option, select Dataplex Universal Catalog.

  3. In the Filters panel, select Systems and then AlloyDB. Alternatively, enter system=AlloyDB in the search field.

  4. Select a cluster name.

  5. On the AlloyDB details page, click the Entry list tab, and then click Show all children entries in search. Dataplex Universal Catalog displays the databases in the cluster.

  6. On the AlloyDB database details page, click the Entry list tab, and then click View child entries in search. Dataplex Universal Catalog displays the tables in the cluster.

  7. Select a table name, and then on the AlloyDB table details page, click Schema to view the table columns.

  8. Optional: To add an aspect type to a column, click the Add aspect button.

Pricing

There is no charge for storing AlloyDB technical metadata in Dataplex Universal Catalog. Standard Dataplex Universal Catalog pricing applies for API calls and additional business metadata enrichment. For more information, see the Dataplex Universal Catalog pricing page.

Resource usage

Dataplex Universal Catalog periodically extracts data from your clusters. The extraction process requires a certain amount of CPU usage. Clusters with smaller machine types—for example, machines that share larger schemas (10,000 or more tables)—can use a small amount of their CPU during the extraction process.

Limitations

  • A maximum of one million tables per database is allowed.
  • It can take up to 48 hours for updates to your database to be reflected in Dataplex Universal Catalog.
  • The extraction process typically takes a few minutes, but time to completion depends on the CPU size and the table count.
  • When an instance doesn't exist or is not running, it can take 7 days for metadata to be deleted.

What's next