You've probably asked questions like "What does this column name mean?", "Who owns this broken dataset?", or "Is this table approved for use?" Some data catalogs use unstructured tags to add this information, but tags quickly become outdated or inconsistent. Knowledge Catalog (formerly Dataplex Universal Catalog) avoids this issue by letting you attach structured, schema-driven metadata and clear business definitions directly to your data assets. This approach helps you build programmatic governance at scale.
This tutorial shows you how to get started with data governance in Knowledge Catalog. Designed for data engineers, database administrators, and data architects, this tutorial walks through manual UI steps to help you build a strong mental model before you automate these workflows. It clarifies the relationships between key Knowledge Catalog concepts. By the end, you'll know how to make your data discoverable and trustworthy.
Objectives
In this tutorial, you learn how to:
- Create a single source of truth for your business terms with a business glossary.
- Structure and organize your metadata with aspect types.
- Attach metadata to your assets with aspects.
- Use Knowledge Catalog Search to find exactly what you need with this new structured metadata.
Before you begin
Before you begin, do the following:
- Select a Google Cloud project for this tutorial.
- Confirm that billing is enabled for your project.
Set up your environment
This tutorial uses Cloud Shell, a command-line environment that runs in the cloud.
From the Google Cloud console, click Activate Cloud Shell in the top right toolbar. It takes a few moments to provision and connect to the environment.
In Cloud Shell, set your PROJECT_ID and LOCATION variables so that all future commands target your specific Google Cloud project.
export PROJECT_ID=$(gcloud config get-value project)
gcloud config set project $PROJECT_ID
export LOCATION="us-central1"
Enable the necessary Google Cloud services.
gcloud services enable \
  dataplex.googleapis.com \
  bigquery.googleapis.com \
  datacatalog.googleapis.com
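If no default project is configured, `gcloud config get-value project` can come back empty, and later commands would then build a malformed dataset ID from `$PROJECT_ID`. A minimal sketch of a guard you could run before continuing; the `require_var` helper is a hypothetical name for illustration, not part of the gcloud CLI:

```shell
# Hypothetical helper: fail fast when a required variable is empty or unset.
require_var() {
  # Look up the value of the variable whose name is passed as $1.
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "ERROR: $1 is not set" >&2
    return 1
  fi
  echo "$1=$_value"
}

# Check both tutorial variables; print a hint if either one is missing.
require_var PROJECT_ID && require_var LOCATION \
  || echo "Set PROJECT_ID and LOCATION before continuing."
```

The guard only reads the variables, so it is safe to re-run whenever your Cloud Shell session restarts.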
Create a BigQuery dataset and prepare sample data
Use the following code to create a BigQuery dataset and load some sample CSV transactions into a table. After you create the table, Knowledge Catalog automatically discovers it and creates an entry for it in the catalog.
Think of an entry as Knowledge Catalog's representation of a data asset. It's like a record in the catalog that you can attach governance metadata to. Instead of governing the BigQuery table directly, you govern its entry in Knowledge Catalog.
# Create the BigQuery Dataset in the us-central1 region
bq --location=$LOCATION mk --dataset \
--description "Retail data for governance codelab" \
$PROJECT_ID:retail_data
# Create a temporary CSV file with the sample data
echo "transaction_id,user_email,gmv,transaction_date
1001,test@example.com,150.50,2025-08-28
1002,user@example.com,75.00,2025-08-28" > /tmp/transactions.csv
# Load the data from the temporary CSV file into a BigQuery table
bq load \
--source_format=CSV \
--autodetect \
retail_data.transactions \
/tmp/transactions.csv
# (Optional) Clean up the temporary file
rm /tmp/transactions.csv
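Because `--autodetect` infers the schema from the file itself, a row with a missing or extra field may cause the load to fail or produce an unexpected schema. The following sketch checks that every row has as many comma-separated fields as the header before you load it. The `check_csv` helper and the scratch file are illustrative, and the check assumes a simple CSV with no quoted commas, like the sample data above:

```shell
# Hypothetical helper: verify that each row has as many fields as the header.
# Assumes a simple CSV with no quoted commas.
check_csv() {
  awk -F',' '
    NR == 1 { expected = NF; next }
    NF != expected {
      printf "line %d: expected %d fields, found %d\n", NR, expected, NF
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}

# Recreate the sample data in a scratch file and check it.
SAMPLE_CSV="$(mktemp)"
printf '%s\n' \
  "transaction_id,user_email,gmv,transaction_date" \
  "1001,test@example.com,150.50,2025-08-28" \
  "1002,user@example.com,75.00,2025-08-28" > "$SAMPLE_CSV"

if check_csv "$SAMPLE_CSV"; then
  echo "CSV looks consistent"
fi
rm -f "$SAMPLE_CSV"
```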
Run a SELECT query to verify your setup:
bq query --nouse_legacy_sql "SELECT * FROM retail_data.transactions"
Example output:
+----------------+------------------+-------+------------------+
| transaction_id | user_email       | gmv   | transaction_date |
+----------------+------------------+-------+------------------+
| 1001           | test@example.com | 150.5 | 2025-08-28       |
| 1002           | user@example.com | 75.0  | 2025-08-28       |
+----------------+------------------+-------+------------------+
Establish common terms with a business glossary
Good governance relies on clear definitions. For example, a developer shouldn't have to guess if a column named gmv means Gross Merchandise Value or whether it includes taxes or returns. A business glossary solves this by creating a single source of truth that decouples business definitions from technical details. This ensures that terms like Gross Merchandise Value mean the same thing to everyone, from the Sales team to Finance.
Follow these steps to create a glossary and define your first term:
In the Google Cloud console, go to the Knowledge Catalog Glossaries page.
Click Create Business Glossary.
Enter the following details:
- Display name: Retail Business Glossary
- Location: us-central1 (Iowa)
Click Create.
Click Create Category.
Name the category Sales Metrics, and click Create.
Select the Sales Metrics category and click Add term.
Name the term Gross Merchandise Value and click Create.
Click the Gross Merchandise Value term to open its details page.
Click Add next to Overview. Enter the following details:
The total value of merchandise sold over a given period of time before the deduction of any fees or expenses. This is a key indicator of e-commerce business growth.
Click Save.
You have now created a glossary term that you can link to data assets across your organization.
Define technical metadata with an aspect type
If you need to track who owns a particular data asset, key-value tags aren't enough. You don't want one table tagged owner:bob and another contact:alice@example.com. You want a structured schema that requires owner information to be in a valid email format.
To meet this need, Knowledge Catalog supports aspect types. An aspect type is like a blueprint for your metadata that lets you set clear rules and required fields. This ensures that any metadata you add later stays organized.
In the Google Cloud console, go to the Knowledge Catalog Aspect types tab on the Metadata types page.
On the Custom tab, click Create.
Enter the following details:
- Display name: Data Asset Governance
- Location: us-central1 (Iowa)
In the Template section, click Add Field to create the following three fields:
Field 1:
- Display name: Data Steward
- Type: Text
- Is Required: Select the checkbox.
- Text type: Plain text
Field 2 (click Add field):
- Display name: Data Sensitivity
- Type: Enum
- Is Required: Leave optional.
- Values: Add Public, Internal, and Confidential
Field 3 (click Add a field):
- Display name: Last Review Date
- Is Required: Leave optional.
- Type: Date and time
Click Save.
You now have an aspect type for governance-related metadata fields like data steward, sensitivity level, and review date. In the next section, you apply this schema to a table entry by attaching an aspect with specific values for these fields.
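When you later automate this step, the same aspect type can be described as a JSON metadata template. The sketch below writes such a template and prints, rather than runs, a `gcloud dataplex aspect-types create` command, so it doesn't collide with the aspect type you created in the console. The field names (`data-steward`, `data-sensitivity`, `last-review-date`), the aspect type ID `data-asset-governance`, and the file path are assumptions for illustration. The template keys and the `--metadata-template-file-name` flag are assumed to match the Dataplex metadata template format and the current gcloud release, so verify both against the reference before use:

```shell
# Write a metadata template for the aspect type.
# The keys below are assumptions to verify against the current reference.
TEMPLATE_FILE="${TMPDIR:-/tmp}/data_asset_governance_template.json"
cat > "$TEMPLATE_FILE" <<'EOF'
{
  "name": "data-asset-governance",
  "type": "record",
  "recordFields": [
    {
      "name": "data-steward",
      "type": "string",
      "index": 1,
      "annotations": { "displayName": "Data Steward" },
      "constraints": { "required": true }
    },
    {
      "name": "data-sensitivity",
      "type": "enum",
      "index": 2,
      "annotations": { "displayName": "Data Sensitivity" },
      "enumValues": [
        { "name": "Public", "index": 1 },
        { "name": "Internal", "index": 2 },
        { "name": "Confidential", "index": 3 }
      ]
    },
    {
      "name": "last-review-date",
      "type": "datetime",
      "index": 3,
      "annotations": { "displayName": "Last Review Date" }
    }
  ]
}
EOF

# Print the command for review instead of running it.
echo "gcloud dataplex aspect-types create data-asset-governance \\"
echo "  --location=\$LOCATION \\"
echo "  --metadata-template-file-name=$TEMPLATE_FILE"
```

Keeping the template in a file makes it easy to store in version control next to the rest of your governance configuration.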
Enrich an entry with governance metadata
Column names are often abbreviated or ambiguous. Linking a column to a term in your business glossary provides a clear and consistent definition. In this step, you enrich the entry for the retail_data.transactions table by linking the Gross Merchandise Value term to a column named gmv and using your aspect type to attach an aspect to the table entry.
Link a column to a business term
To clarify what the gmv column in retail_data.transactions is, link it to your Gross Merchandise Value term.
In the Google Cloud console, go to the Knowledge Catalog Search page.
Click Filters to open the Filters panel.
For Scope, select Current Project.
Search for retail_data.transactions and click the returned transactions table.
Click the Schema tab.
Select the checkbox next to the gmv column, and click Add business term.
Select Gross Merchandise Value.
Attach an aspect to the table entry
In addition to linking business terms to columns, you can attach an aspect to a table entry to capture table-level governance metadata, such as data ownership and sensitivity.
An aspect is an instance of an aspect type, containing specific values for metadata fields. When you attach an aspect to an entry, Knowledge Catalog checks the information you provide against the schema defined in the aspect type to ensure consistency.
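To make the idea of checking values against a schema concrete, here is a small local sketch. It is not how Knowledge Catalog validates aspects. The `validate_aspect` function and its rules are assumptions chosen to mirror the Data Asset Governance fields: a required steward that roughly looks like an email address, and an optional sensitivity that must be one of the enum values when set.

```shell
# Hypothetical local check that mirrors the Data Asset Governance fields.
# Usage: validate_aspect STEWARD [SENSITIVITY]
validate_aspect() {
  steward="$1"
  sensitivity="${2:-}"

  # Data Steward is required and should roughly look like an email address.
  case "$steward" in
    "") echo "invalid: Data Steward is required"; return 1 ;;
    *@*.*) ;;
    *) echo "invalid: Data Steward must be an email address"; return 1 ;;
  esac

  # Data Sensitivity is optional, but must be one of the enum values if set.
  case "$sensitivity" in
    ""|Public|Internal|Confidential) ;;
    *) echo "invalid: unknown Data Sensitivity '$sensitivity'"; return 1 ;;
  esac

  echo "valid"
}

validate_aspect "finance-team@example.com" "Internal"   # prints: valid
validate_aspect "bob" "Internal" || true                # prints the email error
```

The point of the sketch is the shape of the check: the rules live in one place (the aspect type), and every value is tested against them before it is accepted.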
To define ownership and sensitivity for the retail_data.transactions table, attach the Data Asset Governance aspect:
On the Details tab of the retail_data.transactions entry page, click Add next to Optional aspects.
Select Data Asset Governance from the list.
Enter values in the fields:
- Data Steward: finance-team@example.com
- Data Sensitivity: Select Internal.
- Last Review Date: Select today's date.
Click Save.
You've now set up a solid foundation for data governance in Knowledge Catalog.
Search for entries using enriched metadata
You've enriched the retail_data.transactions entry by linking a column to a business term and attaching an aspect. Now you can use Knowledge Catalog Search to find entries based on this business context. For example, you can find all assets with a specific sensitivity level, or search for your glossary term to discover the underlying tables.
In the Google Cloud console, go to the Knowledge Catalog Search page.
Click Filters to open the Filters panel.
For Scope, select Current Project.
In the search bar, enter the following:
Find tables where the Data Asset Governance aspect has Internal sensitivity.
You should see your retail_data.transactions table in the list of results.
Clear the search bar and enter the following:
Find tables with the Gross Merchandise Value term attached.
You should again see the retail_data.transactions table in the results, because its gmv column is directly linked to this business term.
Clean up
To avoid incurring charges, delete the resources that you created in this tutorial.
Delete the sample dataset
To delete the sample BigQuery dataset and all its tables, use the following command. This action is irreversible.
# Re-run these exports if your Cloud Shell session timed out
export PROJECT_ID=$(gcloud config get-value project)
# Manually type this command to confirm you are deleting the correct dataset
bq rm -r -f --dataset $PROJECT_ID:retail_data
Delete Knowledge Catalog artifacts
In the Google Cloud console, go to the Knowledge Catalog Aspect types tab on the Metadata types page.
Select the data_asset_governance aspect type and click Delete.
In the Google Cloud console, go to the Knowledge Catalog Glossaries page.
Select the Gross Merchandise Value term and click Delete.
Select the Sales Metrics category and click Delete.
Select the Retail Business Glossary and click Delete.
What's next
- Manage business glossaries: Learn more about establishing a standardized vocabulary for your data in Manage a business glossary.
- Enrich metadata context: Learn more about adding meaningful context using aspects in Manage aspects and enrich metadata.
- Automate aspect attachment: Attach aspects to new datasets with Cloud Run functions or Cloud Build.
- Governance as code: Manage schemas in version control using the Google Cloud Terraform provider.