When working with data, you've probably asked questions like "What does this column name mean?", "Who owns this broken dataset?", or "Is this table approved for use?" Metadata tags try to answer these questions, but they quickly become outdated or inconsistent. Knowledge Catalog (formerly Dataplex Universal Catalog) solves this by letting you attach structured metadata and clear business definitions directly to data assets. Providing clear data context grounds AI agents and builds a foundation of trust for every user who interacts with the data.
This tutorial shows you how to establish data context in Knowledge Catalog. Designed for users such as data stewards and business analysts, this tutorial walks you through UI-based steps to build standard business terms and context before you automate these workflows. The tutorial clarifies relationships between key Knowledge Catalog concepts. By the end, you'll know how to make your data discoverable and trustworthy.
Objectives
In this tutorial, you learn how to:
- Create a single source of truth for business terms with a business glossary.
- Structure and organize metadata with aspect types.
- Attach metadata to data assets with aspects.
- Find exactly what you need in Knowledge Catalog Search by using this new structured metadata.
Before you begin
Before you begin, do the following:
- Select a Google Cloud project for this tutorial.
- Confirm that billing is enabled for your project.
Set up your environment
This tutorial uses Cloud Shell, a command-line environment that runs in the cloud.
From the Google Cloud console, click Activate Cloud Shell in the top right toolbar. It takes a few moments to provision and connect to the environment.
In Cloud Shell, set your PROJECT_ID and LOCATION variables so that all future commands target your specific Google Cloud project.
export PROJECT_ID=$(gcloud config get-value project)
gcloud config set project $PROJECT_ID
export LOCATION="us-central1"
Enable the necessary Google Cloud services.
gcloud services enable \
  dataplex.googleapis.com \
  bigquery.googleapis.com \
  datacatalog.googleapis.com
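To confirm that the required services are active before you continue, a check along the following lines may help. This is a minimal sketch, not part of the official setup: `missing_services` is a hypothetical helper that reads enabled service names from standard input and prints any of the three required APIs that are absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a gcloud feature): reads enabled service names,
# one per line, on stdin and prints each required service that is missing.
missing_services() {
  local enabled required
  enabled="$(cat)"
  for required in dataplex.googleapis.com bigquery.googleapis.com datacatalog.googleapis.com; do
    # -x matches the whole line; -F treats the service name as a fixed string.
    if ! printf '%s\n' "$enabled" | grep -qxF "$required"; then
      printf '%s\n' "$required"
    fi
  done
}

# Example usage (requires gcloud; not run here):
#   gcloud services list --enabled --format="value(config.name)" | missing_services
```

If the command prints nothing, all three services appear in the enabled list. Any name that it prints still needs to be enabled.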
Create a BigQuery dataset and prepare sample data
Use the following code to create a BigQuery dataset and load some sample CSV transactions into a table. After you create the table, Knowledge Catalog discovers it and creates an entry for it in the catalog.
Think of an entry as Knowledge Catalog's representation of a data asset. It's like a record in the catalog that you can attach metadata to. Instead of adding context to (or enriching) the BigQuery table directly, you add it to its entry in Knowledge Catalog.
# Create the BigQuery Dataset in the us-central1 region
bq --location=$LOCATION mk --dataset \
--description "Sample retail data for foundational data context tutorial" \
$PROJECT_ID:retail_data
# Create a temporary CSV file with the sample data
echo "transaction_id,user_email,gmv,transaction_date
1001,test@example.com,150.50,2025-08-28
1002,user@example.com,75.00,2025-08-28" > /tmp/transactions.csv
# Load the data from the temporary CSV file into a BigQuery table
bq load \
--source_format=CSV \
--autodetect \
retail_data.transactions \
/tmp/transactions.csv
# (Optional) Clean up the temporary file
rm /tmp/transactions.csv
Run a SELECT query to verify your setup:
bq query --nouse_legacy_sql "SELECT * FROM retail_data.transactions"
Example output:
+----------------+------------------+-------+------------------+
| transaction_id | user_email       | gmv   | transaction_date |
+----------------+------------------+-------+------------------+
| 1001           | test@example.com | 150.5 | 2025-08-28       |
| 1002           | user@example.com | 75.0  | 2025-08-28       |
+----------------+------------------+-------+------------------+
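If you prefer a scripted check over reading the table by eye, the same verification can be sketched as follows. `check_transactions_csv` is a hypothetical helper. It assumes that the query output is CSV with a header line followed by one line per row, and it succeeds only when the header matches the sample columns and the row count matches the number you pass in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads CSV query output on stdin (header line first).
# Succeeds only if the header matches the sample table's columns and the
# number of non-empty data rows equals the expected count given as $1.
check_transactions_csv() {
  local expected_header="transaction_id,user_email,gmv,transaction_date"
  local expected_rows="$1"
  local csv header rows
  csv="$(cat)"
  header="$(printf '%s\n' "$csv" | awk 'NR == 1')"
  rows="$(printf '%s\n' "$csv" | awk 'NR > 1 && NF { n++ } END { print n + 0 }')"
  [ "$header" = "$expected_header" ] && [ "$rows" -eq "$expected_rows" ]
}

# Example usage (requires the bq CLI and the table created above; not run here):
#   bq --quiet --format=csv query --nouse_legacy_sql \
#     "SELECT * FROM retail_data.transactions" \
#     | check_transactions_csv 2 && echo "setup OK"
```

The `--quiet` flag is used in the example so that job status messages don't mix with the CSV output.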
Establish common terms with a business glossary
Good data context relies on clear definitions. For example, a developer shouldn't have to guess whether a column named gmv means Gross Merchandise Value or whether it includes taxes and returns. A business glossary creates a single source of truth for these definitions across your organization. When teammates or AI agents analyze your data, they inherit this precise business context. Shared definitions align metrics across teams such as Finance, Sales, and Operations, and help AI agents avoid hallucinations.
Follow these steps to create a glossary and define your first term:
In the Google Cloud console, go to the Knowledge Catalog Glossaries page.
Click Create Business Glossary.
Enter the following details:
- Display name: Retail Business Glossary
- Location: us-central1 (Iowa)
Click Create.
Click Create Category.
Name the category Sales Metrics, and click Create.
Select the Sales Metrics category and click Add term.
Name the term Gross Merchandise Value and click Create.
Click the Gross Merchandise Value term to open its details page.
Click Add next to Overview. Enter the following details:
The total value of merchandise sold over a given period of time before the deduction of any fees or expenses. This is a key indicator of e-commerce business growth.
Click Save.
You have now created a glossary term that you can link to data entries across your organization.
Define technical metadata with an aspect type
When you use unstructured metadata tags, you often end up with inconsistent catalog entries. For example, one table might be tagged owner:bob and another steward:alice@example.com. To keep your metadata organized at scale, you need a consistent schema.
That's where aspect types come in. An aspect type is a metadata blueprint that lets you set clear rules and required fields. Requiring standard fields like valid email addresses for data stewards lets downstream scripts validate and protect your metadata automatically.
Follow these steps to create an aspect type:
In the Google Cloud console, go to the Knowledge Catalog Aspect types tab on the Metadata types page.
On the Custom tab, click Create.
Enter the following details:
- Display name: Data Asset Context
- Location: us-central1 (Iowa)
In the Template section, click Add field to create the following three fields:
Field 1:
- Display name: Data Steward
- Type: Text
- Is Required: Select the checkbox.
- Text type: Plain text
Field 2 (click Add field):
- Display name: Data Sensitivity
- Type: Enum
- Is Required: Leave optional.
- Values: Add Public, Internal, and Confidential
Field 3 (click Add field):
- Display name: Last Review Date
- Is Required: Leave optional.
- Type: Date and time
Click Save.
You now have an aspect type for data governance-related metadata fields like data steward, sensitivity level, and review date. In the next section, you apply this schema to a table entry by attaching an aspect with specific values for these fields.
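The rules that this aspect type encodes can also be mirrored in a downstream script. The following is a minimal local sketch under stated assumptions: both function names are hypothetical, nothing here calls Knowledge Catalog, and the email pattern is a stricter convention layered on top of the plain-text Data Steward field, not something the aspect type enforces.

```shell
#!/usr/bin/env bash
# Hypothetical validators that mirror the Data Asset Context fields.
# They run locally and do not call any Google Cloud API.

# Data Steward: required. By assumption, the value should look like an
# email address (the aspect type itself only requires plain text).
is_valid_steward() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
}

# Data Sensitivity: optional, so an empty value passes. A non-empty value
# must be one of the three enum values defined in the aspect type.
is_valid_sensitivity() {
  case "$1" in
    ""|Public|Internal|Confidential) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_valid_steward "finance-team@example.com"` succeeds, while `is_valid_steward "bob"` and `is_valid_sensitivity "Secret"` both fail.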
Enrich an entry with business and technical context
Column names are often abbreviated or ambiguous. Linking a column to a term in your business glossary provides a clear and consistent definition. In this step, you enrich the entry for the retail_data.transactions table by linking the Gross Merchandise Value term to a column named gmv and attaching an aspect to the table entry using your aspect type.
Link a column to a business term
To clarify what the gmv column in retail_data.transactions is, link it to your Gross Merchandise Value term.
In the Google Cloud console, go to the Knowledge Catalog Search page.
Click Filters to open the Filters panel.
For Scope, select Current Project.
Search for retail_data.transactions and click the returned transactions table.
Click the Schema tab.
Select the checkbox next to the gmv column, and click Add business term.
Select Gross Merchandise Value.
Attach an aspect to the table entry
In addition to linking business terms to columns, you can attach an aspect to a table entry to capture table-level metadata, such as data ownership and sensitivity.
An aspect is an instance of an aspect type, with specific values for metadata fields. When you attach an aspect to an entry, Knowledge Catalog checks the information you provide against the schema defined in the aspect type to ensure consistency.
To define ownership and sensitivity for the retail_data.transactions table, attach the Data Asset Context aspect:
On the Details tab of the retail_data.transactions entry page, click Add next to Optional aspects.
Select Data Asset Context from the list.
Enter values in the fields:
- Data Steward: finance-team@example.com
- Data Sensitivity: Select Internal.
- Last Review Date: Select today's date.
Click Save.
By enriching your sample retail transaction data, you've set up a solid foundation of data context in Knowledge Catalog.
Search for entries using enriched metadata
You can now use Knowledge Catalog Search to find entries based on the business context that you set up. For example, you can find all assets with a specific sensitivity level, or search for your glossary term to discover the underlying tables.
In the Google Cloud console, go to the Knowledge Catalog Search page.
Click Filters to open the Filters panel.
For Scope, select Current Project.
In the search bar, enter the following:
Find tables where the Data Asset Context aspect has Internal sensitivity.
You should see your retail_data.transactions table in the list of results.
Clear the search bar and enter the following:
Find tables with the Gross Merchandise Value term attached.
You should again see the retail_data.transactions table in the results, as its gmv column is directly linked to this business term.
When you connect an AI agent to Knowledge Catalog, it inherits this enriched metadata automatically. For example, when you ask an agent to retrieve internal sales metrics, it reads the Data Sensitivity aspect (which you set to Internal) and the linked Gross Merchandise Value glossary term. This shared context helps the agent verify its data sources, respect access policies, and avoid hallucinations.
Clean up
To avoid incurring charges, delete the resources that you created in this tutorial.
Delete the sample dataset
To delete the sample BigQuery dataset and all its tables, use the following command. This action is irreversible.
# Re-run this export if your Cloud Shell session timed out
export PROJECT_ID=$(gcloud config get-value project)
# Manually type this command to confirm you are deleting the correct dataset
bq rm -r -f --dataset $PROJECT_ID:retail_data
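Because the `-f` flag skips the confirmation prompt, you might want a guard in front of the delete command. The following is one possible sketch, not a feature of the bq tool: `confirm_dataset` is a hypothetical helper that succeeds only when the name typed on standard input exactly matches the dataset you are about to delete.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds only if the line read from stdin exactly
# matches the dataset name passed as $1. The prompt goes to stderr so that
# it does not mix with any captured output.
confirm_dataset() {
  local expected="$1" typed
  printf 'Type "%s" to confirm deletion: ' "$expected" >&2
  IFS= read -r typed || true   # on end of input, typed stays empty
  [ "$typed" = "$expected" ]
}

# Example usage (requires the bq CLI; not run here):
#   confirm_dataset "$PROJECT_ID:retail_data" \
#     && bq rm -r -f --dataset "$PROJECT_ID:retail_data"
```

With this guard, a mistyped name or an empty PROJECT_ID variable stops the command before anything is deleted.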
Delete Knowledge Catalog artifacts
In the Google Cloud console, go to the Knowledge Catalog Aspect types tab on the Metadata types page.
Select the Data Asset Context aspect type and click Delete.
In the Google Cloud console, go to the Knowledge Catalog Glossaries page.
Select the Gross Merchandise Value term and click Delete.
Select the Sales Metrics category and click Delete.
Select the Retail Business Glossary and click Delete.
What's next
To learn more about catalog curation and building agents with Knowledge Catalog, see the following resources:
- Manage aspects and enrich metadata: Learn how to define custom schemas and attach structured metadata in Manage aspects and enrich metadata.
- Manage business glossaries: Learn how to establish a standardized vocabulary for your organization in Manage a business glossary.
- Govern with Terraform: Learn how to provision custom aspect types and glossaries using Terraform.
- Work with glossary terms at scale: Perform bulk metadata enrichment using JSON files in About importing and exporting glossaries and entry links.
- Enrich metadata with agents: Build an AI agent to extract context and enrich your data assets in Build an agent to enrich your metadata.
- Explore more use cases: Discover additional hands-on workflows and scenarios in Use cases.