This document answers frequently asked questions about Knowledge Catalog (formerly Dataplex Universal Catalog).
For more information about Knowledge Catalog, see Knowledge Catalog overview.
What is Knowledge Catalog?
Google Knowledge Catalog is an intelligent governance solution for data and AI assets in Google Cloud. It provides a centralized inventory where you can discover, manage, and govern your data across Google Cloud data sources like BigQuery, Cloud Storage, Pub/Sub, and Spanner. It uses AI to automate data discovery, metadata enrichment, and data quality. Through its governed data catalog, Knowledge Catalog provides the essential grounding that AI agents need to generate high-quality content.
What is Data Catalog?
Data Catalog was the original name of Google Cloud's metadata service. It later evolved into Dataplex Universal Catalog, which has now been renamed Knowledge Catalog.
While the term "data catalog" is still used generically to describe this type of data indexing, in the context of Google Cloud it refers to the legacy product. We recommend that all new projects use Knowledge Catalog to take advantage of AI-powered features and enhanced governance.
Is Knowledge Catalog different from Data Catalog?
Yes. Knowledge Catalog is the AI-powered data governance platform that replaces Data Catalog. While the two share similar concepts, Knowledge Catalog provides several enhancements:
AI-powered context: Unlike Data Catalog, Knowledge Catalog uses Gemini to automatically extract business context, generate natural language descriptions, and provide SQL "golden queries" to ground AI agents.
Rich metadata support: Knowledge Catalog supports more complex metadata types, such as nested arrays, maps, and records.
Agentic access: AI agents can discover and adaptively use Knowledge Catalog tools through a local or remote MCP server.
Data discovery: Knowledge Catalog can auto-ingest metadata from a larger set of Google Cloud services and external data sources.
Governance at scale: It offers enhanced capabilities for data profiling, automatic data quality, and centralized governance.
What is Knowledge Catalog used for?
Google Knowledge Catalog solves the "data cold start" problem: the time wasted trying to find, understand, and trust data before you can actually use it. Its primary uses include the following:
Accelerated data discovery: Instead of navigating complex organizational silos to locate data, you can use natural language search (for example, "Show me the most recent customer churn data") to find assets across Google Cloud resources instantly, increasing productivity for data consumers.
Grounding AI agents: It acts as the source of truth for generative AI workloads, including agents built with the Agent Development Kit (ADK). By linking physical data to business definitions, it ensures that AI agents (like those built on Vertex AI) use high-quality data, which significantly reduces AI hallucinations and improves trust in AI-generated insights.
Automated data governance: It automatically scans your data to identify sensitive information (like PII), tracks where data comes from (lineage), and monitors its accuracy (auto data quality). These capabilities help improve data trust, security, and compliance with less manual effort.
Discovering "dark data": It can scan unstructured files (like PDFs or images in Cloud Storage), extract the information inside, and make it searchable and queryable in BigQuery, which helps you unlock insights from previously inaccessible data.
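To make the discovery workflow concrete, here is a toy in-memory sketch in plain Python. It is illustrative only: the asset names and the `search` helper are invented for this example, and the real service interprets full natural-language queries rather than matching keywords.

```python
# Toy keyword search over a mock catalog index; illustrative only.
# The real service uses semantic, natural-language search.
catalog = [
    {"name": "customer_churn_2024", "description": "Monthly customer churn data"},
    {"name": "orders_raw", "description": "Unprocessed order events"},
    {"name": "churn_model_features", "description": "Features for churn prediction"},
]

def search(query: str) -> list[str]:
    """Return asset names whose metadata mentions every query term."""
    terms = query.lower().split()
    return [
        asset["name"]
        for asset in catalog
        if all(t in (asset["name"] + " " + asset["description"]).lower() for t in terms)
    ]

print(search("customer churn"))  # finds only the assets describing customer churn
```

The point of the sketch is the shape of the interaction: the consumer expresses intent once, and the catalog resolves it against metadata from many sources instead of the consumer browsing each silo.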
For hands-on use cases, see Explore Knowledge Catalog.
What types of metadata does Knowledge Catalog store?
Knowledge Catalog stores three types of metadata:
Technical metadata: Automatically harvested schemas, table names, and system properties.
Business metadata: User-defined context such as business descriptions, glossary terms, and ownership.
Runtime metadata: Information about data lineage, data quality scores, and data profiling statistics.
How do I migrate from Data Catalog?
The transition to Knowledge Catalog is designed to be seamless, with no manual data movement required. Depending on your current usage, the process involves two main phases:
Preparatory phase: If you have custom metadata (tags, tag templates, or custom entries), that content is automatically surfaced in Knowledge Catalog as read-only. During this phase, you complete configuration tasks so that your existing Data Catalog content is available in both interfaces at the same time.
Transfer phase: Once prepared, you transfer the active state of your metadata to make it read-write within Knowledge Catalog. This step should be coordinated with updating any programmatic workloads (APIs, client libraries, or Terraform modules) to point to the new Knowledge Catalog endpoints.
If you have no custom metadata or if you are new to the platform, you can complete the transition by setting Knowledge Catalog as your default UI experience in the Google Cloud console.
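The endpoint-coordination step in the transfer phase can be sketched as follows. The legacy hostname shown is the real Data Catalog endpoint, but the Knowledge Catalog hostname is a hypothetical placeholder; consult the transition guide for the actual endpoint your workloads should use.

```python
# Illustrative sketch of coordinating a workload cutover.
# LEGACY_ENDPOINT is the real Data Catalog hostname; NEW_ENDPOINT is a
# hypothetical placeholder, not a documented Knowledge Catalog endpoint.
LEGACY_ENDPOINT = "datacatalog.googleapis.com"
NEW_ENDPOINT = "knowledgecatalog.example.com"  # placeholder only

def api_endpoint(transfer_complete: bool) -> str:
    """Keep workloads on the legacy API during the preparatory phase;
    switch them to the new endpoint only after the transfer phase."""
    return NEW_ENDPOINT if transfer_complete else LEGACY_ENDPOINT

print(api_endpoint(transfer_complete=False))  # prints datacatalog.googleapis.com
```

The key design point is that the endpoint switch is gated on the transfer having completed, so programmatic workloads never write through the new API while the metadata there is still read-only.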
For more information, see Transition from Data Catalog to Knowledge Catalog.