Use the data lineage remote MCP server

This document shows you how to use the data lineage remote Model Context Protocol (MCP) server to connect with AI applications, including Gemini CLI, ChatGPT, Claude, and custom applications that you are developing. The data lineage remote MCP server lets you query data lineage graphs, discover upstream data provenance, and analyze downstream impact. The Data Lineage API remote MCP server is enabled when you enable the Data Lineage API.

Model Context Protocol (MCP) standardizes how large language models (LLMs) and AI applications or agents connect to external data sources. MCP servers let you use their tools, resources, and prompts to take actions and get updated data from their backend service.

What's the difference between local and remote MCP servers?

Local MCP servers
Typically run on your local machine and use the standard input and output streams (stdio) for communication between services on the same device.
Remote MCP servers
Run on the service's infrastructure and offer an HTTP endpoint to AI applications for communication between the AI MCP client and the MCP server. For more information about MCP architecture, see MCP architecture.

Google and Google Cloud remote MCP servers

Google and Google Cloud remote MCP servers have the following features and benefits:

  • Simplified, centralized discovery
  • Managed global or regional HTTP endpoints
  • Fine-grained authorization
  • Optional prompt and response security with Model Armor protection
  • Centralized audit logging

For information about other MCP servers, and about the security and governance controls available for Google Cloud MCP servers, see Google Cloud MCP servers overview.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Data Lineage API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the API

Required roles

To get the permissions that you need to use the data lineage MCP server, ask your administrator to grant you the following IAM roles on the project where you want to use the data lineage MCP server:

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to use the data lineage MCP server. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to use the data lineage MCP server:

  • Make MCP tool calls: mcp.tools.call
  • Search for data lineage links: datalineage.locations.searchLinks

You might also be able to get these permissions with custom roles or other predefined roles.

Authentication and authorization

The Data Lineage API remote MCP server uses the OAuth 2.0 protocol with Identity and Access Management (IAM) for authentication and authorization. All Google Cloud identities are supported for authentication to MCP servers.

We recommend that you create a separate identity for agents that are using MCP tools so that access to resources can be controlled and monitored. For more information about authentication, see Authenticate to MCP servers.
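
As a minimal sketch of how a custom application might authenticate, the following Python code attaches an OAuth 2.0 access token to a request as a Bearer token. It assumes that the gcloud CLI is installed and signed in, and that the server accepts the token in a standard Authorization header, as Google APIs generally do. The helper names gcloud_access_token and authorized_request are hypothetical and exist only for this illustration; an agent in production would more likely use a dedicated identity and a client library.

```python
import json
import subprocess
import urllib.request


def gcloud_access_token() -> str:
    """Return an access token for the active gcloud account.

    Assumption: the gcloud CLI is installed and already authenticated.
    """
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def authorized_request(url: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request that carries a Bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: standard OAuth 2.0 Bearer token usage.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

To send a request, pass the result of authorized_request to urllib.request.urlopen, using gcloud_access_token() or another credential source for the token argument.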

Data lineage MCP OAuth scopes

OAuth 2.0 uses scopes and credentials to determine if an authenticated principal is authorized to take a specific action on a resource. For more information about OAuth 2.0 scopes at Google, read Using OAuth 2.0 to access Google APIs.

Data lineage has the following MCP tool OAuth scopes:

Scope URI for gcloud CLI                                  Description
https://www.googleapis.com/auth/datalineage.readonly      Only allows access to read data.
https://www.googleapis.com/auth/datalineage.read-write    Allows access to read and modify data.

Additional scopes might be required on the resources accessed during a tool call. To view a list of scopes required for data lineage, see Data Lineage API.

Configure an MCP client to use the data lineage MCP server

AI applications and agents, such as Claude or Gemini CLI, can instantiate an MCP client that connects to a single MCP server. An AI application can have multiple clients that connect to different MCP servers. To connect to a remote MCP server, the MCP client must know the remote MCP server's URL.

In your AI application, look for a way to connect to a remote MCP server. You are prompted to enter details about the server, such as its name and URL.

For the data lineage MCP server, enter the following as required:

  • Server name: data lineage MCP server
  • Server URL or Endpoint:
    • Global endpoint: https://datalineage.googleapis.com/mcp
    • Regional endpoints: https://REGION-datalineage.googleapis.com/mcp. Replace REGION with one of the supported regions.
  • Transport: HTTP
  • Authentication details: Depending on how you want to authenticate, you can enter your Google Cloud credentials, your OAuth Client ID and secret, or an agent identity and credentials. For more information about authentication, see Authenticate to MCP servers.
  • OAuth scope: the OAuth 2.0 scope that you want to use when connecting to the data lineage MCP server.
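
The connection details above can be collected in code before they are handed to an MCP client. The following Python sketch builds the global or a regional endpoint URL and bundles it with the other settings. The dictionary key names are illustrative assumptions, because each MCP client defines its own configuration format, and us-central1 is only an example value that might not be a supported region.

```python
from typing import Optional

GLOBAL_HOST = "datalineage.googleapis.com"
READONLY_SCOPE = "https://www.googleapis.com/auth/datalineage.readonly"


def mcp_endpoint(region: Optional[str] = None) -> str:
    """Return the global endpoint, or a regional endpoint when a region is given."""
    host = f"{region}-{GLOBAL_HOST}" if region else GLOBAL_HOST
    return f"https://{host}/mcp"


def client_settings(region: Optional[str] = None, scope: str = READONLY_SCOPE) -> dict:
    """Bundle the connection details for the data lineage MCP server.

    The key names are hypothetical; map them to whatever your MCP client expects.
    """
    return {
        "name": "data lineage MCP server",
        "url": mcp_endpoint(region),
        "transport": "http",
        "oauth_scope": scope,
    }


if __name__ == "__main__":
    print(client_settings())                      # global endpoint
    print(client_settings(region="us-central1"))  # example region only
```

Defaulting to the read-only scope keeps the sketch on the least-privileged option; pass the read-write scope explicitly only if the client needs to modify data.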

For host-specific guidance about setting up and connecting to MCP servers, see the following:

For more general guidance, see the following resources:

Available tools

To view details of available MCP tools and their descriptions for the data lineage MCP server, see the data lineage MCP reference.

List tools

Use the MCP inspector to list tools, or send a tools/list HTTP request directly to the data lineage remote MCP server. The tools/list method doesn't require authentication.

POST /mcp HTTP/1.1
Host: datalineage.googleapis.com
Content-Type: application/json

{
  "method": "tools/list",
  "jsonrpc": "2.0",
  "id": 1
}
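
The same tools/list request can be sent from a script. The following is a minimal Python sketch that uses only the standard library. The function names are hypothetical, and the Accept header is an addition that is not part of the request shown above: it is included on the assumption that the server follows the MCP Streamable HTTP transport, where clients advertise both JSON and event-stream responses.

```python
import json
import urllib.request

MCP_URL = "https://datalineage.googleapis.com/mcp"


def build_tools_list_request(url: str = MCP_URL, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/list request."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: advertised per the MCP Streamable HTTP transport.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def list_tools(url: str = MCP_URL) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_tools_list_request(url), timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling print(list_tools()) sends the request to the global endpoint. The response body is returned unparsed because, depending on the transport, it might be plain JSON or an event stream.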

Example use cases

Example use cases for the data lineage MCP server include:

  • Discovering all upstream data sources and transformation processes that feed into a specific data asset to verify data origin and accuracy.
  • Analyzing the impact of broken, stalled, or delayed data pipelines on downstream data consumers.

Sample prompts

  • "In my project my-analytics-project, I have a dataset sales_data with a table called monthly_reports. Tell me all the data assets and transformation processes that feed data into this table."
  • "I have a BigQuery job that writes into the hr_dataset.salary table. I see the job has been failing to run for 12 hours now. Can you tell me which downstream assets will have stale data because of this issue?"
  • "Go through the monthly_reports table in sales_data dataset and my-analytics-project project to find all the columns that have upstream data sources, and give me all the processes that feed into these columns."
  • "Search for lineage links connected to table finance.employment_costs to understand its upstream dependencies."

Optional security and safety configurations

MCP introduces new security risks and considerations because of the wide variety of actions that you can perform with MCP tools. To minimize and manage these risks, Google Cloud offers default settings and customizable policies to control the use of MCP tools in your Google Cloud organization or project.

For more information about MCP security and governance, see AI security and safety.

Control MCP use with IAM deny policies

Identity and Access Management (IAM) deny policies help you secure Google Cloud remote MCP servers. Configure these policies to block unwanted MCP tool access.

For example, you can deny or allow access based on:

  • The principal
  • Tool properties like read-only
  • The application's OAuth client ID

For more information, see Control MCP use with Identity and Access Management.

What's next