
Search multi-region lineage with server-side automation

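This document describes how to use the searchLineageStreaming API to find multi-level, cross-region data lineage.

Starting from a defined set of root entities, the searchLineageStreaming API performs a breadth-first search in the specified direction (upstream or downstream) and returns a unified lineage graph as a real-time streaming response.

Unlike the standard lineage lookup APIs, which can time out on large multi-project graphs, searchLineageStreaming returns its response in chunks as results are found. Use this API when you build tools that need to traverse wide, deep, or cross-region data architectures without request timeouts.

For more information, see the overview of multi-region lineage search.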

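Key features

The searchLineageStreaming API includes the following features:

Breadth-first search: traverses the lineage graph level by level and accurately computes the depth of each linked asset.
Streaming responses: returns subgraphs and lineage links as the backend discovers them. This approach is efficient for wide or deep lineage graphs and prevents request timeouts.
Multi-location and multi-project traversal: although you specify only one billing project in the request path, the API automatically discovers and traverses lineage links across multiple Google Cloud projects and locations, as long as you have the required permissions.
Fine-grained column-level lineage: supports searching column-level dependencies between assets.
Wildcard lookup: by specifying the * wildcard together with an entity's fully qualified name (FQN), you can retrieve all column-level lineage for that entity.
Pipeline insights: optionally retrieves metadata about the transformation pipelines (processes) that created the lineage links.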
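
The breadth-first traversal and depth calculation described above can be pictured with a small local sketch. This is only an illustration on an in-memory toy graph, not the service's implementation; the function name bfs_depths and the sample edge list are hypothetical.

```python
from collections import deque


def bfs_depths(edges, roots, max_depth):
    """Return {node: depth} for nodes reachable from roots within max_depth hops."""
    # Build an adjacency list: source -> targets (downstream direction).
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    depths = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # Do not expand past the depth limit.
        for neighbor in adjacency.get(node, []):
            if neighbor not in depths:  # The first visit is the shortest path.
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths


# Hypothetical toy lineage: raw feeds staging and report; report feeds dashboard.
edges = [
    ("raw", "staging"),
    ("staging", "report"),
    ("raw", "report"),
    ("report", "dashboard"),
]
depths = bfs_depths(edges, ["raw"], max_depth=2)
```

Because each node is recorded the first time it is reached, report gets depth 1 (through the direct raw link) even though a longer path through staging also exists.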

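Before you begin

Before you send requests to the API, make sure that you meet the following security and environment prerequisites:

Required roles

To get the permissions that you need to search data lineage links, ask your administrator to grant you the Data Lineage Viewer (roles/datalineage.viewer) IAM role on the projects where the lineage links and processes are stored. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to search data lineage links. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to search data lineage links:

Search entity-level lineage: datalineage.events.get on the projects where the links are stored
Search column-level lineage: datalineage.events.getFields on the projects where the links are stored
Retrieve full pipeline process details: datalineage.processes.get on the projects where the processes are stored

You might also be able to get these permissions with custom roles or other predefined roles.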

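Resource scoping

When you configure an API request, you must distinguish between the resource used for billing and the actual locations that the API scans:

Billing parent path: the parent path in the request URL must use the format projects/project/locations/location. This project-location pair is used only to evaluate billing quota and API rate limits.
Target locations: in the locations array in the request body, explicitly list the regions that you want the backend to scan.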
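
To make the distinction concrete, the following sketch keeps the billing parent and the scanned locations as separate inputs. The helper name build_search_request is hypothetical; the URL shape and the parent and locations field names follow the REST examples in this document.

```python
def build_search_request(billing_project, billing_location, scan_locations):
    """Return (url, partial_body) for a searchLineageStreaming call."""
    # The parent identifies the project-location pair used for billing
    # quota and rate limits; it appears in the URL and in the body.
    parent = f"projects/{billing_project}/locations/{billing_location}"
    url = f"https://datalineage.googleapis.com/v1/{parent}:searchLineageStreaming"
    # The locations array is independent of the parent: it lists the
    # regions that the backend should scan.
    body = {"parent": parent, "locations": list(scan_locations)}
    return url, body


url, body = build_search_request("my-billing-project", "us", ["us", "us-east1"])
```

The body returned here is only partial; a real request also needs rootCriteria and direction.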

Authentication setup

Initialize an environment variable with a Google Cloud access token to authenticate curl commands:

export ACCESS_TOKEN=$(gcloud auth print-access-token)

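Usage examples

The following examples use the endpoint datalineage.googleapis.com.

Search multi-level, multi-project lineage

To run a deep lineage search that traverses multiple levels of the graph and scans different Google Cloud projects, set the following fields:

Set limits.maxDepth to the target traversal depth (accepts values from 1 to 100).
Populate the locations array with the target regions that you want the backend to search across (for example, ["us", "us-east1"]).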
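
As a rough sketch of the first setting, a client can validate the depth locally before sending anything. The helper build_limits is hypothetical; the 1 to 100 range comes from the text above, and the optional maxResults field is taken from the REST request body shown later in this section.

```python
def build_limits(max_depth, max_results=None):
    """Return the limits block of a request body, validating maxDepth."""
    if not isinstance(max_depth, int) or not 1 <= max_depth <= 100:
        raise ValueError("maxDepth must be an integer between 1 and 100")
    limits = {"maxDepth": max_depth}
    if max_results is not None:
        limits["maxResults"] = max_results
    return limits


limits = build_limits(10, max_results=5000)
```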

C#

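Before trying this sample, follow the C# setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage C# API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.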

using Google.Api.Gax.Grpc;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.DataCatalog.Lineage.V1;
using System.Threading.Tasks;

public sealed partial class GeneratedLineageClientSnippets
{
    /// <summary>Snippet for SearchLineageStreaming</summary>
    /// <remarks>
    /// This snippet has been automatically generated and should be regarded as a code template only.
    /// It will require modifications to work:
    /// - It may require correct/in-range values for request initialization.
    /// - It may require specifying regional endpoints when creating the service client as shown in
    ///   https://cloud.google.com/dotnet/docs/reference/help/client-configuration#endpoint.
    /// </remarks>
    public async Task SearchLineageStreamingRequestObject()
    {
        // Create client
        LineageClient lineageClient = LineageClient.Create();
        // Initialize request argument(s)
        SearchLineageStreamingRequest request = new SearchLineageStreamingRequest
        {
            ParentAsLocationName = LocationName.FromProjectLocation("[PROJECT]", "[LOCATION]"),
            Locations = { "", },
            RootCriteria = new SearchLineageStreamingRequest.Types.RootCriteria(),
            Direction = SearchLineageStreamingRequest.Types.SearchDirection.Unspecified,
            Filters = new SearchLineageStreamingRequest.Types.SearchFilters(),
            Limits = new SearchLineageStreamingRequest.Types.SearchLimits(),
        };
        // Make the request, returning a streaming response
        using LineageClient.SearchLineageStreamingStream response = lineageClient.SearchLineageStreaming(request);

        // Read streaming responses from server until complete
        // Note that C# 8 code can use await foreach
        AsyncResponseStream<SearchLineageStreamingResponse> responseStream = response.GetResponseStream();
        while (await responseStream.MoveNextAsync())
        {
            SearchLineageStreamingResponse responseItem = responseStream.Current;
            // Do something with streamed response
        }
        // The response stream has completed
    }
}

Java

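Before trying this sample, follow the Java setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Java API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.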

import com.google.api.gax.rpc.ServerStream;
import com.google.cloud.datacatalog.lineage.v1.LineageClient;
import com.google.cloud.datacatalog.lineage.v1.LocationName;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingRequest;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingResponse;
import java.util.ArrayList;

public class AsyncSearchLineageStreaming {

  public static void main(String[] args) throws Exception {
    asyncSearchLineageStreaming();
  }

  public static void asyncSearchLineageStreaming() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (LineageClient lineageClient = LineageClient.create()) {
      SearchLineageStreamingRequest request =
          SearchLineageStreamingRequest.newBuilder()
              .setParent(LocationName.of("[PROJECT]", "[LOCATION]").toString())
              .addAllLocations(new ArrayList<String>())
              .setRootCriteria(SearchLineageStreamingRequest.RootCriteria.newBuilder().build())
              .setFilters(SearchLineageStreamingRequest.SearchFilters.newBuilder().build())
              .setLimits(SearchLineageStreamingRequest.SearchLimits.newBuilder().build())
              .build();
      ServerStream<SearchLineageStreamingResponse> stream =
          lineageClient.searchLineageStreamingCallable().call(request);
      for (SearchLineageStreamingResponse response : stream) {
        // Do something when a response is received.
      }
    }
  }
}

Node.js

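Before trying this sample, follow the Node.js setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Node.js API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.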

/**
 * This snippet has been automatically generated and should be regarded as a code template only.
 * It will require modifications to work.
 * It may require correct/in-range values for request initialization.
 * TODO(developer): Uncomment these variables before running the sample.
 */
/**
 *  Required. The project and location to initiate the search from.
 */
// const parent = 'abc123'
/**
 *  Required. The locations to search in.
 */
// const locations = ['abc','def']
/**
 *  Required. Criteria for the root of the search.
 */
// const rootCriteria = {}
/**
 *  Required. Direction of the search.
 */
// const direction = {}
/**
 *  Optional. Filters for the search.
 */
// const filters = {}
/**
 *  Optional. Limits for the search.
 */
// const limits = {}

// Imports the Lineage library
const {LineageClient} = require('@google-cloud/lineage').v1;

// Instantiates a client
const lineageClient = new LineageClient();

async function callSearchLineageStreaming() {
  // Construct request
  const request = {
    parent,
    locations,
    rootCriteria,
    direction,
  };

  // Run request
  const stream = await lineageClient.searchLineageStreaming(request);
  stream.on('data', (response) => { console.log(response) });
  stream.on('error', (err) => { throw(err) });
  stream.on('end', () => { /* API call completed */ });
}

callSearchLineageStreaming();

Python

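Before trying this sample, follow the Python setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Python API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.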

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import datacatalog_lineage_v1


def sample_search_lineage_streaming():
    # Create a client
    client = datacatalog_lineage_v1.LineageClient()

    # Initialize request argument(s)
    request = datacatalog_lineage_v1.SearchLineageStreamingRequest(
        parent="parent_value",
        locations=["locations_value1", "locations_value2"],
        direction="UPSTREAM",
    )

    # Make the request
    stream = client.search_lineage_streaming(request=request)

    # Handle the response
    for response in stream:
        print(response)

Ruby

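Before trying this sample, follow the Ruby setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Ruby API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.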

require "google/cloud/data_catalog/lineage/v1"

##
# Snippet for the search_lineage_streaming call in the Lineage service
#
# This snippet has been automatically generated and should be regarded as a code
# template only. It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
# client as shown in https://cloud.google.com/ruby/docs/reference.
#
# This is an auto-generated example demonstrating basic usage of
# Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client#search_lineage_streaming.
#
def search_lineage_streaming
  # Create a client object. The client can be reused for multiple calls.
  client = Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client.new

  # Create a request. To set request fields, pass in keyword arguments.
  request = Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingRequest.new

  # Call the search_lineage_streaming method to start streaming.
  output = client.search_lineage_streaming request

  # The returned object is a streamed enumerable yielding elements of type
  # ::Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingResponse
  output.each do |current_response|
    p current_response
  end
end

REST

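To search data lineage, use the searchLineageStreaming method.

Before using any of the request data, make the following replacements:

PROJECT_ID: the ID of the Google Cloud project used for billing and quota evaluation.
LOCATION_ID: the Google Cloud location, for example us-central1.
SOURCE_PROJECT_ID: the ID of the Google Cloud project that contains the source table.
DATASET_ID: the BigQuery dataset ID.
TABLE_ID: the BigQuery table ID.

HTTP method and URL: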

POST https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming

Request JSON body:

{
  "parent": "projects/PROJECT_ID/locations/LOCATION_ID",
  "locations": [
    "LOCATION_ID",
    "us-east1",
    "us-central1"
  ],
  "rootCriteria": {
    "entities": {
      "entities": [
        {
          "fullyQualifiedName": "bigquery:SOURCE_PROJECT_ID.DATASET_ID.TABLE_ID"
        }
      ]
    }
  },
  "direction": "DOWNSTREAM",
  "limits": {
    "maxDepth": 10,
    "maxResults": 5000
  }
}

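To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: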

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming"

PowerShell (Windows)

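Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: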

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "links": [
    {
      "source": {
        "fullyQualifiedName": "bigquery:project-prod.dataset.source_table"
      },
      "target": {
        "fullyQualifiedName": "bigquery:project-prod.dataset.target_table"
      },
      "depth": 1,
      "location": "us"
    }
  ]
}
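
Once a chunk like the one above has been received, a client typically wants the links organized by traversal depth. The following is a minimal sketch under the assumption that each chunk has the shape shown in the sample response; the helper links_by_depth is hypothetical and uses only the standard library.

```python
import json
from collections import defaultdict


def links_by_depth(response_text):
    """Group (source, target, location) triples by their depth value."""
    response = json.loads(response_text)
    grouped = defaultdict(list)
    for link in response.get("links", []):
        edge = (
            link["source"]["fullyQualifiedName"],
            link["target"]["fullyQualifiedName"],
            link.get("location"),
        )
        grouped[link["depth"]].append(edge)
    return dict(grouped)


# The sample response from this section, used as test input.
sample = """
{
  "links": [
    {
      "source": {"fullyQualifiedName": "bigquery:project-prod.dataset.source_table"},
      "target": {"fullyQualifiedName": "bigquery:project-prod.dataset.target_table"},
      "depth": 1,
      "location": "us"
    }
  ]
}
"""
grouped = links_by_depth(sample)
```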

Search multiple locations

You can narrow or widen the lineage graph scan by changing the geographic regions passed in the repeated locations field.

C#

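Before trying this sample, follow the C# setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage C# API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.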

using Google.Api.Gax.Grpc;
using Google.Api.Gax.ResourceNames;
using Google.Cloud.DataCatalog.Lineage.V1;
using System.Threading.Tasks;

public sealed partial class GeneratedLineageClientSnippets
{
    /// <summary>Snippet for SearchLineageStreaming</summary>
    /// <remarks>
    /// This snippet has been automatically generated and should be regarded as a code template only.
    /// It will require modifications to work:
    /// - It may require correct/in-range values for request initialization.
    /// - It may require specifying regional endpoints when creating the service client as shown in
    ///   https://cloud.google.com/dotnet/docs/reference/help/client-configuration#endpoint.
    /// </remarks>
    public async Task SearchLineageStreamingRequestObject()
    {
        // Create client
        LineageClient lineageClient = LineageClient.Create();
        // Initialize request argument(s)
        SearchLineageStreamingRequest request = new SearchLineageStreamingRequest
        {
            ParentAsLocationName = LocationName.FromProjectLocation("[PROJECT]", "[LOCATION]"),
            Locations = { "", },
            RootCriteria = new SearchLineageStreamingRequest.Types.RootCriteria(),
            Direction = SearchLineageStreamingRequest.Types.SearchDirection.Unspecified,
            Filters = new SearchLineageStreamingRequest.Types.SearchFilters(),
            Limits = new SearchLineageStreamingRequest.Types.SearchLimits(),
        };
        // Make the request, returning a streaming response
        using LineageClient.SearchLineageStreamingStream response = lineageClient.SearchLineageStreaming(request);

        // Read streaming responses from server until complete
        // Note that C# 8 code can use await foreach
        AsyncResponseStream<SearchLineageStreamingResponse> responseStream = response.GetResponseStream();
        while (await responseStream.MoveNextAsync())
        {
            SearchLineageStreamingResponse responseItem = responseStream.Current;
            // Do something with streamed response
        }
        // The response stream has completed
    }
}

Java

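Before trying this sample, follow the Java setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Java API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.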

import com.google.api.gax.rpc.ServerStream;
import com.google.cloud.datacatalog.lineage.v1.LineageClient;
import com.google.cloud.datacatalog.lineage.v1.LocationName;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingRequest;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingResponse;
import java.util.ArrayList;

public class AsyncSearchLineageStreaming {

  public static void main(String[] args) throws Exception {
    asyncSearchLineageStreaming();
  }

  public static void asyncSearchLineageStreaming() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (LineageClient lineageClient = LineageClient.create()) {
      SearchLineageStreamingRequest request =
          SearchLineageStreamingRequest.newBuilder()
              .setParent(LocationName.of("[PROJECT]", "[LOCATION]").toString())
              .addAllLocations(new ArrayList<String>())
              .setRootCriteria(SearchLineageStreamingRequest.RootCriteria.newBuilder().build())
              .setFilters(SearchLineageStreamingRequest.SearchFilters.newBuilder().build())
              .setLimits(SearchLineageStreamingRequest.SearchLimits.newBuilder().build())
              .build();
      ServerStream<SearchLineageStreamingResponse> stream =
          lineageClient.searchLineageStreamingCallable().call(request);
      for (SearchLineageStreamingResponse response : stream) {
        // Do something when a response is received.
      }
    }
  }
}

Node.js

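Before trying this sample, follow the Node.js setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Node.js API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.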

/**
 * This snippet has been automatically generated and should be regarded as a code template only.
 * It will require modifications to work.
 * It may require correct/in-range values for request initialization.
 * TODO(developer): Uncomment these variables before running the sample.
 */
/**
 *  Required. The project and location to initiate the search from.
 */
// const parent = 'abc123'
/**
 *  Required. The locations to search in.
 */
// const locations = ['abc','def']
/**
 *  Required. Criteria for the root of the search.
 */
// const rootCriteria = {}
/**
 *  Required. Direction of the search.
 */
// const direction = {}
/**
 *  Optional. Filters for the search.
 */
// const filters = {}
/**
 *  Optional. Limits for the search.
 */
// const limits = {}

// Imports the Lineage library
const {LineageClient} = require('@google-cloud/lineage').v1;

// Instantiates a client
const lineageClient = new LineageClient();

async function callSearchLineageStreaming() {
  // Construct request
  const request = {
    parent,
    locations,
    rootCriteria,
    direction,
  };

  // Run request
  const stream = await lineageClient.searchLineageStreaming(request);
  stream.on('data', (response) => { console.log(response) });
  stream.on('error', (err) => { throw(err) });
  stream.on('end', () => { /* API call completed */ });
}

callSearchLineageStreaming();

Python

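Before trying this sample, follow the Python setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Python API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.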

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import datacatalog_lineage_v1


def sample_search_lineage_streaming():
    # Create a client
    client = datacatalog_lineage_v1.LineageClient()

    # Initialize request argument(s)
    request = datacatalog_lineage_v1.SearchLineageStreamingRequest(
        parent="parent_value",
        locations=["locations_value1", "locations_value2"],
        direction="UPSTREAM",
    )

    # Make the request
    stream = client.search_lineage_streaming(request=request)

    # Handle the response
    for response in stream:
        print(response)

Ruby

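Before trying this sample, follow the Ruby setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Ruby API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.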

require "google/cloud/data_catalog/lineage/v1"

##
# Snippet for the search_lineage_streaming call in the Lineage service
#
# This snippet has been automatically generated and should be regarded as a code
# template only. It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
# client as shown in https://cloud.google.com/ruby/docs/reference.
#
# This is an auto-generated example demonstrating basic usage of
# Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client#search_lineage_streaming.
#
def search_lineage_streaming
  # Create a client object. The client can be reused for multiple calls.
  client = Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client.new

  # Create a request. To set request fields, pass in keyword arguments.
  request = Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingRequest.new

  # Call the search_lineage_streaming method to start streaming.
  output = client.search_lineage_streaming request

  # The returned object is a streamed enumerable yielding elements of type
  # ::Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingResponse
  output.each do |current_response|
    p current_response
  end
end

REST

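Before using any of the request data, make the following replacements:

PROJECT_ID: the ID of the Google Cloud project used for billing and quota evaluation.
LOCATION_ID: the Google Cloud location, for example us-central1.
SOURCE_PROJECT_ID: the ID of the Google Cloud project that contains the source table.
DATASET_ID: the BigQuery dataset ID.
TABLE_ID: the BigQuery table ID.

HTTP method and URL: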

POST https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming

Request JSON body:

{
  "parent": "projects/PROJECT_ID/locations/LOCATION_ID",
  "locations": [
    "LOCATION_ID",
    "us-east1",
    "us-central1"
  ],
  "rootCriteria": {
    "entities": {
      "entities": [
        {
          "fullyQualifiedName": "bigquery:SOURCE_PROJECT_ID.DATASET_ID.TABLE_ID"
        }
      ]
    }
  },
  "direction": "DOWNSTREAM",
  "limits": {
    "maxDepth": 10,
    "maxResults": 5000
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming"

PowerShell (Windows)

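Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: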

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://datalineage.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "links": [
    {
      "source": {
        "fullyQualifiedName": "bigquery:project-prod.dataset.source_table"
      },
      "target": {
        "fullyQualifiedName": "bigquery:project-prod.dataset.target_table"
      },
      "depth": 1,
      "location": "us"
    }
  ]
}

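Retrieve process names for lineage links

By default, the API omits process information (maxProcessPerLink defaults to 0). To retrieve the resource names of the pipelines that created a lineage link, set limits.maxProcessPerLink to a positive integer.

For example: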

Java

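Before trying this sample, follow the Java setup instructions in the Data Lineage quickstart using client libraries. For more information, see the Data Lineage Java API reference documentation.

To authenticate to Data Lineage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.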

import com.google.cloud.datacatalog.lineage.v1.LineageClient;
import com.google.cloud.datacatalog.lineage.v1.Process;
import com.google.cloud.datacatalog.lineage.v1.ProcessName;

public class SyncGetProcessProcessname {

  public static void main(String[] args) throws Exception {
    syncGetProcessProcessname();
  }

  public static void syncGetProcessProcessname() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (LineageClient lineageClient = LineageClient.create()) {
      ProcessName name = ProcessName.of("[PROJECT]", "[LOCATION]", "[PROCESS]");
      Process response = lineageClient.getProcess(name);
    }
  }
}

REST

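Before using any of the request data, make the following replacements:

BILLING_PROJECT_ID: the Google Cloud project used for billing and quota evaluation. Provide a project number or project ID.
LOCATION_ID: the Google Cloud location where the lineage search runs.
FULLY_QUALIFIED_NAME: the fully qualified name (FQN) of the target entity, in the format bigquery:PROJECT_ID.DATASET_ID.TABLE_ID.

HTTP method and URL: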

POST https://datalineage.googleapis.com/v1/projects/BILLING_PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming

Request JSON body:

{
  "parent": "projects/BILLING_PROJECT_ID/locations/LOCATION_ID",
  "locations": [
    "LOCATION_ID"
  ],
  "rootCriteria": {
    "entities": {
      "entities": [
        {
          "fullyQualifiedName": "FULLY_QUALIFIED_NAME"
        }
      ]
    }
  },
  "direction": "UPSTREAM",
  "limits": {
    "maxProcessPerLink": 5
  }
}

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://datalineage.googleapis.com/v1/projects/BILLING_PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming"

PowerShell (Windows)

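Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: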

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://datalineage.googleapis.com/v1/projects/BILLING_PROJECT_ID/locations/LOCATION_ID:searchLineageStreaming" | Select-Object -Expand Content

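Response behavior: the resulting stream populates the links[].processes field with process messages that contain only the full resource name (for example, projects/my-project/locations/us/processes/my-process).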
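
A client that receives these resource names often needs the individual components, for example to look up the process in the right project and location. The following sketch splits a name of the form shown above; the helper parse_process_name is hypothetical, and it assumes that every name follows that exact four-segment pattern.

```python
import re

# Matches projects/<project>/locations/<location>/processes/<process>.
_PROCESS_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processes/(?P<process>[^/]+)$"
)


def parse_process_name(name):
    """Split a process resource name into project, location, and process ID."""
    match = _PROCESS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a process resource name: {name!r}")
    return match.groupdict()


parts = parse_process_name("projects/my-project/locations/us/processes/my-process")
```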

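Retrieve full process details with a FieldMask

If you need a pipeline's full structured metadata (such as its displayName, system attributes, or execution origin) rather than only its resource name, you must use an API FieldMask:

Provide a nonzero value for limits.maxProcessPerLink.
Append the fields query parameter to the URL, and specify links.processes.process along with any other fields that you need.

For example: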

curl -H "Authorization: Bearer ${ACCESS_TOKEN}" \
-H "Content-Type: application/json" \
-X POST "https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming?fields=links.processes.process,links.source,links.target,links.depth" \
--data '{
  "parent": "projects/my-billing-project/locations/us",
  "locations": ["us"],
  "rootCriteria": {
    "entities": {
      "entities": [{
        "fullyQualifiedName": "bigquery:my-project.dataset.target_table"
      }]
    }
  },
  "direction": "UPSTREAM",
  "limits": {
    "maxProcessPerLink": 5
  }
}'
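
When the request URL is assembled in code rather than typed by hand, the fields parameter can be built from a list of field paths. The following is a sketch using the standard library; the helper build_url_with_field_mask is hypothetical, and the field paths are the ones used in the curl command above.

```python
from urllib.parse import urlencode


def build_url_with_field_mask(project, location, field_paths):
    """Return the searchLineageStreaming URL with a fields query parameter."""
    base = (
        "https://datalineage.googleapis.com/v1/"
        f"projects/{project}/locations/{location}:searchLineageStreaming"
    )
    # safe="," keeps the commas between field paths unescaped, which
    # matches the form used in the curl command.
    query = urlencode({"fields": ",".join(field_paths)}, safe=",")
    return f"{base}?{query}"


url = build_url_with_field_mask(
    "my-billing-project",
    "us",
    ["links.processes.process", "links.source", "links.target", "links.depth"],
)
```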

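Search table-level and column-level lineage together

You can search table-level (asset-level) and column-level (field-level) lineage in a single request by providing multiple entities in the rootCriteria.entities.entities list:

For table-level lineage, omit the field array.
For column-level lineage, specify the individual columns in the field array.

For example: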

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -X POST https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming \
     --data '{
       "parent": "projects/my-billing-project/locations/us",
       "locations": ["us"],
       "rootCriteria": {
         "entities": {
           "entities": [
             {
               "fullyQualifiedName": "bigquery:my-project.dataset.table_a"
             },
             {
               "fullyQualifiedName": "bigquery:my-project.dataset.table_b",
               "field": ["email"]
             }
           ]
         }
       },
       "direction": "DOWNSTREAM"
     }'

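Use a wildcard for column-level lineage

To search all available column-level lineage for a specific table without listing each column individually, use the wildcard * as the single value in the field array.

For example: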

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -X POST https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming \
     --data '{
       "parent": "projects/my-billing-project/locations/us",
       "locations": ["us"],
       "rootCriteria": {
         "entities": {
           "entities": [{
             "fullyQualifiedName": "bigquery:my-project.dataset.my_table",
             "field": ["*"]
           }]
         }
       },
       "direction": "DOWNSTREAM"
     }'
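
The three root forms used so far (whole table, named columns, and the * wildcard) differ only in the field array, so a small builder can cover all of them. The helpers entity_reference and root_criteria below are hypothetical names; the JSON field names follow the curl examples in this document.

```python
def entity_reference(fully_qualified_name, fields=None):
    """Return one root entity; omit fields for table-level lineage."""
    entity = {"fullyQualifiedName": fully_qualified_name}
    if fields:
        # A list of column names, or ["*"] for every column.
        entity["field"] = list(fields)
    return entity


def root_criteria(*entities):
    """Wrap root entities in the nested rootCriteria structure."""
    return {"entities": {"entities": list(entities)}}


criteria = root_criteria(
    entity_reference("bigquery:my-project.dataset.table_a"),
    entity_reference("bigquery:my-project.dataset.table_b", ["email"]),
    entity_reference("bigquery:my-project.dataset.my_table", ["*"]),
)
```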

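Filter lineage results

You can use the filters block in the request body to refine lineage search results.

Filter by dependency type

To limit results to specific dependency types, such as exact copies (EXACT_COPY) or transformations such as filtering and grouping (OTHER), use the dependencyTypes filter.

For example: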

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -X POST https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming \
     --data '{
       "parent": "projects/my-billing-project/locations/us",
       "locations": ["us"],
       "rootCriteria": {
         "entities": {
           "entities": [{
             "fullyQualifiedName": "bigquery:my-project.dataset.my_table"
           }]
         }
       },
       "direction": "DOWNSTREAM",
       "filters": {
         "dependencyTypes": ["EXACT_COPY"]
       }
     }'

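Exclude column-level lineage (table-only search)

To make sure that a search returns only table-level lineage and excludes column-level lineage entirely, set the entitySet filter to ENTITIES.

For example: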

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -X POST https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming \
     --data '{
       "parent": "projects/my-billing-project/locations/us",
       "locations": ["us"],
       "rootCriteria": {
         "entities": {
           "entities": [{
             "fullyQualifiedName": "bigquery:my-project.dataset.my_table"
           }]
         }
       },
       "direction": "DOWNSTREAM",
       "filters": {
         "entitySet": "ENTITIES"
       }
     }'

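Filter by time range

You can limit lineage search results to a specific time period.

For example, to search for lineage created after a specific timestamp, use the following request: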

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -X POST https://datalineage.googleapis.com/v1/projects/my-billing-project/locations/us:searchLineageStreaming \
     --data '{
       "parent": "projects/my-billing-project/locations/us",
       "locations": ["us"],
       "rootCriteria": {
         "entities": {
           "entities": [{
             "fullyQualifiedName": "bigquery:my-project.dataset.my_table"
           }]
         }
       },
       "direction": "DOWNSTREAM",
       "filters": {
         "timeRange": {
           "startTime": "2026-01-01T00:00:00Z"
         }
       }
     }'
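
The filters shown in this section can also be assembled programmatically. The following sketch builds the filters block from optional arguments and formats the start time as a UTC timestamp in the same form as the example above. The helper build_filters is hypothetical, and this document does not state whether the service accepts all three filters in one request, so treat the combination as an assumption.

```python
from datetime import datetime, timezone


def build_filters(dependency_types=None, entity_set=None, start_time=None):
    """Return a filters block containing only the filters that were given."""
    filters = {}
    if dependency_types:
        filters["dependencyTypes"] = list(dependency_types)
    if entity_set:
        filters["entitySet"] = entity_set
    if start_time is not None:
        # start_time must be timezone-aware; it is normalized to UTC.
        utc = start_time.astimezone(timezone.utc)
        filters["timeRange"] = {"startTime": utc.strftime("%Y-%m-%dT%H:%M:%SZ")}
    return filters


filters = build_filters(
    dependency_types=["EXACT_COPY"],
    start_time=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
```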

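Troubleshooting: handle unreachable locations and partial graphs

Because the streaming API scans a distributed set of projects and locations at the same time, some remote regions might be temporarily down, unreachable, or misconfigured during execution.

Symptom: the returned lineage graph looks incomplete, or expected cross-region hops are missing.
Diagnosis: to protect data integrity, the SearchLineageStreamingResponse stream lists the problematic locations in a dedicated unreachable field (a repeated string), using the format projects/PROJECT_NUMBER/locations/LOCATION (for example, projects/123456789/locations/us-east1).
Best practice: build your client application to check the unreachable field before it processes the graph, so that you can verify that the data is complete.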
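
One way to apply this best practice is to gather the unreachable entries across all streamed chunks before trusting the graph. The following sketch assumes that each chunk has already been decoded into a dict whose keys match the JSON field names (links and unreachable); the helpers collect_unreachable and location_of and the sample chunks are hypothetical.

```python
def collect_unreachable(chunks):
    """Return the sorted, de-duplicated unreachable entries across chunks."""
    seen = set()
    for chunk in chunks:
        seen.update(chunk.get("unreachable", []))
    return sorted(seen)


def location_of(resource_name):
    """Extract LOCATION from projects/PROJECT_NUMBER/locations/LOCATION."""
    parts = resource_name.split("/")
    if len(parts) != 4 or parts[0] != "projects" or parts[2] != "locations":
        raise ValueError(f"unexpected format: {resource_name!r}")
    return parts[3]


# Hypothetical decoded chunks from one streaming call.
chunks = [
    {"links": [{"depth": 1}]},
    {"links": [], "unreachable": ["projects/123456789/locations/us-east1"]},
    {"unreachable": ["projects/123456789/locations/us-east1"]},
]
missing = collect_unreachable(chunks)
graph_is_complete = not missing
```

If graph_is_complete is False, the client can report the affected regions (here, us-east1) or retry the search later instead of treating the partial graph as final.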

What's next

Learn more about multi-region lineage search.
Learn more about data lineage.
Learn more about lineage visualization graphs.
