Document AI Warehouse 客户端库

本页面介绍了如何开始使用 Document AI Warehouse API 的 Cloud 客户端库。通过客户端库,您可以更轻松地使用支持的语言访问 Google Cloud API。虽然您可以通过向服务器发出原始请求来直接使用Google Cloud API,但客户端库可实现简化,从而显著减少您需要编写的代码量。

请参阅客户端库说明,详细了解 Cloud 客户端库 和旧版 Google API 客户端库。

安装客户端库

C++

如需详细了解此客户端库的要求和安装依赖项,请参阅设置 C++ 开发环境

Java

如果您使用的是 Maven,请将以下代码添加到您的 pom.xml 文件中。如需详细了解 BOM,请参阅 Google Cloud Platform 库 BOM

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.79.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-contentwarehouse</artifactId>
  </dependency>
</dependencies>

如果您使用的是 Gradle, 请将以下代码添加到您的依赖项中:

implementation 'com.google.cloud:google-cloud-contentwarehouse:0.87.0'

如果您使用的是 sbt,请将 以下代码添加到您的依赖项中:

libraryDependencies += "com.google.cloud" % "google-cloud-contentwarehouse" % "0.87.0"

如果您使用的是 Visual Studio Code 或 IntelliJ,可以通过以下 IDE 插件将客户端库添加到您的 项目中:

上述插件还提供其他功能,例如服务账号密钥管理。如需了解详情,请参阅各个插件相应的文档。

如需了解详情,请参阅设置 Java 开发环境

Node.js

npm install @google-cloud/contentwarehouse

如需了解详情,请参阅设置 Node.js 开发环境

Python

pip install --upgrade google-cloud-contentwarehouse

如需了解详情,请参阅设置 Python 开发环境

设置身份验证

为了对 Google Cloud API 的调用进行身份验证,客户端库支持应用默认凭证 (ADC);这些库会在一组指定的位置查找凭证,并使用这些凭证对发送到 API 的请求进行身份验证。借助 ADC,您可以在各种环境(例如本地开发或生产环境)中为您的应用提供凭据,而无需修改应用代码。

对于生产环境,设置 ADC 的方式取决于服务和上下文。如需了解详情,请参阅设置应用默认凭证

对于本地开发环境,您可以使用与您的 Google 账号关联的凭据设置 ADC:

  1. 安装 Google Cloud CLI。 安装完成后, 初始化 Google Cloud CLI,方法是运行以下命令:

    gcloud init

    如果您使用的是外部身份提供方 (IdP),则必须先 使用联合身份登录 gcloud CLI

  2. 如果您使用的是本地 shell,请为您的用户 账号创建本地身份验证凭证:

    gcloud auth application-default login

    如果您使用的是 Cloud Shell,则无需执行此操作。

    如果返回了身份验证错误,并且您使用的是外部身份提供方 (IdP),请确认您已 使用联合身份登录 gcloud CLI

    登录屏幕随即出现。在您登录后,您的凭据会存储在 ADC 使用的本地凭据文件中。

使用客户端库

以下示例展示了如何使用客户端库。

C++


#include "google/cloud/contentwarehouse/v1/document_client.h"
#include "google/cloud/location.h"
#include <iostream>

int main(int argc, char* argv[]) try {
  if (argc != 3) {
    std::cerr << "Usage: " << argv[0] << " project-number location-id\n";
    return 1;
  }

  auto const location = google::cloud::Location(argv[1], argv[2]);

  namespace contentwarehouse = ::google::cloud::contentwarehouse_v1;
  auto client = contentwarehouse::DocumentServiceClient(
      contentwarehouse::MakeDocumentServiceConnection());

  for (auto md : client.SearchDocuments(location.FullName())) {
    if (!md) throw std::move(md).status();
    std::cout << md->DebugString() << "\n";
  }

  return 0;
} catch (google::cloud::Status const& status) {
  std::cerr << "google::cloud::Status thrown: " << status << "\n";
  return 1;
}

Java

public class QuickStart {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-region"; // Format is "us" or "eu".
    String userId = "your-user-id"; // Format is user:<user-id>
    quickStart(projectId, location, userId);
  }

  public static void quickStart(String projectId, String location, String userId)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    String projectNumber = getProjectNumber(projectId);

    String endpoint = "contentwarehouse.googleapis.com:443";
    if (!"us".equals(location)) {
      endpoint = String.format("%s-%s", location, endpoint);
    }
    DocumentSchemaServiceSettings documentSchemaServiceSettings = 
         DocumentSchemaServiceSettings.newBuilder().setEndpoint(endpoint).build(); 

    // Create a Schema Service client
    try (DocumentSchemaServiceClient documentSchemaServiceClient =
        DocumentSchemaServiceClient.create(documentSchemaServiceSettings)) {
      /*  The full resource name of the location, e.g.:
      projects/{project_number}/locations/{location} */
      String parent = LocationName.format(projectNumber, location);

      /* Create Document Schema with Text Type Property Definition
       * More detail on managing Document Schemas: 
       * https://cloud.google.com/document-warehouse/docs/manage-document-schemas */
      DocumentSchema documentSchema = DocumentSchema.newBuilder()
          .setDisplayName("My Test Schema")
          .setDescription("My Test Schema's Description")
          .addPropertyDefinitions(
            PropertyDefinition.newBuilder()
              .setName("test_symbol")
              .setDisplayName("Searchable text")
              .setIsSearchable(true)
              .setTextTypeOptions(TextTypeOptions.newBuilder().build())
              .build()).build();

      // Define Document Schema request
      CreateDocumentSchemaRequest createDocumentSchemaRequest =
          CreateDocumentSchemaRequest.newBuilder()
            .setParent(parent)
            .setDocumentSchema(documentSchema).build();

      // Create Document Schema
      DocumentSchema documentSchemaResponse =
          documentSchemaServiceClient.createDocumentSchema(createDocumentSchemaRequest); 


      // Create Document Service Client Settings
      DocumentServiceSettings documentServiceSettings = 
          DocumentServiceSettings.newBuilder().setEndpoint(endpoint).build();

      // Create Document Service Client and Document with relevant properties 
      try (DocumentServiceClient documentServiceClient =
          DocumentServiceClient.create(documentServiceSettings)) {
        TextArray textArray = TextArray.newBuilder().addValues("Test").build();
        Document document = Document.newBuilder()
              .setDisplayName("My Test Document")
              .setDocumentSchemaName(documentSchemaResponse.getName())
              .setPlainText("This is a sample of a document's text.")
              .addProperties(
                Property.newBuilder()
                  .setName(documentSchema.getPropertyDefinitions(0).getName())
                  .setTextValues(textArray)).build();

        // Define Request Metadata for enforcing access control
        RequestMetadata requestMetadata = RequestMetadata.newBuilder()
            .setUserInfo(
            UserInfo.newBuilder()
              .setId(userId).build()).build();

        // Define Create Document Request 
        CreateDocumentRequest createDocumentRequest = CreateDocumentRequest.newBuilder()
            .setParent(parent)
            .setDocument(document)
            .setRequestMetadata(requestMetadata)
            .build();

        // Create Document
        CreateDocumentResponse createDocumentResponse =
            documentServiceClient.createDocument(createDocumentRequest);

        System.out.println(createDocumentResponse.getDocument().getName());
        System.out.println(documentSchemaResponse.getName());
      }
    }
  }

  private static String getProjectNumber(String projectId) throws IOException { 
    try (ProjectsClient projectsClient = ProjectsClient.create()) { 
      ProjectName projectName = ProjectName.of(projectId); 
      Project project = projectsClient.getProject(projectName);
      String projectNumber = project.getName(); // Format returned is projects/xxxxxx
      return projectNumber.substring(projectNumber.lastIndexOf("/") + 1);
    } 
  }
}

Node.js

/**
 * This snippet has been automatically generated and should be regarded as a code template only.
 * It will require modifications to work.
 * It may require correct/in-range values for request initialization.
 * TODO(developer): Uncomment these variables before running the sample.
 */
/**
 *  Required. The parent, which owns this collection of document.
 *  Format: projects/{project_number}/locations/{location}.
 */
// const parent = 'abc123'
/**
 *  The maximum number of rule sets to return. The service may return
 *  fewer than this value.
 *  If unspecified, at most 50 rule sets will be returned.
 *  The maximum value is 1000; values above 1000 will be coerced to 1000.
 */
// const pageSize = 1234
/**
 *  A page token, received from a previous `ListRuleSets` call.
 *  Provide this to retrieve the subsequent page.
 *  When paginating, all other parameters provided to `ListRuleSets`
 *  must match the call that provided the page token.
 */
// const pageToken = 'abc123'

// Imports the Contentwarehouse library
const {RuleSetServiceClient} = require('@google-cloud/contentwarehouse').v1;

// Instantiates a client
const contentwarehouseClient = new RuleSetServiceClient();

async function callListRuleSets() {
  // Construct request
  const request = {
    parent,
  };

  // Run request
  const iterable = await contentwarehouseClient.listRuleSetsAsync(request);
  for await (const response of iterable) {
    console.log(response);
  }
}

callListRuleSets();

Python


from google.cloud import contentwarehouse

# TODO(developer): Uncomment these variables before running the sample.
# project_number = 'YOUR_PROJECT_NUMBER'
# location = 'YOUR_PROJECT_LOCATION' # Format is 'us' or 'eu'
# user_id = "user:xxxx@example.com" # Format is "user:xxxx@example.com"


def quickstart(project_number: str, location: str, user_id: str) -> None:
    # Create a Schema Service client
    document_schema_client = contentwarehouse.DocumentSchemaServiceClient()

    # The full resource name of the location, e.g.:
    # projects/{project_number}/locations/{location}
    parent = document_schema_client.common_location_path(
        project=project_number, location=location
    )

    # Define Schema Property of Text Type
    property_definition = contentwarehouse.PropertyDefinition(
        name="stock_symbol",  # Must be unique within a document schema (case insensitive)
        display_name="Searchable text",
        is_searchable=True,
        text_type_options=contentwarehouse.TextTypeOptions(),
    )

    # Define Document Schema Request
    create_document_schema_request = contentwarehouse.CreateDocumentSchemaRequest(
        parent=parent,
        document_schema=contentwarehouse.DocumentSchema(
            display_name="My Test Schema",
            property_definitions=[property_definition],
        ),
    )

    # Create a Document schema
    document_schema = document_schema_client.create_document_schema(
        request=create_document_schema_request
    )

    # Create a Document Service client
    document_client = contentwarehouse.DocumentServiceClient()

    # The full resource name of the location, e.g.:
    # projects/{project_number}/locations/{location}
    parent = document_client.common_location_path(
        project=project_number, location=location
    )

    # Define Document Property Value
    document_property = contentwarehouse.Property(
        name=document_schema.property_definitions[0].name,
        text_values=contentwarehouse.TextArray(values=["GOOG"]),
    )

    # Define Document
    document = contentwarehouse.Document(
        display_name="My Test Document",
        document_schema_name=document_schema.name,
        plain_text="This is a sample of a document's text.",
        properties=[document_property],
    )

    # Define Request
    create_document_request = contentwarehouse.CreateDocumentRequest(
        parent=parent,
        document=document,
        request_metadata=contentwarehouse.RequestMetadata(
            user_info=contentwarehouse.UserInfo(id=user_id)
        ),
    )

    # Create a Document for the given schema
    response = document_client.create_document(request=create_document_request)

    # Read the output
    print(f"Rule Engine Output: {response.rule_engine_output}")
    print(f"Document Created: {response.document}")

其他资源

C++

以下列表包含与 C++ 版客户端库相关的更多资源的链接:

Java

以下列表包含与 Java 版客户端库相关的更多资源的链接:

Node.js

以下列表包含与 Node.js 版客户端库相关的更多资源的链接:

Python

以下列表包含与 Python 版客户端库相关的更多资源的链接: