Set policies

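This page gives a brief introduction to the Policy Engine and describes how to use rules and rule sets to act on documents as they are created or updated.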

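Policy Engine and rules

In Document Warehouse, the Policy Engine lets users define and run common operations on documents (such as validation or updates) when documents are created or updated.

Rules and rule sets

Broadly, a rule is a user-defined configuration that specifies the following:

  • what triggers the rule check,
  • which condition to evaluate, and
  • which actions to run when the condition is met.

In addition to these specifications, a rule carries information about its description, source, target, and trigger conditions.

A logical collection of rules is called a RuleSet. For example, rules that operate on the same schema can be grouped into a single RuleSet. Customers can define multiple rule sets.

Rules can be used to automatically trigger predefined actions when documents are created or updated.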

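A rule contains the following three main parts:

  • TriggerType: The event that should start the rule check. The supported trigger types are create and update.
  • Rule condition: The condition that is evaluated after a given trigger type is detected. Conditions can be expressed in the Common Expression Language (CEL). Every condition should evaluate to a boolean.
  • Actions: The set of steps that run when a rule is satisfied. When the rule condition evaluates to true, the actions configured in the rule are executed. The following are high-level details of the specific actions implemented in Document Warehouse:
    • Data validation action: Validates specific fields in a document during document creation or update.
    • Data update action: Updates specific fields in a document during document creation or update. The update runs when the rule condition is met.
    • Delete document action: Deletes a document during a document update when certain fields meet the deletion criteria defined by the rule condition.
    • Folder inclusion action: Automatically adds a new (or updated) document under specific folders. Such folders can be specified directly by name.
    • Remove from folder action: Automatically removes a new document from the specified folder when the rule-level condition is met.
    • Access control action: Updates the access control lists (group and user bindings) during document creation. The update runs when the rule condition is met.
    • Publish action: Publishes a specific message to the user's Pub/Sub channel when the rule-level condition is met.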
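
The three parts of a rule can be sketched as a small Python helper that assembles a JSON request body. This is only an illustration: the helper names make_rule and make_rule_set are hypothetical, and the field names (trigger_type, condition, actions, enabled, description) are assumed to mirror the JSON shape used by the REST requests on this page.

```python
import json


def make_rule(trigger_type, condition, actions):
    """Combine the three parts of a rule into one JSON-ready dict.

    Hypothetical helper; field names follow the REST samples on this page.
    """
    return {
        "trigger_type": trigger_type,  # event that starts the rule check
        "condition": condition,        # CEL expression that yields a boolean
        "actions": actions,            # what runs when the condition is true
        "enabled": True,
    }


def make_rule_set(description, rules):
    """Group rules into a rule-set body (hypothetical helper)."""
    return {"rules": list(rules), "description": description}


rule = make_rule(
    "ON_CREATE",
    "documentType == 'W9' && STATE == 'CA'",
    {"data_validation": {"conditions": {"NAME": "NAME != ''"}}},
)
body = json.dumps(make_rule_set("W9: Basic validation check rules.", [rule]), indent=2)
print(body)
```

Building the body in code rather than in a shell string means the single quotes inside CEL conditions need no escaping.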

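Manage rule sets

Document Warehouse provides APIs for managing RuleSets (create, get, update, delete, list). This section provides examples of how to configure different types of rules.

Create a rule set

To create a rule set, do the following:

REST

Request: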

# Create a RuleSet for data validation.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == 'W9' && STATE =='CA'",
      "actions": {
        "data_validation": {
          "conditions": {
            "NAME": "NAME != ''",
            "FILING_COST": "FILING_COST > 10.0"
          }
        }
      },
      "enabled": true
    }
  ],
  "description": "W9: Basic validation check rules."
}
EOF

Response:

{
  "description": "W9: Basic validation check rules.",
  "name": "RULE_SET_NAME",
  "rules": [
    {
      "actions": [
        {
          "actionId": "de0e6b84-106b-44ba-b1c4-0b3ad6ddc719",
          "dataValidation": {
            "conditions": {
              "FILING_COST": "FILING_COST > 10.0",
              "NAME": "NAME != ''"
            }
          }
        }
      ],
      "condition": "documentType == 'W9' && STATE =='CA'",
      "enabled": true,
      "triggerType": "ON_CREATE"
    }
  ]
}

Python

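For more information, see the Document AI Warehouse Python API reference documentation.

To authenticate to Document AI Warehouse, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.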


from google.cloud import contentwarehouse

# TODO(developer): Uncomment these variables before running the sample.
# project_number = "YOUR_PROJECT_NUMBER"
# location = "us" # Format is 'us' or 'eu'


def create_rule_set(project_number: str, location: str) -> None:
    # Create a client
    client = contentwarehouse.RuleSetServiceClient()

    # The full resource name of the location, e.g.:
    # projects/{project_number}/locations/{location}
    parent = client.common_location_path(project=project_number, location=location)

    actions = contentwarehouse.Action(
        delete_document_action=contentwarehouse.DeleteDocumentAction(
            enable_hard_delete=True
        )
    )

    rules = contentwarehouse.Rule(
        trigger_type="ON_CREATE",
        condition="documentType == 'W9' && STATE =='CA'",
        actions=[actions],
    )

    rule_set = contentwarehouse.RuleSet(
        description="W9: Basic validation check rules.",
        source="My Organization",
        rules=[rules],
    )

    # Initialize request argument(s)
    request = contentwarehouse.CreateRuleSetRequest(parent=parent, rule_set=rule_set)

    # Make the request
    response = client.create_rule_set(request=request)

    # Handle the response
    print(f"Rule Set Created: {response}")

    # Initialize request argument(s)
    request = contentwarehouse.ListRuleSetsRequest(
        parent=parent,
    )

    # Make the request
    page_result = client.list_rule_sets(request=request)

    # Handle the response
    for response in page_result:
        print(f"Rule Sets: {response}")

Java

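For more information, see the Document AI Warehouse Java API reference documentation.

To authenticate to Document AI Warehouse, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.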

import com.google.cloud.contentwarehouse.v1.Action;
import com.google.cloud.contentwarehouse.v1.ActionOrBuilder;
import com.google.cloud.contentwarehouse.v1.CreateRuleSetRequest;
import com.google.cloud.contentwarehouse.v1.CreateRuleSetRequestOrBuilder;
import com.google.cloud.contentwarehouse.v1.DeleteDocumentAction;
import com.google.cloud.contentwarehouse.v1.DeleteDocumentActionOrBuilder;
import com.google.cloud.contentwarehouse.v1.ListRuleSetsRequest;
import com.google.cloud.contentwarehouse.v1.ListRuleSetsRequestOrBuilder;
import com.google.cloud.contentwarehouse.v1.LocationName;
import com.google.cloud.contentwarehouse.v1.Rule;
import com.google.cloud.contentwarehouse.v1.Rule.TriggerType;
import com.google.cloud.contentwarehouse.v1.RuleOrBuilder;
import com.google.cloud.contentwarehouse.v1.RuleSet;
import com.google.cloud.contentwarehouse.v1.RuleSetOrBuilder;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceClient;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceClient.ListRuleSetsPagedResponse;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceSettings;
import com.google.cloud.resourcemanager.v3.Project;
import com.google.cloud.resourcemanager.v3.ProjectName;
import com.google.cloud.resourcemanager.v3.ProjectsClient;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;


public class CreateRuleSet {

  public static void createRuleSet() throws IOException, 
        InterruptedException, ExecutionException, TimeoutException { 
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-region"; // Format is "us" or "eu".
    createRuleSet(projectId, location);
  }

  public static void createRuleSet(String projectId, String location)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    String projectNumber = getProjectNumber(projectId);

    String endpoint = "contentwarehouse.googleapis.com:443";
    if (!"us".equals(location)) {
      endpoint = String.format("%s-%s", location, endpoint);
    }
    RuleSetServiceSettings ruleSetServiceSettings =
        RuleSetServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Create a Rule Set Service Client 
    try (RuleSetServiceClient ruleSetServiceClient = 
        RuleSetServiceClient.create(ruleSetServiceSettings)) {
      /*  The full resource name of the location, e.g.:
      projects/{project_number}/locations/{location} */
      String parent = LocationName.format(projectNumber, location); 

      // Create a Delete Document Action to be added to the Rule Set 
      DeleteDocumentActionOrBuilder deleteDocumentAction = 
          DeleteDocumentAction.newBuilder().setEnableHardDelete(true).build();

      // Add Delete Document Action to Action Object 
      ActionOrBuilder action = Action.newBuilder()
            .setDeleteDocumentAction((DeleteDocumentAction) deleteDocumentAction).build();

      // Create rule to add to rule set 
      RuleOrBuilder rule = Rule.newBuilder()
          .setTriggerType(TriggerType.ON_CREATE)
          .setCondition("documentType == 'W9' && STATE =='CA' ")
          .addActions(0, (Action) action).build();

      // Create rule set and add rule to it
      RuleSetOrBuilder ruleSetOrBuilder = RuleSet.newBuilder()
          .setDescription("W9: Basic validation check rules.")
          .setSource("My Organization")
          .addRules((Rule) rule).build();

      // Create and prepare rule set request to client
      CreateRuleSetRequestOrBuilder createRuleSetRequest = 
          CreateRuleSetRequest.newBuilder()
              .setParent(parent)
              .setRuleSet((RuleSet) ruleSetOrBuilder).build();

      RuleSet response = ruleSetServiceClient.createRuleSet(
          (CreateRuleSetRequest) createRuleSetRequest);

      System.out.println("Rule set created: " + response.toString());

      ListRuleSetsRequestOrBuilder listRuleSetsRequest = 
          ListRuleSetsRequest.newBuilder()
              .setParent(parent).build();

      ListRuleSetsPagedResponse listRuleSetsPagedResponse = 
          ruleSetServiceClient.listRuleSets((ListRuleSetsRequest) listRuleSetsRequest);

      listRuleSetsPagedResponse.iterateAll().forEach(
          (ruleSet -> System.out.print(ruleSet))
      );
    }
  }

  private static String getProjectNumber(String projectId) throws IOException { 
    try (ProjectsClient projectsClient = ProjectsClient.create()) { 
      ProjectName projectName = ProjectName.of(projectId); 
      Project project = projectsClient.getProject(projectName);
      String projectNumber = project.getName(); // Format returned is projects/xxxxxx
      return projectNumber.substring(projectNumber.lastIndexOf("/") + 1);
    } 
  }
}

List rule sets

To list the rule sets under a project, do the following:

REST

Request:

# List all rule-sets for a project.
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets

Response:

{
  "ruleSets": [
    {
      "description": "W9: Basic validation check rules.",
      "rules": [
        {
          "triggerType": "ON_CREATE",
          "condition": "documentType == 'W9' && STATE =='CA'",
          "actions": [
            {
              "actionId": "fcf79ae8-9a1f-4462-9262-eb2e7161350c",
              "dataValidation": {
                "conditions": {
                  "NAME": "NAME != ''",
                  "FILING_COST": "FILING_COST > 10.0"
                }
              }
            }
          ],
          "enabled": true
        }
      ],
      "name": "RULE_SET_NAME"
    }
  ]
}

Get a rule set

To get a rule set by its rule set name, do the following:

REST

Request:

# Get a rule-set using rule-set ID.
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets/RULE_SET

Response:

{
  "description": "W9: Basic validation check rules.",
  "rules": [
    {
      "triggerType": "ON_CREATE",
      "condition": "documentType == 'W9' && STATE =='CA'",
      "actions": [
        {
          "actionId": "7559346b-ec9f-4143-ab1c-1912f5588807",
          "dataValidation": {
            "conditions": {
              "NAME": "NAME != ''",
              "FILING_COST": "FILING_COST > 10.0"
            }
          }
        }
      ],
      "enabled": true
    }
  ],
  "name": "RULE_SET_NAME"
}

Delete a rule set

To delete a rule set by its rule set name, do the following:

REST

Request:

# Delete a rule-set using rule-set ID.
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets/RULE_SET
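
The list, get, and delete calls differ only in HTTP method and in whether a rule set ID is appended to the collection URL. The following sketch builds the three requests with the Python standard library without sending them. The helper name rule_set_request is hypothetical, the PROJECT_NUMBER and RULE_SET placeholders are kept as-is, and a real call would also need the Authorization header shown in the curl commands.

```python
import urllib.request

BASE = "https://contentwarehouse.googleapis.com/v1"


def rule_set_request(method, project_number, location, rule_set_id=None):
    """Build (but do not send) a request against the ruleSets collection."""
    url = f"{BASE}/projects/{project_number}/locations/{location}/ruleSets"
    if rule_set_id is not None:
        # Get and delete address one rule set; list uses the bare collection.
        url = f"{url}/{rule_set_id}"
    return urllib.request.Request(url, method=method)


list_req = rule_set_request("GET", "PROJECT_NUMBER", "us")
get_req = rule_set_request("GET", "PROJECT_NUMBER", "us", "RULE_SET")
delete_req = rule_set_request("DELETE", "PROJECT_NUMBER", "us", "RULE_SET")

for req in (list_req, get_req, delete_req):
    print(req.get_method(), req.full_url)
```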

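Rule actions

This section describes rule expressions and gives an example of each rule action.

Sample conditions

A condition is an expression specified in the Common Expression Language.

Examples:

  • String field expressions
    • STATE == 'CA'. Checks whether the value of the STATE field equals CA.
    • NAME != ''. Checks whether the value of the NAME field is not empty.
  • Numeric field expressions
    • FILING_COST > 10.0. Checks whether the value of the FILING_COST field (defined as a float) is greater than 10.0.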

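How to check whether a document belongs to a specific schema

To refer to a specific schema type, use the special field name documentType (a reserved word). It is evaluated against the DisplayName field of the DocumentSchema.

Example:

  • documentType == 'W9'

The preceding condition checks whether the document's schema (using the keyword documentType) has the display name W9.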

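How to reference old (existing) and new document property values

To support conditions that involve both existing properties and newly supplied properties, use the following two prefixes with the dot operator to access a specific version of a property:

  • OLD_ to reference the existing document properties.
  • NEW_ to reference the new document properties in the request.

Example:

  • OLD_.state == 'TX' && NEW_.state == 'CA'. Checks whether the existing value of the state property is TX and the newly supplied value is CA.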
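
To make the OLD_ and NEW_ prefixes concrete, here is a plain Python analogue of that condition. It is not CEL and not how the service evaluates rules; the two dictionaries and the function name are made up for illustration, and treating a missing property as a non-match is an assumption of this sketch.

```python
def state_moved_tx_to_ca(old_props, new_props):
    """Python analogue of: OLD_.state == 'TX' && NEW_.state == 'CA'."""
    # Assumption: a missing "state" property simply fails the comparison.
    return old_props.get("state") == "TX" and new_props.get("state") == "CA"


existing = {"state": "TX", "name": "Jane"}  # properties already stored
incoming = {"state": "CA", "name": "Jane"}  # properties in the update request

print(state_moved_tx_to_ca(existing, incoming))  # True
print(state_moved_tx_to_ca(incoming, incoming))  # False
```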

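Date field handling

For a DriverLicense document, if EXPIRATION_DATE is before a given date:

  • Update (or add, if not present) EXPIRATION_STATUS (an enum field) so that its value equals EXPIRING_BEFORE_CLOSING_DATE.

To supply a date value, use the timestamp function, as shown in the following example.

REST

Request: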

# Check if document expires before a date and update the status field
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules":[
    {
      "trigger_type": "ON_CREATE",
      "description": "Expiration date check rule",
      "condition": "documentType=='DriverLicense' && EXPIRATION_DATE < timestamp('2021-08-01T00:00:00Z')",
      "actions": {
        "data_update": {
          "entries": {
            "EXPIRATION_STATUS": "EXPIRING_BEFORE_CLOSING_DATE"
          }
        }
      }
    }
  ]
}
EOF
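
The string passed to timestamp() in that condition is a UTC date-time in the form 2021-08-01T00:00:00Z. A minimal sketch for producing such a condition from a Python datetime follows. The function name is hypothetical, and the sketch assumes the same EXPIRATION_DATE field name as the example.

```python
from datetime import datetime, timezone


def expires_before_condition(schema_name, cutoff):
    """Build a CEL condition that compares EXPIRATION_DATE with a cutoff."""
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    # Normalize to UTC and format like 2021-08-01T00:00:00Z.
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"documentType == '{schema_name}' && "
        f"EXPIRATION_DATE < timestamp('{stamp}')"
    )


condition = expires_before_condition(
    "DriverLicense", datetime(2021, 8, 1, tzinfo=timezone.utc)
)
print(condition)
```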

Data validation rule

Validate W9 documents whose STATE (text field) is CA:

  • Check that NAME (text field) is not empty.
  • Check that FILING_COST (float field) is greater than 10.0.

REST

Request:

# Rules for data validation.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == 'W9' && STATE =='CA'",
      "actions": {
        "data_validation": {
          "conditions": {
            "NAME": "NAME != ''",
            "FILING_COST": "FILING_COST > 10.0"
          }
        }
      },
      "enabled": true
    }
  ],
  "description": "W9: Basic validation check rules."
}
EOF

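Data update rule

For a W9 document, if the BUSINESS_NAME field is Google:

  • Update (or add, if not present) the Address field so that it equals 1600 Amphitheatre Pkwy.
  • Update (or add, if not present) the EIN field so that it equals 776666666.

REST

Request: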

# Rule for data update.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules":[
    {
      "description": "W9: Rule to update address data and EIN.",
      "trigger_type": "ON_CREATE",
      "condition": "documentType=='W9' && BUSINESS_NAME == 'Google'",
      "actions": {
        "data_update": {
          "entries": {
            "Address": "1600 Amphitheatre Pkwy",
            "EIN": "776666666"
          }
        }
      }
    }
  ]
}
EOF
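
The "update, or add if not present" behavior described for this rule resembles a dictionary merge. The sketch below mimics it locally with plain Python dictionaries. It only illustrates the described semantics and is not the service's implementation; the starting EIN value is made up.

```python
def apply_data_update(properties, entries):
    """Return a copy of properties with every entry updated or added."""
    updated = dict(properties)  # leave the caller's dict untouched
    updated.update(entries)     # overwrite existing keys, add missing ones
    return updated


document = {"BUSINESS_NAME": "Google", "EIN": "000000000"}  # made-up values
entries = {"Address": "1600 Amphitheatre Pkwy", "EIN": "776666666"}

result = apply_data_update(document, entries)
print(result)
```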

Document deletion rule

When a W9 document is updated, delete the document if the BUSINESS_NAME field changes to Google.

REST

Request:

# Rule for deleting the document
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "description": "W9: Rule to delete the document during update.",
      "trigger_type": "ON_UPDATE",
      "condition": "documentType == 'W9' && BUSINESS_NAME == 'Google'",
      "actions": {
        "delete_document_action": {
          "enable_hard_delete": true
        }
      }
    }
  ]
}
EOF

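Access control rule

When a W9 document is created, if the BUSINESS_NAME field is Google, update the policy bindings that control access to the document.

Add new bindings

When a document meets the rule condition:

  • Add the Editor role for user:a@example.com and group:xxx@example.com.
  • Add the Viewer role for user:b@example.com and group:yyy@example.com.

REST

Request: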

# Rule for adding new policy binding while creating the document.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "description": "W9: Rule to add new policy binding.",
      "trigger_type": "ON_CREATE",
      "condition": "documentType == 'aca13aa9-6d0d-4b6b-a1eb-315dcb876bd1' && BUSINESS_NAME == 'Google'",
      "actions": {
        "access_control": {
          "operation_type": "ADD_POLICY_BINDING",
          "policy": {
            "bindings": [
              {
                "role": "roles/contentwarehouse.documentEditor",
                "members": ["user:a@example.com", "group:xxx@example.com"]
              },
              {
                "role": "roles/contentwarehouse.documentViewer",
                "members": ["user:b@example.com", "group:yyy@example.com"]
              }
            ]
          }
        }
      }
    }
  ]
}
EOF

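Replace existing bindings

When a document meets the rule condition, replace the existing bindings so that they contain only the Editor role for user:a@example.com and group:xxx@example.com.

REST

Request: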

# Rule for replacing existing policy bindings with newly given bindings.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "description": "W9: Rule to replace policy binding.",
      "trigger_type": "ON_CREATE",
      "condition": "documentType == 'a9e37d07-9cfa-4b4d-b372-53162e3b8bd9' && BUSINESS_NAME == 'Google'",
      "actions": {
        "access_control": {
          "operation_type": "REPLACE_POLICY_BINDING",
          "policy": {
            "bindings": [
              {
                "role": "roles/contentwarehouse.documentEditor",
                "members": ["user:a@example.com", "group:xxx@example.com"]
              }
            ]
          }
        }
      }
    }
  ]
}
EOF
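
Both access control examples share one shape: an operation type plus a list of role-to-members bindings. The sketch below generates that action from a mapping. The helper name is hypothetical, and it accepts only the two operation types that appear on this page; the API may define others.

```python
import json


def access_control_action(operation_type, role_members):
    """Build an access_control action from a {role: [members]} mapping."""
    shown_on_this_page = ("ADD_POLICY_BINDING", "REPLACE_POLICY_BINDING")
    if operation_type not in shown_on_this_page:
        raise ValueError(f"unexpected operation type: {operation_type}")
    bindings = [
        {"role": role, "members": list(members)}
        for role, members in role_members.items()
    ]
    return {
        "access_control": {
            "operation_type": operation_type,
            "policy": {"bindings": bindings},
        }
    }


action = access_control_action(
    "ADD_POLICY_BINDING",
    {
        "roles/contentwarehouse.documentEditor": [
            "user:a@example.com",
            "group:xxx@example.com",
        ],
        "roles/contentwarehouse.documentViewer": [
            "user:b@example.com",
            "group:yyy@example.com",
        ],
    },
)
print(json.dumps(action, indent=2))
```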

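Add to folder rule

When a document is created or updated, it can be added under predefined static folders or under folders that match specific search criteria.

Configure static folders

When a new DriverLicense is created, add it under an already-created folder.

REST

Request: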

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == 'DriverLicense'",
      "actions": {
        "add_to_folder": {
          "folders": ["projects/821411934445/locations/us/documents/445en119hqp70"]
        }
      }
    }
  ]
}
EOF
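
The folder in this rule is given as a full document resource name. The sketch below checks a name against the pattern inferred from that single example (projects/…/locations/…/documents/…) before building the action. Both the pattern and the helper name are assumptions of this sketch rather than a documented validation rule.

```python
import re

# Pattern inferred from the example folder name on this page (an assumption).
FOLDER_NAME = re.compile(r"projects/[^/]+/locations/[^/]+/documents/[^/]+")


def add_to_folder_action(folder_names):
    """Build an add_to_folder action after a basic shape check of each name."""
    for name in folder_names:
        if not FOLDER_NAME.fullmatch(name):
            raise ValueError(f"not a document resource name: {name}")
    return {"add_to_folder": {"folders": list(folder_names)}}


folder = "projects/821411934445/locations/us/documents/445en119hqp70"
action = add_to_folder_action([folder])
print(action)
```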

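Publish to Pub/Sub

When a document is created or updated, or when a link is created or deleted, you can push a notification message to a Pub/Sub channel.

Steps to use

  • Create a Pub/Sub topic in the customer project.
  • Create a rule that triggers the publish-to-Pub/Sub action, using the following request. (See the following example.)
  • Call the Document AI Warehouse API.
  • Verify that the message was published to the Pub/Sub channel.

Rule example

When a document is added under a folder (by calling the CreateLink API), the following rule can be used to send a notification message to a Pub/Sub topic.

REST

Request: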

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d @- <<'EOF'
{
  "rules": [
    {
      "trigger_type": "ON_CREATE_LINK",
      "condition": "documentType == 'DriverLicenseFolder'",
      "actions": {
        "publish_to_pub_sub": {
          "topic_id": "<topic_name>",
          "messages": "Added document under a folder."
        }
      }
    }
  ]
}
EOF

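Rule details

  • This action supports the following trigger types:

    • ON_CREATE: When a new document is created.
    • ON_UPDATE: When a document is updated.
    • ON_CREATE_LINK: When a new link is created.
    • ON_DELETE_LINK: When a link is deleted.
  • For the create-document and update-document triggers, the condition can contain properties of the document being created or updated.

  • For the create-link and delete-link triggers, the condition can contain only properties of the folder document that documents are added to or removed from.

  • The messages field can be used to send a list of messages to the Pub/Sub channel. Note that, in addition to these messages, the following fields are published by default:

    • Schema name, document name, trigger type, rule set name, rule ID, action ID
    • For the create-link and delete-link triggers, the notification contains information about the link being added or deleted.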
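
One way to verify that a notification arrived is to decode what a Pub/Sub push subscription delivers. The sketch below unpacks the standard push envelope, in which the message data is base64-encoded. It assumes a push subscription is attached to the topic. The sample envelope is fabricated for illustration, and this page does not show the exact layout of the notification that Document Warehouse publishes.

```python
import base64
import json


def decode_push_envelope(raw_body):
    """Return (text, attributes) from a Pub/Sub push request body."""
    envelope = json.loads(raw_body)
    message = envelope["message"]
    # Pub/Sub push delivery carries the payload as base64 in "data".
    text = base64.b64decode(message.get("data", "")).decode("utf-8")
    return text, message.get("attributes", {})


# Fabricated envelope, only to exercise the decoder.
payload = base64.b64encode(b"Added document under a folder.").decode("ascii")
raw = json.dumps(
    {
        "message": {"data": payload, "messageId": "1"},
        "subscription": "projects/p/subscriptions/s",
    }
)

text, attributes = decode_push_envelope(raw)
print(text)
```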