Long-running operation (LRO) endpoint that batch processes multiple documents. The output is written to Cloud Storage as JSON in the Document format.
This method waits (the workflow execution is paused) until the operation
completes, fails, or times out. The default timeout is 1800 seconds (30
minutes). For long-running operations, it can be raised to a maximum of
31536000 seconds (one year) using the connector_params field. See the
Connectors reference.
The connector uses polling to monitor the long-running operation, which might generate additional billable steps. For more information about retries and long-running operations, refer to Understand connectors.
The polling policy for the long-running operation can be configured. To set the
connector-specific parameters (connector_params), refer to
Invoke a connector call.
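As a minimal sketch of the timeout and polling settings described above, the step below passes connector_params alongside the other arguments. The variables processorVersionName, inputPrefix, and outputUri are hypothetical and assumed to be assigned in an earlier step. The numeric values are illustrative only, not recommendations.

```yaml
- batchProcess:
    call: googleapis.documentai.v1beta3.projects.locations.processors.processorVersions.batchProcess
    args:
        name: ${processorVersionName}  # hypothetical variable set in an earlier step
        body:
            inputDocuments:
                gcsPrefix:
                    gcsUriPrefix: ${inputPrefix}  # hypothetical variable
            documentOutputConfig:
                gcsOutputConfig:
                    gcsUri: ${outputUri}  # hypothetical variable
        connector_params:
            timeout: 7200  # seconds; illustrative value above the 1800-second default
            polling_policy:
                initial_delay: 5.0  # seconds to wait before the first poll (illustrative)
                multiplier: 1.5     # factor applied to the delay after each poll (illustrative)
                max_delay: 120      # upper bound on the delay between polls, in seconds (illustrative)
    result: batchProcessResult
```

A longer initial delay and a larger multiplier should mean fewer polls, which may reduce the number of additional billable steps, at the cost of noticing completion later.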
Arguments
| Parameters | |
|---|---|
| name | Required. The resource name of Processor or ProcessorVersion. Format: |
| location | Location of the HTTP endpoint: |
| body | Required. |
Raised exceptions
| Exceptions | |
|---|---|
| ConnectionError | In case of a network problem (such as DNS failure or refused connection). | 
| HttpError | If the response status is >= 400 (excluding 429 and 503). | 
| TimeoutError | If a long-running operation takes longer to finish than the specified timeout limit. | 
| TypeError | If an operation or function receives an argument of the wrong type. | 
| ValueError | If an operation or function receives an argument of the right type but an inappropriate value. For example, a negative timeout. | 
| OperationError | If the long-running operation finished unsuccessfully. | 
| ResponseTypeError | If the long-running operation returned a response of the wrong type. | 
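The exceptions listed above can be caught with a try/except block around the connector call. The sketch below assumes that the raised error carries a tags list containing the exception names from the table. The variables processorVersionName, inputPrefix, and outputUri are hypothetical, and the returned messages are placeholders.

```yaml
- batchProcessWithHandling:
    try:
        call: googleapis.documentai.v1beta3.projects.locations.processors.processorVersions.batchProcess
        args:
            name: ${processorVersionName}  # hypothetical variable set in an earlier step
            body:
                inputDocuments:
                    gcsPrefix:
                        gcsUriPrefix: ${inputPrefix}  # hypothetical variable
                documentOutputConfig:
                    gcsOutputConfig:
                        gcsUri: ${outputUri}  # hypothetical variable
        result: batchProcessResult
    except:
        as: e
        steps:
            - classifyError:
                switch:
                    # Assumption: connector errors expose their exception name in e.tags.
                    - condition: ${"TimeoutError" in e.tags}
                      return: "Batch processing timed out"
                    - condition: ${"OperationError" in e.tags}
                      return: "Batch processing finished unsuccessfully"
            - rethrowOtherErrors:
                # Any other error is re-raised unchanged so that it is not silently swallowed.
                raise: ${e}
```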
Response
If successful, the response contains an instance of GoogleLongrunningOperation.
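A minimal sketch of consuming the response, assuming the call stored its result in a variable named batchProcessResult and that the returned operation exposes the standard name field of a long-running operation:

```yaml
- logOperation:
    call: sys.log
    args:
        # Assumption: batchProcessResult holds the operation returned by the connector call.
        text: ${"Batch operation finished: " + batchProcessResult.name}
        severity: INFO
- returnOperation:
    return: ${batchProcessResult}
```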
Subworkflow snippet
Some fields might be optional or required. To identify required fields, refer to the API documentation.
YAML
    - batchProcess:
        call: googleapis.documentai.v1beta3.projects.locations.processors.processorVersions.batchProcess
        args:
            name: ...
            body:
                documentOutputConfig:
                    gcsOutputConfig:
                        fieldMask: ...
                        gcsUri: ...
                        shardingConfig:
                            pagesOverlap: ...
                            pagesPerShard: ...
                inputConfigs: ...
                inputDocuments:
                    gcsDocuments:
                        documents: ...
                    gcsPrefix:
                        gcsUriPrefix: ...
                outputConfig:
                    gcsDestination: ...
                processOptions:
                    ocrConfig:
                        enableNativePdfParsing: ...
                skipHumanReview: ...
        result: batchProcessResult
JSON
    [
      {
        "batchProcess": {
          "call": "googleapis.documentai.v1beta3.projects.locations.processors.processorVersions.batchProcess",
          "args": {
            "name": "...",
            "body": {
              "documentOutputConfig": {
                "gcsOutputConfig": {
                  "fieldMask": "...",
                  "gcsUri": "...",
                  "shardingConfig": {
                    "pagesOverlap": "...",
                    "pagesPerShard": "..."
                  }
                }
              },
              "inputConfigs": "...",
              "inputDocuments": {
                "gcsDocuments": {
                  "documents": "..."
                },
                "gcsPrefix": {
                  "gcsUriPrefix": "..."
                }
              },
              "outputConfig": {
                "gcsDestination": "..."
              },
              "processOptions": {
                "ocrConfig": {
                  "enableNativePdfParsing": "..."
                }
              },
              "skipHumanReview": "..."
            }
          },
          "result": "batchProcessResult"
        }
      }
    ]