Resource: Document
Document captures all raw metadata information of items to be recommended or searched.
| JSON representation |
|---|
{ "name": string, "id": string, "schemaId": string, "content": { object ( |
| Fields | |
|---|---|
name |
Immutable. The full resource name of the document. Format: This field must be a UTF-8 encoded string with a length limit of 1024 characters. |
id |
Immutable. The identifier of the document. Id should conform to RFC-1034 standard with a length limit of 128 characters. |
schemaId |
The identifier of the schema located in the same data store. |
content |
The unstructured data linked to this document. Content can only be set and must be set if this document is under a |
parentDocumentId |
The identifier of the parent document. Currently, a document hierarchy of at most two levels is supported. Id should conform to RFC-1034 standard with a length limit of 63 characters. |
derivedStructData |
Output only. It contains derived data that is not in the original input document. |
aclInfo |
Access control information for the document. |
indexTime |
Output only. The time when the document was last indexed. If this field is populated, it means the document has been indexed. While documents typically become searchable within seconds of indexing, it can sometimes take up to a few hours. If this field is not populated, it means the document has never been indexed. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
indexStatus |
Output only. The index status of the document.
|
Union field data. Data representation. Either struct_data or json_data must be provided; otherwise an INVALID_ARGUMENT error is thrown. data can be only one of the following: |
|
structData |
The structured JSON data for the document. It should conform to the registered |
jsonData |
The JSON string representation of the document. It should conform to the registered |
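The constraints stated in the field table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `check_document` is a hypothetical helper, it takes the Document as a plain Python dict, and it assumes the length limits count characters. It does not verify RFC-1034 conformance or schema conformance.

```python
import json

MAX_ID_LEN = 128         # limit stated for `id`
MAX_PARENT_ID_LEN = 63   # limit stated for `parentDocumentId`


def check_document(doc: dict) -> list[str]:
    """Return a list of problems found in a Document payload (empty if none).

    Hypothetical client-side helper; the service performs its own validation.
    """
    problems = []

    if len(doc.get("id", "")) > MAX_ID_LEN:
        problems.append("id exceeds 128 characters")

    if len(doc.get("parentDocumentId", "")) > MAX_PARENT_ID_LEN:
        problems.append("parentDocumentId exceeds 63 characters")

    # Union field `data`: exactly one of structData / jsonData.
    has_struct = "structData" in doc
    has_json = "jsonData" in doc
    if has_struct == has_json:
        problems.append("set exactly one of structData or jsonData")

    # jsonData is a JSON string, so it should at least parse.
    if has_json:
        try:
            json.loads(doc["jsonData"])
        except (TypeError, ValueError):
            problems.append("jsonData is not a valid JSON string")

    return problems


# Illustrative payloads; the ids and field contents are made up.
print(check_document({"id": "doc-1", "structData": {"title": "Example"}}))
print(check_document({"id": "doc-2"}))
```

An empty list means none of these particular checks failed; it does not guarantee that the service accepts the document.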
Content
Unstructured data linked to this document.
| JSON representation |
|---|
{ "mimeType": string, // Union field |
| Fields | |
|---|---|
mimeType |
The MIME type of the content. Supported types:
The following types are supported only if layout parser is enabled in the data store:
See https://www.iana.org/assignments/media-types/media-types.xhtml. |
Union field content. The content of the unstructured document. content can be only one of the following: |
|
rawBytes |
The content represented as a stream of bytes. The maximum length is 1,000,000 bytes (1 MB / ~0.95 MiB). In JSON, this value is a base64-encoded string. |
uri |
The URI of the content. Only Cloud Storage URIs (e.g. |
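As an illustration of the rawBytes field, the sketch below builds a Content object from in-memory bytes. `make_raw_content` is a hypothetical helper, and it assumes the 1,000,000-byte limit applies to the raw bytes before base64 encoding. The MIME type in the usage line is only an example value; check it against the supported types for your data store.

```python
import base64

MAX_RAW_BYTES = 1_000_000  # limit stated for rawBytes


def make_raw_content(data: bytes, mime_type: str) -> dict:
    """Build a Content dict that carries `data` inline as base64 text.

    Assumption: the size limit is measured on the raw bytes, not on the
    (roughly 4/3 larger) base64 string.
    """
    if len(data) > MAX_RAW_BYTES:
        raise ValueError(
            f"content is {len(data)} bytes; rawBytes allows at most {MAX_RAW_BYTES}"
        )
    return {
        "mimeType": mime_type,
        "rawBytes": base64.b64encode(data).decode("ascii"),
    }


content = make_raw_content(b"abc", "application/pdf")
print(content["rawBytes"])
```

Content larger than the limit cannot be sent inline and would have to be referenced through the uri field instead.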
AclInfo
ACL Information of the Document.
| JSON representation |
|---|
{
"readers": [
{
object ( |
| Fields | |
|---|---|
readers[] |
Readers of the document. |
AccessRestriction
AclRestriction to model complex inheritance restrictions.
Example: Modeling a "Both Permit" inheritance, where a user needs access to the parent document in order to access a child document.
Document Hierarchy - Space_S --> Page_P.
Readers:
Space_S: group_1, user_1
Page_P: group_2, group_3, user_2
Space_S ACL Restriction - { "aclInfo": { "readers": [ { "principals": [ { "groupId": "group_1" }, { "userId": "user_1" } ] } ] } }
Page_P ACL Restriction - { "aclInfo": { "readers": [ { "principals": [ { "groupId": "group_2" }, { "groupId": "group_3" }, { "userId": "user_2" } ] }, { "principals": [ { "groupId": "group_1" }, { "userId": "user_1" } ] } ] } }
| JSON representation |
|---|
{
"principals": [
{
object ( |
| Fields | |
|---|---|
principals[] |
List of principals. |
idpWide |
All users within the Identity Provider. |
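One way to read the "Both Permit" example is that a user must satisfy every entry in readers[], and satisfies an entry by matching any one of its principals. The sketch below evaluates that reading locally. It is an interpretation of the example, not the service's documented algorithm. The function names are hypothetical, externalEntityId principals are ignored, an empty readers list is treated as granting no access, and idpWide is assumed to match any caller from the same identity provider.

```python
def principal_matches(principal: dict, user_id: str, group_ids: set[str]) -> bool:
    """True if the principal names this user or one of the user's groups."""
    if "userId" in principal:
        return principal["userId"] == user_id
    if "groupId" in principal:
        return principal["groupId"] in group_ids
    return False  # externalEntityId is not handled in this sketch


def restriction_allows(restriction: dict, user_id: str, group_ids: set[str]) -> bool:
    """A restriction is satisfied by idpWide or by any matching principal."""
    if restriction.get("idpWide"):
        return True  # assumption: the caller belongs to the identity provider
    return any(
        principal_matches(p, user_id, group_ids)
        for p in restriction.get("principals", [])
    )


def can_read(acl_info: dict, user_id: str, group_ids: set[str]) -> bool:
    """Assumed semantics: every restriction in readers[] must be satisfied."""
    readers = acl_info.get("readers", [])
    return bool(readers) and all(
        restriction_allows(r, user_id, group_ids) for r in readers
    )


# The Page_P restriction from the example above.
PAGE_P_ACL = {
    "readers": [
        {"principals": [{"groupId": "group_2"}, {"groupId": "group_3"}, {"userId": "user_2"}]},
        {"principals": [{"groupId": "group_1"}, {"userId": "user_1"}]},
    ]
}

print(can_read(PAGE_P_ACL, "user_2", {"group_1"}))  # listed on Page_P, in a Space_S group
print(can_read(PAGE_P_ACL, "user_2", set()))        # listed on Page_P only
```

Under this reading, user_2 can read Page_P only when they also belong to group_1, which is what grants access to the parent Space_S.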
Principal
Principal identifier of a user or a group.
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field principal. Principal can be a user or a group. principal can be only one of the following: |
|
userId |
User identifier. For a Google Workspace user account, userId should be the Google Workspace user email. For a non-Google identity provider user account, userId is the mapped user identifier configured during the workforce pool configuration. |
groupId |
Group identifier. For a Google Workspace user account, groupId should be the Google Workspace group email. For a non-Google identity provider user account, groupId is the mapped group identifier configured during the workforce pool configuration. |
externalEntityId |
For third-party application identities that are not present in the customer identity provider. |
IndexStatus
Index status of the document.
| JSON representation |
|---|
{
"indexTime": string,
"errorSamples": [
{
object ( |
| Fields | |
|---|---|
indexTime |
The time when the document was indexed. If this field is populated, it means the document has been indexed. While documents typically become searchable within seconds of indexing, it can sometimes take up to a few hours. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
errorSamples[] |
A sample of errors encountered while indexing the document. If this field is populated, the document is not indexed due to errors. |
pendingMessage |
Immutable. A message indicating that indexing of the document is in progress. If this field is populated, indexing of the document is pending. |
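The three IndexStatus fields each imply a different state, so a client can fold them into a single label. The sketch below is one hedged way to do that. `index_state` and its return strings are hypothetical, and the order in which the fields are checked is an assumption for the case where more than one is populated.

```python
def index_state(index_status: dict) -> str:
    """Summarize an IndexStatus dict as a single label.

    Assumed precedence: errors, then pending, then indexed.
    """
    if index_status.get("errorSamples"):
        return "failed"       # not indexed due to errors
    if index_status.get("pendingMessage"):
        return "pending"      # indexing is in progress
    if index_status.get("indexTime"):
        return "indexed"      # may still take time to become searchable
    return "not_indexed"      # no field populated


# Illustrative values only.
print(index_state({"indexTime": "2014-10-02T15:01:23Z"}))
print(index_state({"pendingMessage": "indexing in progress"}))
print(index_state({}))
```

Note that "indexed" does not mean "searchable yet": as stated above, documents usually become searchable within seconds of indexing, but it can take up to a few hours.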
Methods |
|
|---|---|
|
Creates a Document. |
|
Deletes a Document. |
|
Gets a Document. |
|
Gets the parsed layout information for a Document. |
|
Bulk import of multiple Documents. |
|
Gets a list of Documents. |
|
Updates a Document. |
|
Permanently deletes all selected Documents in a branch. |