Index
- ImageAnnotator (interface)
- AnnotateFileRequest (message)
- AnnotateFileResponse (message)
- AnnotateImageRequest (message)
- AnnotateImageResponse (message)
- BatchAnnotateFilesRequest (message)
- BatchAnnotateFilesResponse (message)
- BatchAnnotateImagesRequest (message)
- BatchAnnotateImagesResponse (message)
- Block (message)
- Block.BlockType (enum)
- BoundingPoly (message)
- EntityAnnotation (message)
- Feature (message)
- Feature.Type (enum)
- Image (message)
- ImageAnnotationContext (message)
- ImageContext (message)
- InputConfig (message)
- NormalizedVertex (message)
- Page (message)
- Paragraph (message)
- Property (message)
- Symbol (message)
- TextAnnotation (message)
- TextAnnotation.DetectedBreak (message)
- TextAnnotation.DetectedBreak.BreakType (enum)
- TextAnnotation.DetectedLanguage (message)
- TextAnnotation.TextProperty (message)
- TextDetectionParams (message)
- Vertex (message)
- Word (message)
ImageAnnotator
Service that performs Google Cloud Vision API detection tasks over client images, such as face, landmark, logo, label, and text detection. The ImageAnnotator service returns detected entities from the images.
| Methods | |
|---|---|
| BatchAnnotateFiles | Performs image detection and annotation for a batch of files. Currently only "application/pdf", "image/tiff" and "image/gif" are supported. The service extracts at most 5 frames (GIF) or pages (PDF or TIFF) from each file provided (customers can specify which 5 in AnnotateFileRequest.pages) and performs detection and annotation for each image extracted. |
| BatchAnnotateImages | Runs image detection and annotation for a batch of images. |
AnnotateFileRequest
A request to annotate one single file, e.g. a PDF, TIFF or GIF file.
| Fields | |
|---|---|
| input_config | Required. Information about the input file. | 
| features[] | Required. Requested features. | 
| image_context | Additional context that may accompany the image(s) in the file. | 
| pages[] | Pages of the file to annotate. Page numbers start from 1, so the first page of the file is page 1. At most 5 pages are supported per request. Pages can be negative: page 1 means the first page, page 2 the second page, page -1 the last page, and page -2 the second-to-last page. If the file is a GIF instead of a PDF or TIFF, page refers to GIF frames. If this field is empty, the service annotates the first 5 pages of the file by default. |
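
The page-numbering rules above can be sketched as a small client-side helper. This is only an illustration: `resolve_pages` and `MAX_PAGES_PER_REQUEST` are hypothetical names, the service performs its own resolution, and treating page 0 or an out-of-range page as an error is an assumption not stated in this reference.

```python
from typing import List

MAX_PAGES_PER_REQUEST = 5  # limit stated in the pages[] description


def resolve_pages(pages: List[int], total_pages: int) -> List[int]:
    """Map 1-based (or negative) page numbers to zero-based indices.

    Hypothetical helper for illustration; not part of the API.
    """
    if not pages:
        # Empty field: the service defaults to the first 5 pages.
        pages = list(range(1, min(total_pages, MAX_PAGES_PER_REQUEST) + 1))
    if len(pages) > MAX_PAGES_PER_REQUEST:
        raise ValueError("at most 5 pages are supported per request")
    indices = []
    for page in pages:
        # Assumption: page 0 and pages beyond the file length are invalid.
        if page == 0 or abs(page) > total_pages:
            raise ValueError(f"page {page} is out of range for {total_pages} pages")
        # Positive pages count from the start, negative pages from the end.
        indices.append(page - 1 if page > 0 else total_pages + page)
    return indices
```

For a 10-page file, `resolve_pages([1, 2, -1, -2], 10)` selects the first two and the last two pages.
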
AnnotateFileResponse
Response to a single file annotation request. A file may contain one or more images, which individually have their own responses.
| Fields | |
|---|---|
| input_config | Information about the file for which this response is generated. | 
| responses[] | Individual responses to images found within the file. This field will be empty if the  | 
| total_pages | The total number of pages in the file. |
| error | If set, represents the error message for the failed request. The  | 
AnnotateImageRequest
Request for performing Google Cloud Vision API tasks over a user-provided image, with user-requested features, and with context information.
| Fields | |
|---|---|
| image | The image to be processed. | 
| features[] | Requested features. | 
| image_context | Additional context that may accompany the image. | 
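
As a rough sketch of how these three fields fit together, the helper below assembles a request as a plain dictionary. It assumes the JSON form of the message, where field names are written in lowerCamelCase and byte fields are base64-encoded (the standard proto3 JSON mapping); `build_annotate_image_request` is a hypothetical name, not a library function.

```python
import base64
from typing import List, Optional


def build_annotate_image_request(
    image_bytes: bytes,
    document: bool = False,
    language_hints: Optional[List[str]] = None,
) -> dict:
    """Assemble an AnnotateImageRequest as a JSON-style dict (illustrative)."""
    # DOCUMENT_TEXT_DETECTION targets dense documents; TEXT_DETECTION targets
    # areas of text within a larger image (see Feature.Type).
    feature_type = "DOCUMENT_TEXT_DETECTION" if document else "TEXT_DETECTION"
    request = {
        # Assumption: bytes are base64-encoded in the JSON representation.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature_type}],
    }
    if language_hints:
        request["imageContext"] = {"languageHints": list(language_hints)}
    return request
```

Leaving `language_hints` unset keeps the image context out of the request, which matches the note that an empty value usually gives the best results.
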
AnnotateImageResponse
Response to an image annotation request.
| Fields | |
|---|---|
| text_annotations[] | If present, text (OCR) detection has completed successfully. | 
| full_text_annotation | If present, text (OCR) detection or document (OCR) text detection has completed successfully. This annotation provides the structural hierarchy for the OCR detected text. | 
| error | If set, represents the error message for the operation. Note that filled-in image annotations are guaranteed to be correct, even when  | 
| context | If present, contextual information is needed to understand where this image comes from. | 
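
A minimal sketch of reading such a response, assuming it has been decoded from JSON into a dictionary with lowerCamelCase keys. The helper name `summarize_response` is hypothetical, and reading a `message` key from `error` assumes the error is a standard status object, which this reference does not spell out.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the OCR text and any error from one AnnotateImageResponse."""
    full = response.get("fullTextAnnotation") or {}
    error = response.get("error") or {}
    return {
        # Full OCR text, if (document) text detection completed.
        "text": full.get("text", ""),
        # True when text_annotations[] is present and non-empty.
        "has_text_annotations": bool(response.get("textAnnotations")),
        # Assumption: the error carries a human-readable "message" field.
        "error": error.get("message"),
    }
```

The sketch reports the error alongside the annotations rather than raising, since annotations may still be filled in when an error is set.
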
BatchAnnotateFilesRequest
A list of requests to annotate files using the BatchAnnotateFiles API.
| Fields | |
|---|---|
| requests[] | Required. The list of file annotation requests. Currently only one AnnotateFileRequest is supported in a BatchAnnotateFilesRequest. | 
| parent | Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:   Example:  | 
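
The constraints above (a single AnnotateFileRequest, one of three MIME types) can be sketched as a request builder. This assumes the JSON form of the messages with lowerCamelCase field names and base64-encoded bytes; `build_batch_annotate_files_request` and `SUPPORTED_MIME_TYPES` are hypothetical names, and `parent` is left out because its format is not given here.

```python
import base64
from typing import List, Optional

# MIME types listed as supported in this reference.
SUPPORTED_MIME_TYPES = {"application/pdf", "image/tiff", "image/gif"}


def build_batch_annotate_files_request(
    file_bytes: bytes, mime_type: str, pages: Optional[List[int]] = None
) -> dict:
    """Assemble a BatchAnnotateFilesRequest as a JSON-style dict (illustrative)."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported mime type: {mime_type}")
    request = {
        "inputConfig": {
            # Assumption: bytes are base64-encoded in the JSON representation.
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        },
        "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
    }
    if pages:
        request["pages"] = list(pages)
    # Only one AnnotateFileRequest is currently supported per batch.
    return {"requests": [request]}
```
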
BatchAnnotateFilesResponse
A list of file annotation responses.
| Fields | |
|---|---|
| responses[] | The list of file annotation responses, each response corresponding to each AnnotateFileRequest in BatchAnnotateFilesRequest. | 
BatchAnnotateImagesRequest
Multiple image annotation requests are batched into a single service call.
| Fields | |
|---|---|
| requests[] | Required. Individual image annotation requests for this batch. | 
| parent | Optional. Target project and location to make a call. Format:  If no parent is specified, a region will be chosen automatically. Supported location-ids:   Example:  | 
BatchAnnotateImagesResponse
Response to a batch image annotation request.
| Fields | |
|---|---|
| responses[] | Individual responses to image annotation requests within the batch. | 
Block
Logical element on the page.
| Fields | |
|---|---|
| property | Additional information detected for the block. | 
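| bounding_box | The bounding box for the block. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example, when the text is horizontal the vertices are laid out as 0----1 (top edge) over 3----2 (bottom edge); when it's rotated 180 degrees around the top-left corner the layout becomes 2----3 over 1----0, and the vertex order will still be (0, 1, 2, 3). | 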
| paragraphs[] | List of paragraphs in this block (if this block is of type text). | 
| block_type | Detected block type (text, image etc) for this block. | 
| confidence | Confidence of the OCR results on the block. Range [0, 1]. | 
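
Because the vertex order stays (0, 1, 2, 3) under rotation, the direction of the edge from vertex 0 to vertex 1 indicates how the text is rotated. The sketch below estimates that angle; `text_rotation_degrees` is a hypothetical helper, vertices are assumed to be dictionaries with optional `x` and `y` keys (a missing key is read as 0), and the angle is measured in image coordinates where y grows downward.

```python
import math
from typing import Dict, List


def text_rotation_degrees(vertices: List[Dict[str, float]]) -> float:
    """Angle of the edge from vertex 0 to vertex 1, in degrees.

    0 means the text reads left to right; 180 means it is upside down.
    """
    if len(vertices) < 2:
        raise ValueError("need at least two vertices")
    x0, y0 = vertices[0].get("x", 0), vertices[0].get("y", 0)
    x1, y1 = vertices[1].get("x", 0), vertices[1].get("y", 0)
    # atan2 handles every quadrant, including vertical edges.
    return math.degrees(math.atan2(y1 - y0, x1 - x0))
```
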
BlockType
Type of a block (text, image etc) as identified by OCR.
| Enums | |
|---|---|
| UNKNOWN | Unknown block type. | 
| TEXT | Regular text block. | 
| TABLE | Table block. | 
| PICTURE | Image block. | 
| RULER | Horizontal/vertical line box. | 
| BARCODE | Barcode block. | 
BoundingPoly
A bounding polygon for the detected image annotation.
| Fields | |
|---|---|
| vertices[] | The bounding polygon vertices. | 
| normalized_vertices[] | The bounding polygon normalized vertices. | 
EntityAnnotation
Set of detected entity features.
| Fields | |
|---|---|
| mid | Opaque entity ID. Some IDs may be available in Google Knowledge Graph Search API. | 
| locale | The language code for the locale in which the entity textual  | 
| description | Entity textual description, expressed in its  | 
| score | Overall score of the result. Range [0, 1]. | 
| confidence | Deprecated. Use  | 
| topicality | The relevancy of the ICA (Image Content Annotation) label to the image. For example, the relevancy of "tower" is likely higher to an image containing the detected "Eiffel Tower" than to an image containing a detected distant towering building, even though the confidence that there is a tower in each image may be the same. Range [0, 1]. | 
| bounding_poly | Image region to which this entity belongs. Not produced for  | 
| properties[] | Some entities may have optional user-supplied  | 
Feature
The type of Google Cloud Vision API detection to perform, and the maximum number of results to return for that type. Multiple Feature objects can be specified in the features list.
| Fields | |
|---|---|
| type | The feature type. | 
| model | Model to use for the feature. Supported values: "builtin/stable" (the default if unset) and "builtin/latest". | 
Type
Type of Google Cloud Vision API feature to be extracted.
| Enums | |
|---|---|
| TYPE_UNSPECIFIED | Unspecified feature type. | 
| TEXT_DETECTION | Run text detection / optical character recognition (OCR). Text detection is optimized for areas of text within a larger image; if the image is a document, use DOCUMENT_TEXT_DETECTION instead. | 
| DOCUMENT_TEXT_DETECTION | Run dense text document OCR. Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are present. | 
Image
Client image to perform Google Cloud Vision API tasks over.
| Fields | |
|---|---|
| content | Image content, represented as a stream of bytes. Note: As with all  Currently, this field only works for BatchAnnotateImages requests. It does not work for AsyncBatchAnnotateImages requests. | 
ImageAnnotationContext
If an image was produced from a file (e.g. a PDF), this message gives information about the source of that image.
| Fields | |
|---|---|
| uri | The URI of the file used to produce the image. | 
| page_number | If the file was a PDF or TIFF, this field gives the page number within the file used to produce the image. | 
ImageContext
Image context and/or feature-specific parameters.
| Fields | |
|---|---|
| language_hints[] | List of languages to use for TEXT_DETECTION. In most cases, an empty value yields the best results since it enables automatic language detection. For languages based on the Latin alphabet, setting  | 
| text_detection_params | Parameters for text detection and document text detection. | 
InputConfig
The desired input location and metadata.
| Fields | |
|---|---|
| content | File content, represented as a stream of bytes. Note: As with all  Currently, this field only works for BatchAnnotateFiles requests. It does not work for AsyncBatchAnnotateFiles requests. | 
| mime_type | The type of the file. Currently only "application/pdf", "image/tiff" and "image/gif" are supported. Wildcards are not supported. | 
NormalizedVertex
A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.
| Fields | |
|---|---|
| x | X coordinate. | 
| y | Y coordinate. | 
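
Since normalized coordinates are relative to the original image and range from 0 to 1, they can be scaled back to pixels once the image size is known. A minimal sketch, assuming vertices arrive as dictionaries with optional `x` and `y` keys (a missing key is read as 0); `to_pixel_vertices` is a hypothetical helper name.

```python
from typing import Dict, List


def to_pixel_vertices(
    normalized_vertices: List[Dict[str, float]], width: int, height: int
) -> List[Dict[str, int]]:
    """Scale normalized vertices (0 to 1) to pixel coordinates."""
    return [
        {
            "x": round(vertex.get("x", 0.0) * width),
            "y": round(vertex.get("y", 0.0) * height),
        }
        for vertex in normalized_vertices
    ]
```
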
Page
Detected page from OCR.
| Fields | |
|---|---|
| property | Additional information detected on the page. | 
| width | Page width. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. | 
| height | Page height. For PDFs the unit is points. For images (including TIFFs) the unit is pixels. | 
| blocks[] | List of blocks of text, images etc on this page. | 
| confidence | Confidence of the OCR results on the page. Range [0, 1]. | 
Paragraph
Structural unit of text representing a number of words in certain order.
| Fields | |
|---|---|
| property | Additional information detected for the paragraph. | 
| bounding_box | The bounding box for the paragraph. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example, when the text is horizontal the vertices are laid out as 0----1 (top edge) over 3----2 (bottom edge); when it's rotated 180 degrees around the top-left corner the layout becomes 2----3 over 1----0, and the vertex order will still be (0, 1, 2, 3). | 
| words[] | List of all words in this paragraph. | 
| confidence | Confidence of the OCR results for the paragraph. Range [0, 1]. | 
Property
A Property consists of a user-supplied name/value pair.
| Fields | |
|---|---|
| name | Name of the property. | 
| value | Value of the property. | 
| uint64_value | Value of numeric properties. | 
Symbol
A single symbol representation.
| Fields | |
|---|---|
| property | Additional information detected for the symbol. | 
| bounding_box | The bounding box for the symbol. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example, when the text is horizontal the vertices are laid out as 0----1 (top edge) over 3----2 (bottom edge); when it's rotated 180 degrees around the top-left corner the layout becomes 2----3 over 1----0, and the vertex order will still be (0, 1, 2, 3). | 
| text | The actual UTF-8 representation of the symbol. | 
| confidence | Confidence of the OCR results for the symbol. Range [0, 1]. | 
TextAnnotation
TextAnnotation contains a structured representation of OCR-extracted text. The hierarchy of an OCR-extracted text structure is like this: 
TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol
Each structural component may carry additional properties; refer to the TextAnnotation.TextProperty message definition that follows.
| Fields | |
|---|---|
| pages[] | List of pages detected by OCR. | 
| text | UTF-8 text detected on the pages. | 
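
The hierarchy described above can be walked with nested loops. The sketch below yields each word's text together with its confidence; it assumes the annotation has been decoded from JSON into dictionaries with lowerCamelCase keys, and `iter_words` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterator, Optional, Tuple


def iter_words(annotation: Dict[str, Any]) -> Iterator[Tuple[str, Optional[float]]]:
    """Yield (word text, confidence) by walking Page -> Block -> Paragraph -> Word."""
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is the concatenation of its symbols,
                    # which follow the natural reading order.
                    text = "".join(s.get("text", "") for s in word.get("symbols", []))
                    yield text, word.get("confidence")
```

Filtering on the yielded confidence is one way to drop low-quality words before further processing.
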
DetectedBreak
Detected start or end of a structural component.
| Fields | |
|---|---|
| type | Detected break type. | 
| is_prefix | True if break prepends the element. | 
BreakType
Enum denoting the type of break found (new line, space, etc.).
| Enums | |
|---|---|
| UNKNOWN | Unknown break label type. | 
| SPACE | Regular space. | 
| SURE_SPACE | Sure space (very wide). | 
| EOL_SURE_SPACE | Line-wrapping break. | 
| HYPHEN | End-line hyphen that is not present in text; does not co-occur with SPACE, LEADER_SPACE, or LINE_BREAK. | 
| LINE_BREAK | Line break that ends a paragraph. | 
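
One way to use these break types is to rebuild readable text from a sequence of symbols. The mapping from break type to whitespace below is an assumption made for illustration (in particular, rendering HYPHEN as a hyphen plus newline); `symbols_to_text` and `BREAK_TEXT` are hypothetical names, and symbols are assumed to be JSON-style dictionaries with lowerCamelCase keys.

```python
from typing import Any, Dict, List

# Assumed rendering of each break type; adjust to taste.
BREAK_TEXT = {
    "SPACE": " ",
    "SURE_SPACE": " ",
    "EOL_SURE_SPACE": "\n",
    "HYPHEN": "-\n",
    "LINE_BREAK": "\n",
}


def symbols_to_text(symbols: List[Dict[str, Any]]) -> str:
    """Join symbol texts, inserting whitespace for each detected break."""
    parts = []
    for symbol in symbols:
        detected = (symbol.get("property") or {}).get("detectedBreak") or {}
        separator = BREAK_TEXT.get(detected.get("type"), "")
        text = symbol.get("text", "")
        # is_prefix means the break comes before the symbol, not after it.
        parts.append(separator + text if detected.get("isPrefix") else text + separator)
    return "".join(parts)
```
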
DetectedLanguage
Detected language for a structural component.
| Fields | |
|---|---|
| language_code | The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see https://www.unicode.org/reports/tr35/#Unicode_locale_identifier. | 
| confidence | Confidence of detected language. Range [0, 1]. | 
TextProperty
Additional information detected on the structural component.
| Fields | |
|---|---|
| detected_languages[] | A list of detected languages together with confidence. | 
| detected_break | Detected start or end of a text segment. | 
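
Since detected_languages[] pairs each language with a confidence, a caller will often want just the most likely one. A minimal sketch, assuming a JSON-style dictionary with lowerCamelCase keys; `top_language` is a hypothetical helper, and treating a missing confidence as 0 is an assumption.

```python
from typing import Any, Dict, Optional


def top_language(text_property: Dict[str, Any]) -> Optional[str]:
    """Return the language code with the highest confidence, if any."""
    languages = text_property.get("detectedLanguages", [])
    if not languages:
        return None
    # Assumption: a missing confidence is treated as 0.
    best = max(languages, key=lambda lang: lang.get("confidence", 0.0))
    return best.get("languageCode")
```
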
TextDetectionParams
Parameters for text detections. This is used to control TEXT_DETECTION and DOCUMENT_TEXT_DETECTION features.
| Fields | |
|---|---|
| enable_text_detection_confidence_score | By default, the Cloud Vision API includes confidence scores only for DOCUMENT_TEXT_DETECTION results. Set this flag to true to include confidence scores for TEXT_DETECTION as well. | 
| advanced_ocr_options[] | A list of advanced OCR options to fine-tune OCR behavior. | 
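
To show where this flag sits in a request, the sketch below builds an ImageContext dictionary that carries it. It assumes the JSON form of the messages with lowerCamelCase field names; `build_image_context` is a hypothetical helper, and advanced_ocr_options is left out because its accepted values are not listed here.

```python
from typing import Any, Dict, List, Optional


def build_image_context(
    include_confidence: bool = False, language_hints: Optional[List[str]] = None
) -> Dict[str, Any]:
    """Assemble an ImageContext as a JSON-style dict (illustrative)."""
    context: Dict[str, Any] = {}
    if language_hints:
        context["languageHints"] = list(language_hints)
    if include_confidence:
        # Requests confidence scores for TEXT_DETECTION results as well.
        context["textDetectionParams"] = {"enableTextDetectionConfidenceScore": True}
    return context
```
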
Vertex
A vertex represents a 2D point in the image. NOTE: the vertex coordinates are in the same scale as the original image.
| Fields | |
|---|---|
| x | X coordinate. | 
| y | Y coordinate. | 
Word
A word representation.
| Fields | |
|---|---|
| property | Additional information detected for the word. | 
| bounding_box | The bounding box for the word. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example, when the text is horizontal the vertices are laid out as 0----1 (top edge) over 3----2 (bottom edge); when it's rotated 180 degrees around the top-left corner the layout becomes 2----3 over 1----0, and the vertex order will still be (0, 1, 2, 3). | 
| symbols[] | List of symbols in the word. The order of the symbols follows the natural reading order. | 
| confidence | Confidence of the OCR results for the word. Range [0, 1]. |