There are specific image formats, image dimensions, and file sizes you can send to Cloud Vision. Use this guidance to ensure effective feature detection when using the Vision API.
File formats
The Vision API supports the following image types:
- JPEG
- PNG8
- PNG24
- GIF
- Animated GIF (first frame only)
- BMP
- WEBP
- RAW
- ICO
- TIFF
Some of these image formats are lossy (for example, JPEG). Reducing file sizes for such lossy formats might degrade image quality and Vision API accuracy.
Image dimension recommendations
For accurate image detection within the Vision API, use images that are a minimum of 640x480 pixels (about 300,000 pixels).
In practice, a standard size of 640x480 pixels works well in most cases. Image sizes larger than 640x480 pixels may not gain much in accuracy, while greatly diminishing throughput. When possible, pre-process your images to reduce their size to these minimum standards.
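As a rough illustration of the pre-processing step described above, the following sketch computes the dimensions to downscale to so that an image still covers the 640x480 minimum while keeping its aspect ratio. The function name `target_dimensions` and the rule of never upscaling are assumptions made for this example, not part of the Vision API; the actual resampling would be done with an image library of your choice.

```python
def target_dimensions(width, height, min_width=640, min_height=480):
    """Return (width, height) to downscale to, preserving aspect ratio.

    Hypothetical helper: the result still covers min_width x min_height.
    Images that are already at or below the minimum in either dimension
    are returned unchanged, because upscaling adds no detail.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= min_width or height <= min_height:
        return width, height

    # Integer math avoids floating-point rounding below the minimum.
    if width * min_height >= height * min_width:
        # Image is at least as wide as the target shape: height is the limit.
        new_height = min_height
        new_width = -(-width * min_height // height)  # ceiling division
    else:
        # Image is taller than the target shape: width is the limit.
        new_width = min_width
        new_height = -(-height * min_width // width)  # ceiling division
    return new_width, new_height


print(target_dimensions(4000, 3000))
print(target_dimensions(1920, 1080))
```

Rounding up rather than down keeps the resized image from falling just under the minimum in the non-limiting dimension.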
Recommended sizes vary by the feature being detected. For example, FACE_DETECTION requests generally require larger images because the detected features (faces) are smaller than the image as a whole. LABEL_DETECTION requests, on the other hand, generally evaluate the entire image.
The following table lists types of Vision API feature requests and their recommended image sizes:
| Vision API feature | Recommended size | Notes |
|---|---|---|
| FACE_DETECTION | 1600x1200 | Distance between eyes is the most important. |
| LANDMARK_DETECTION | 640x480 | - |
| LOGO_DETECTION | 640x480 | - |
| LABEL_DETECTION | 640x480 | - |
| TEXT_DETECTION and DOCUMENT_TEXT_DETECTION | 1024x768 | OCR requires more resolution to detect characters. |
| SAFE_SEARCH_DETECTION | 640x480 | - |
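The recommendations in the table above can be captured in a small lookup, sketched below. The helper names `recommended_size` and `meets_recommendation` are hypothetical. Taking the largest recommendation when a request combines several features is an assumption of this example, not documented Vision API behavior.

```python
# Recommended (width, height) per feature, copied from the table above.
RECOMMENDED_SIZES = {
    "FACE_DETECTION": (1600, 1200),
    "LANDMARK_DETECTION": (640, 480),
    "LOGO_DETECTION": (640, 480),
    "LABEL_DETECTION": (640, 480),
    "TEXT_DETECTION": (1024, 768),
    "DOCUMENT_TEXT_DETECTION": (1024, 768),
    "SAFE_SEARCH_DETECTION": (640, 480),
}


def recommended_size(features):
    """Return the largest recommended (width, height) across the features.

    Assumption: a request with several features should satisfy the most
    demanding one. Raises KeyError for a feature not in the table and
    ValueError for an empty feature list.
    """
    sizes = [RECOMMENDED_SIZES[name] for name in features]
    if not sizes:
        raise ValueError("at least one feature is required")
    return max(w for w, _ in sizes), max(h for _, h in sizes)


def meets_recommendation(width, height, features):
    """Return True if the image is at least the recommended size."""
    rec_width, rec_height = recommended_size(features)
    return width >= rec_width and height >= rec_height


print(recommended_size(["LABEL_DETECTION", "FACE_DETECTION"]))
print(meets_recommendation(640, 480, ["FACE_DETECTION"]))
```

A check like this is only a guide: as noted in this section, smaller or larger images can still work.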
The Vision API requires images large enough to distinguish important features. Sizes smaller or larger than these recommended sizes can work. However, smaller sizes can result in lower accuracy, and larger sizes can increase processing time and bandwidth usage without a proportional accuracy gain. For OCR analysis, the limit is 75,000,000 pixels (length x width): the Vision API resizes an image that exceeds this limit and uses the original image otherwise.
Image and file size
Image files sent to the Vision API shouldn't exceed 20 MB. Files exceeding 20 MB generate an error. The Vision API doesn't resize files of this size.
To improve query latency, reduce your file size. However, avoid reducing image quality during this process.
The Vision API imposes a 10 MB JSON request size limit. Host larger files on Cloud Storage or on the web, rather than passing them as base64-encoded content in the JSON itself.
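The following sketch shows one way to apply both limits before sending a request: it rejects files over 20 MB and switches from inline base64 content to a hosted image URI when the JSON body would exceed 10 MB. The request shape (`requests`, `image.content`, `image.source.imageUri`, `features`) follows the Vision API REST format. The function name `build_request`, the `hosted_uri` parameter, the example bucket path, and the reading of "MB" as 1,000,000 bytes (the more conservative interpretation) are assumptions made for this example.

```python
import base64
import json

# Assumption: 1 MB is treated as 1,000,000 bytes, the stricter reading.
MAX_FILE_BYTES = 20 * 1000 * 1000
MAX_JSON_BYTES = 10 * 1000 * 1000


def build_request(image_bytes, feature_type, hosted_uri=None):
    """Build an annotate request body, inlining the image only if it fits.

    hosted_uri is a hypothetical Cloud Storage or web location of the same
    image, used when the base64-encoded body would exceed the JSON limit.
    """
    if len(image_bytes) > MAX_FILE_BYTES:
        raise ValueError("image file exceeds the 20 MB limit")

    def body(image):
        return {"requests": [{"image": image,
                              "features": [{"type": feature_type}]}]}

    # Base64 turns every 3 input bytes into 4 output characters.
    encoded_length = 4 * ((len(image_bytes) + 2) // 3)
    envelope_length = len(json.dumps(body({"content": ""})).encode("utf-8"))

    if envelope_length + encoded_length <= MAX_JSON_BYTES:
        content = base64.b64encode(image_bytes).decode("ascii")
        return body({"content": content})
    if hosted_uri is None:
        raise ValueError("request would exceed the 10 MB JSON limit; "
                         "host the image and pass hosted_uri")
    return body({"source": {"imageUri": hosted_uri}})


request = build_request(b"\x00" * 1000, "LABEL_DETECTION")
print(sorted(request["requests"][0]["image"].keys()))
```

Because base64 encoding inflates the payload by about a third, an image of roughly 7.5 MB already reaches the 10 MB JSON limit when sent inline.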