Supported image formats and dimensions

There are specific image formats, image dimensions, and file sizes you can send to Cloud Vision. Use this guidance to ensure effective feature detection when using the Vision API.

File formats

The Vision API supports the following image types:

  • JPEG
  • PNG8
  • PNG24
  • GIF
  • Animated GIF (first frame only)
  • BMP
  • WEBP
  • RAW
  • ICO
  • PDF
  • TIFF

Some of these image formats are lossy (for example, JPEG). Reducing file sizes for such lossy formats might degrade image quality and Vision API accuracy.
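
As a rough pre-flight check against the list above, a client can inspect the first bytes of a file before sending it. The sketch below is illustrative only: the function name `sniff_format` is hypothetical, it relies on the commonly documented file signatures for each format, and it does not cover RAW, which has no single signature. It is not how the Vision API itself identifies formats.

```python
def sniff_format(header: bytes):
    """Return a format name for a few well-known file signatures, else None.

    Hypothetical helper; RAW files are not covered and also yield None.
    """
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if header.startswith(b"BM"):
        return "BMP"
    # WEBP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WEBP"
    if header.startswith(b"%PDF-"):
        return "PDF"
    # TIFF comes in little-endian ("II") and big-endian ("MM") variants.
    if header.startswith((b"II*\x00", b"MM\x00*")):
        return "TIFF"
    if header.startswith(b"\x00\x00\x01\x00"):
        return "ICO"
    return None
```

Passing the first 12 bytes of a file is enough for every signature checked here.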

Image dimension recommendations

For accurate image detection within the Vision API, use images that are a minimum of 640x480 pixels (about 300,000 pixels).

In practice, a standard size of 640x480 pixels works well in most cases. Larger images may gain little in accuracy while greatly reducing throughput. When possible, pre-process your images to reduce them to these minimum sizes.
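
The pre-processing step described above can be sketched as a small dimension calculation. This is a minimal sketch under stated assumptions: the function name `target_size` is hypothetical, the 640x480 minimum is applied regardless of orientation (long side against 640, short side against 480), and images already at or below the minimum are left unchanged rather than enlarged. The actual resampling would be done with an image library of your choice.

```python
def target_size(width, height, min_long=640, min_short=480):
    """Return (width, height) scaled down to the minimum size, keeping aspect ratio.

    Hypothetical helper: never upscales, and keeps both sides at or above
    the minimum (up to rounding).
    """
    long_side = max(width, height)
    short_side = min(width, height)
    # The larger ratio is the smallest scale that keeps both sides above the minimum.
    scale = max(min_long / long_side, min_short / short_side)
    if scale >= 1:
        return (width, height)  # already at or below the minimum; do not enlarge
    return (round(width * scale), round(height * scale))


print(target_size(4000, 3000))  # 4:3 landscape photo
print(target_size(1920, 1080))  # 16:9 frame
```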

Recommended sizes vary by the feature detected. For example, FACE_DETECTION requests generally require larger images because the detected features (faces) occupy only part of the image. LABEL_DETECTION requests, on the other hand, generally evaluate the entire image.

The following table lists types of Vision API feature requests and their recommended image sizes:

Vision API feature                            Recommended size   Notes
FACE_DETECTION                                1600x1200          Distance between eyes is the most important.
LANDMARK_DETECTION                            640x480            -
LOGO_DETECTION                                640x480            -
LABEL_DETECTION                               640x480            -
TEXT_DETECTION and DOCUMENT_TEXT_DETECTION    1024x768           OCR requires more resolution to detect characters.
SAFE_SEARCH_DETECTION                         640x480            -

The Vision API requires images large enough to distinguish important features. Sizes smaller or larger than these recommended sizes can work. However, smaller sizes can result in lower accuracy, and larger sizes can increase processing time and bandwidth usage without a proportional accuracy gain. For OCR analysis, keep image size at or below 75,000,000 pixels (length x width). The Vision API resizes any image that exceeds this limit; otherwise, it uses the original image.
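
The table and the OCR pixel limit above can be encoded as a simple client-side check. This is an illustrative sketch, not part of the Vision API: the names `RECOMMENDED_SIZES` and `review_image` are hypothetical, and comparing the long side against the first number and the short side against the second is an assumption made so that portrait images are treated like landscape ones.

```python
# Recommended sizes copied from the table above (long side, short side).
RECOMMENDED_SIZES = {
    "FACE_DETECTION": (1600, 1200),
    "LANDMARK_DETECTION": (640, 480),
    "LOGO_DETECTION": (640, 480),
    "LABEL_DETECTION": (640, 480),
    "TEXT_DETECTION": (1024, 768),
    "DOCUMENT_TEXT_DETECTION": (1024, 768),
    "SAFE_SEARCH_DETECTION": (640, 480),
}
OCR_FEATURES = {"TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"}
OCR_MAX_PIXELS = 75_000_000  # length x width


def review_image(feature, width, height):
    """Return a list of advisory notes for sending this image with this feature."""
    rec_long, rec_short = RECOMMENDED_SIZES[feature]
    notes = []
    if max(width, height) < rec_long or min(width, height) < rec_short:
        notes.append("below recommended size")
    if feature in OCR_FEATURES and width * height > OCR_MAX_PIXELS:
        notes.append("exceeds OCR pixel limit")
    return notes


print(review_image("FACE_DETECTION", 640, 480))
print(review_image("TEXT_DETECTION", 10000, 8000))
```

The notes are advisory only: per the guidance above, a smaller image can still work, and an oversized OCR image is resized by the API.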

Image and file size

Image files sent to the Vision API must not exceed 20 MB. Files exceeding 20 MB generate an error; the Vision API doesn't resize them to fit.
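
A client can reject oversized files before uploading them. The following is a minimal sketch; the helper names are hypothetical, and treating 20 MB as 20,000,000 bytes is a conservative assumption, since this page does not say whether the limit is decimal or binary.

```python
import os

# Assumption: decimal megabytes, the stricter reading of "20 MB".
MAX_IMAGE_FILE_BYTES = 20 * 1000 * 1000


def within_file_limit(num_bytes, limit=MAX_IMAGE_FILE_BYTES):
    """Return True if a file of this size is within the assumed limit."""
    return num_bytes <= limit


def file_within_limit(path, limit=MAX_IMAGE_FILE_BYTES):
    """Same check for a file on disk, using its size in bytes."""
    return within_file_limit(os.path.getsize(path), limit)
```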

To improve query latency, reduce your file size. However, avoid reducing image quality during this process.

The Vision API imposes a 10 MB JSON request size limit. Host larger files on Cloud Storage or on the web, rather than passing them as base64-encoded content in the JSON itself.
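
The choice between inline content and a hosted file can be sketched as follows. This is a hedged example, not a definitive client: the function name `build_annotate_request` is hypothetical, the 10 MB limit is assumed to mean 10,000,000 bytes, and the `content`, `source.imageUri`, and `features[].type` fields reflect the REST annotate request shape as commonly documented, so verify them against the API reference before relying on them.

```python
import base64
import json

# Assumption: decimal megabytes, the stricter reading of "10 MB".
MAX_JSON_REQUEST_BYTES = 10 * 1000 * 1000


def build_annotate_request(feature_type, image_bytes=None, image_uri=None):
    """Build a request body, inlining the image only if the JSON stays under the limit.

    Falls back to image_uri (a Cloud Storage or web URL) when the inline
    body would be too large.
    """
    features = [{"type": feature_type}]
    if image_bytes is not None:
        content = base64.b64encode(image_bytes).decode("ascii")
        body = {"requests": [{"image": {"content": content}, "features": features}]}
        # Measure the serialized body, since base64 adds roughly one third.
        if len(json.dumps(body).encode("utf-8")) <= MAX_JSON_REQUEST_BYTES:
            return body
    if image_uri is None:
        raise ValueError("image too large to inline (or missing) and no image_uri given")
    image = {"source": {"imageUri": image_uri}}
    return {"requests": [{"image": image, "features": features}]}
```

Because base64 encoding inflates the payload, an image of roughly 7.5 MB already approaches the request limit in this sketch, well before the 20 MB file limit applies.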