Audience
The goal of this tutorial is to help you develop applications using the Vision API Crop Hints feature. It assumes you are familiar with basic programming constructs and techniques. However, even if you are a beginning programmer, you should be able to follow along and run this tutorial without difficulty, then use the Vision API reference documentation to create basic applications.
This tutorial steps through a Vision API application, showing you how to make a call to the Vision API to use its Crop Hints feature.
Prerequisites
- Set up a Vision API project in the Google Cloud console.
- Set up your environment for using Application Default Credentials.
Python
- Install Python.
- Install pip.
- Install the Google Cloud Client Library.
- Install the Python Imaging Library.
Overview
This tutorial walks you through a basic Vision API application that uses a
Crop Hints request. You can provide the image to be processed either through
a Cloud Storage URI (Cloud Storage bucket location) or embedded in the
request. A successful Crop Hints response returns the coordinates for a
bounding box cropped around the dominant object or face in the image.
Code listing
As you read the code, we recommend that you follow along by referring to the Cloud Vision API Python reference.
A closer look
Importing libraries
We import standard libraries:
- argparse to allow the application to accept input filenames as arguments
- io for file I/O
Other imports:
- The ImageAnnotatorClient class within the google.cloud.vision library for accessing the Vision API.
- The types module within the google.cloud.vision library for constructing requests.
- The Image and ImageDraw modules from the Python Imaging Library (PIL) to draw a bounding box on the input image.
Running the application
Here, we simply parse the passed-in argument that specifies the local image filename, and pass it to a function to crop the image or draw the hint.
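The argument handling described above might look like the following minimal sketch. The argument names image_file and mode, the build_parser and main helpers, and the placeholder bodies of crop_to_hint and draw_hint are assumptions made for illustration; only the two function names and the crop-or-draw choice come from this tutorial.

```python
import argparse


def crop_to_hint(image_file):
    """Placeholder: the tutorial's function crops the image to the crop hint."""
    return f"crop:{image_file}"


def draw_hint(image_file):
    """Placeholder: the tutorial's function outlines the crop hint on the image."""
    return f"draw:{image_file}"


def build_parser():
    """Build a parser for: python crop_hints.py IMAGE_FILE {crop,draw}."""
    parser = argparse.ArgumentParser(description="Crop Hints tutorial sketch")
    parser.add_argument("image_file", help="Path to the local image file.")
    parser.add_argument(
        "mode",
        choices=["crop", "draw"],
        help="crop the image to the hint, or draw the hint on the image.",
    )
    return parser


def main(argv=None):
    """Parse the arguments and dispatch to the matching function."""
    args = build_parser().parse_args(argv)
    handler = crop_to_hint if args.mode == "crop" else draw_hint
    return handler(args.image_file)
```

Calling main(["cat.jpg", "crop"]) dispatches to crop_to_hint; when argv is left as None, argparse reads the arguments from the command line instead.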
Authenticating to the API
Before communicating with the Vision API service, you must
authenticate your service using previously acquired credentials. Within an
application, the simplest way to obtain credentials is to use
Application Default Credentials
(ADC). By default, the client library attempts to
obtain credentials from the GOOGLE_APPLICATION_CREDENTIALS
environment variable, which should be set to point to your service account's
JSON key file. See
Set Up a Service Account
for more information.
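Before running the application, it can help to confirm that the environment variable is set and points to an existing file. The sketch below is a hypothetical pre-flight check, not part of the client library; the helper name check_credentials and its return strings are assumptions, and it inspects only the environment variable described above.

```python
import os


def check_credentials(environ=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS points to a key file.

    `environ` defaults to the process environment; passing a dict makes
    the check easy to exercise without changing real settings.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"key file not found: {path}"
    return "ok"
```

A result other than "ok" suggests fixing the variable before creating the Vision client, since a failure there would otherwise surface only when the first request is made.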
Getting crop hint annotations for the image
Now that the Vision client library is authenticated, we can access the service
by calling the crop_hints method of the ImageAnnotatorClient instance.
The aspect ratio for the output is specified in an
ImageContext object; if multiple aspect ratios are passed in then multiple
crop hints will be returned, one for each aspect ratio.
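The call described above can be sketched as follows. This assumes a 2.x or later release of the google-cloud-vision package, where message classes are available directly as vision.Image, vision.CropHintsParams, and vision.ImageContext; older releases built the same objects through the types module. The function name get_crop_hint_vertices and the default aspect ratio parameter are illustrative choices.

```python
def get_crop_hint_vertices(path, aspect_ratio=1.77):
    """Return the (x, y) vertices of the first crop hint for a local image."""
    # Imported inside the function so this sketch can be loaded even where
    # google-cloud-vision is not installed; calling it requires the library
    # and valid credentials.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Embed the image bytes in the request.
    with open(path, "rb") as image_file:
        content = image_file.read()
    image = vision.Image(content=content)

    # One crop hint is returned for each requested aspect ratio.
    params = vision.CropHintsParams(aspect_ratios=[aspect_ratio])
    image_context = vision.ImageContext(crop_hints_params=params)

    response = client.crop_hints(image=image, image_context=image_context)
    hints = response.crop_hints_annotation.crop_hints
    if not hints:
        return []
    return [(vertex.x, vertex.y) for vertex in hints[0].bounding_poly.vertices]
```

Passing several values in aspect_ratios would yield several entries in crop_hints; this sketch reads only the first.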
The client library encapsulates the details for requests and responses to the API. See the Vision API Reference for complete information on the structure of a request.
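To give a sense of what the client library builds on your behalf, here is a sketch of the JSON body for a Crop Hints request to the REST images:annotate method. The helper name build_crop_hints_request is hypothetical, and the bucket path in the usage note is a placeholder; the field names follow the REST reference, which remains the authoritative source.

```python
import base64


def build_crop_hints_request(image_bytes=None, image_uri=None, aspect_ratios=(1.77,)):
    """Build the request body for a single CROP_HINTS annotation.

    Exactly one of `image_bytes` (embedded content) or `image_uri`
    (for example a Cloud Storage URI) must be provided.
    """
    if (image_bytes is None) == (image_uri is None):
        raise ValueError("Provide exactly one of image_bytes or image_uri")

    if image_bytes is not None:
        # Embedded images are sent as base64-encoded text.
        image = {"content": base64.b64encode(image_bytes).decode("ascii")}
    else:
        image = {"source": {"imageUri": image_uri}}

    return {
        "requests": [
            {
                "image": image,
                "features": [{"type": "CROP_HINTS"}],
                "imageContext": {
                    "cropHintsParams": {"aspectRatios": list(aspect_ratios)}
                },
            }
        ]
    }
```

For example, build_crop_hints_request(image_uri="gs://my-bucket/cat.jpg") produces a body that references the image by URI, while passing image_bytes embeds the image in the request itself.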
Using the response to crop or draw the hint's bounding box
Once the operation has completed successfully, the API response
contains the bounding box coordinates of one or more crop hints. The
draw_hint function draws lines around the crop hint's bounding box, then writes
the image to output-hint.jpg.
The crop_to_hint function crops the image using the suggested crop hint.
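The two operations can be sketched with Pillow as shown below. Unlike the tutorial's functions, these versions take the hint's vertices as a parameter so that the sketch needs no API call; the bounding_box helper and the output-crop.jpg filename are assumptions made for illustration.

```python
def bounding_box(vertices):
    """Collapse (x, y) corner points into (left, top, right, bottom)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def draw_hint(image_file, vertices, output_file="output-hint.jpg"):
    """Outline the crop hint on the image and save the result."""
    # Pillow is imported here so the helpers above load without it.
    from PIL import Image, ImageDraw

    with Image.open(image_file) as im:
        draw = ImageDraw.Draw(im)
        # With only an outline given, the polygon is drawn unfilled.
        draw.polygon(list(vertices), outline="red")
        im.save(output_file)


def crop_to_hint(image_file, vertices, output_file="output-crop.jpg"):
    """Crop the image to the hint's bounding box and save the result."""
    from PIL import Image

    with Image.open(image_file) as im:
        cropped = im.crop(bounding_box(vertices))
        cropped.save(output_file)
```

Pillow's crop method expects a (left, top, right, bottom) tuple, which is why the four corner points are first reduced to a box.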
Running the application
To run the application, you can
download this cat.jpg file
(you may need to right-click the link),
then pass the location where you downloaded the file on your local machine
to the tutorial application (crop_hints.py).
Here is the Python command, followed by console output, which displays the
JSON cropHintsAnnotation response. This response includes the coordinates of
the cropHints bounding box. We requested a crop area with a 1.77
width-to-height aspect ratio, and the returned top-left and bottom-right
(x, y) coordinates of the crop rectangle are (0, 336) and (1100, 967).
Vertices whose x value is 0 omit the x field in the JSON output.
python crop_hints.py cat.jpg crop
{
  "responses": [
    {
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                {
                  "y": 336
                },
                {
                  "x": 1100,
                  "y": 336
                },
                {
                  "x": 1100,
                  "y": 967
                },
                {
                  "y": 967
                }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 0.69
          }
        ]
      }
    }
  ]
}
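If you work with the JSON form of the response, the crop rectangle can be recovered as in the sketch below. The function name crop_rectangle is hypothetical, and the sample string is a trimmed copy of the response above that keeps only the vertices. Coordinates equal to 0 are absent from the JSON, so the sketch defaults missing fields to 0.

```python
import json


def crop_rectangle(response):
    """Return ((left, top), (right, bottom)) for the first crop hint."""
    hint = response["responses"][0]["cropHintsAnnotation"]["cropHints"][0]
    # A missing "x" or "y" key means the coordinate is 0.
    points = [
        (vertex.get("x", 0), vertex.get("y", 0))
        for vertex in hint["boundingPoly"]["vertices"]
    ]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


sample = json.loads(
    '{"responses": [{"cropHintsAnnotation": {"cropHints": [{"boundingPoly": '
    '{"vertices": [{"y": 336}, {"x": 1100, "y": 336}, '
    '{"x": 1100, "y": 967}, {"y": 967}]}}]}}]}'
)
top_left, bottom_right = crop_rectangle(sample)
width = bottom_right[0] - top_left[0]
height = bottom_right[1] - top_left[1]
print(top_left, bottom_right, width, height)
```

For this response the rectangle is 1100 pixels wide and 631 pixels high, a ratio of about 1.74, which is close to the requested 1.77.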
And here is the cropped image.
Congratulations! You've run the Cloud Vision Crop Hints API to return the optimized bounding box coordinates around the dominant object detected in the image!