Full name: projects.locations.endpoints.predict
Perform an online inference. Use this method to run inference on Google's generative AI models as well as custom models deployed to Gemini Enterprise Agent Platform. This method supports both generative AI tasks (such as image generation, virtual try-on, text generation, and multimodal embeddings) and traditional machine learning tasks (such as classification and regression).
To run inference on a base (non-tuned) Gemini model, see endpoints.generateContent.
Endpoint
POST https://{service-endpoint}/v1/{endpoint}:predict
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
endpoint string
Required. The resource name of the publisher model or endpoint requested to serve the prediction. For Google models like Embedding or Veo, use the publisher model format. For tuned models or other models deployed to an Agent Platform endpoint, use the endpoint format.
- Publisher model format: projects/{project}/locations/{location}/publishers/google/models/{model}
- Endpoint format: projects/{project}/locations/{location}/endpoints/{endpoint}
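The two resource name formats and the request URL can be assembled with plain string formatting. The following is a minimal sketch, not an official client: the helper names are hypothetical, and the project, location, endpoint ID, and service endpoint host are placeholder values assumed for illustration.

```python
def publisher_model_name(project: str, location: str, model: str) -> str:
    """Resource name in the publisher model format (Google models)."""
    return f"projects/{project}/locations/{location}/publishers/google/models/{model}"


def endpoint_name(project: str, location: str, endpoint: str) -> str:
    """Resource name in the endpoint format (tuned or deployed models)."""
    return f"projects/{project}/locations/{location}/endpoints/{endpoint}"


def predict_url(service_endpoint: str, resource_name: str) -> str:
    """Full URL for the predict method, following the pattern shown above."""
    return f"https://{service_endpoint}/v1/{resource_name}:predict"


# Placeholder values, assumed for illustration only.
name = endpoint_name("my-project", "us-central1", "1234567890")
url = predict_url("us-central1-aiplatform.googleapis.com", name)
print(url)
```

Keeping the resource name separate from the URL makes it easy to switch between the publisher model format and the endpoint format without touching the rest of the request code.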
Request body
The request body contains data with the following structure:
instances[] value (Value format)
Required. The instances that are the input to the prediction call. A DeployedModel may have an upper limit on the number of instances it supports per request. When that limit is exceeded, the prediction call returns an error for AutoML Models; for customer-created Models, the behavior is as documented by that Model. The schema of each instance depends on the type of request.
- For a generative AI request to a Text Embedding model, see TextEmbeddingPredictionInstance
- For a generative AI request to a Multimodal Embedding model, see VisionEmbeddingModelInstance
- For a video generation request to a Veo model, see VideoGenerationModelInstance
- For a traditional machine learning request to a deployed custom model, the schema of any single instance is defined by the model's instanceSchemaUri.
parameters value (Value format)
The parameters that govern the prediction. The schema of the parameters depends on the type of request.
- For a generative AI request to a Text Embedding model, see TextEmbeddingPredictionParams
- For a generative AI request to a Multimodal Embedding model, see VisionEmbeddingModelParams
- For a video generation request to a Veo model, see VideoGenerationModelParams
- For a traditional machine learning request to a deployed custom model, the schema of the parameters is defined by the model's parametersSchemaUri.
labels map (key: string, value: string)
Optional. User labels, used for Imagen billing only. Only Imagen supports labels; for other use cases, this field is ignored.
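A request body with the structure described above could be assembled as in the following sketch. The JSON keys `instances`, `parameters`, and `labels` are assumed to match the field descriptions in this section. The helper name, the instance fields (`feature_a`, `feature_b`), and the `threshold` parameter are hypothetical stand-ins for whatever a deployed custom model's instanceSchemaUri and parametersSchemaUri actually define. Rejecting an empty instance list is a client-side choice here, not documented server behavior.

```python
import json
from typing import Any, Optional


def build_predict_body(
    instances: list[Any],
    parameters: Optional[Any] = None,
    labels: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Assemble a predict request body, omitting unset optional fields."""
    if not instances:
        # Client-side guard (an assumption): instances is a required field.
        raise ValueError("instances is required and must not be empty")
    body: dict[str, Any] = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    if labels:
        # Per the field description, labels only matter for Imagen billing.
        body["labels"] = labels
    return body


# Hypothetical instance and parameter shapes for a custom model.
body = build_predict_body(
    instances=[{"feature_a": 1.5, "feature_b": "red"}],
    parameters={"threshold": 0.5},
)
print(json.dumps(body, indent=2))
```

Leaving out unset optional fields keeps the payload minimal and avoids sending a `labels` map to models that ignore it.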
Response body
If successful, the response body contains an instance of PredictResponse.
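The pieces of this method (HTTP verb, URL, JSON body, JSON response) can be tied together with the Python standard library, as in the sketch below. It rests on several assumptions not stated on this page: that the call is authorized with an OAuth 2.0 bearer token in the `Authorization` header, and that PredictResponse exposes its results under a `predictions` key. The helper names, the token, and the URL segments are placeholders. The request is only constructed, not sent.

```python
import json
import urllib.request
from typing import Any


def build_predict_request(
    url: str, body: dict[str, Any], access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the predict method."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: bearer-token authorization.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_predictions(response_json: dict[str, Any]) -> list[Any]:
    """Return the predictions list, assuming a 'predictions' key exists."""
    return response_json.get("predictions", [])


# Placeholder URL segments and token, for illustration only.
request = build_predict_request(
    "https://SERVICE_ENDPOINT/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT:predict",
    {"instances": [{"feature_a": 1.5}]},
    "TOKEN",
)
print(request.get_method(), request.full_url)

# To actually send the request (requires network access and a valid token):
# with urllib.request.urlopen(request) as resp:
#     predictions = extract_predictions(json.load(resp))
```

Building the request separately from sending it makes the method, headers, and payload easy to inspect or log before any network call is made.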