Index
ClassificationPredictionResult (message)
ImageObjectDetectionPredictionResult (message)
ImageSegmentationPredictionResult (message)
TabularClassificationPredictionResult (message)
TabularRegressionPredictionResult (message)
TextEmbedding (message)
TextEmbedding.Statistics (message)
TextEmbeddingPredictionResult (message)
TextExtractionPredictionResult (message)
TextSentimentPredictionResult (message)
TftFeatureImportance (message)
TimeSeriesForecastingPredictionResult (message)
VideoActionRecognitionPredictionResult (message)
VideoClassificationPredictionResult (message)
VideoGenerationModelResult (message)
VideoObjectTrackingPredictionResult (message)
VideoObjectTrackingPredictionResult.Frame (message)
VisionEmbeddingModelResult (message)
VisionEmbeddingModelResult.VideoEmbedding (message)
ClassificationPredictionResult
Prediction output format for Image and Text Classification.
| Field | Description |
|---|---|
| ids[] | The resource IDs of the AnnotationSpecs that were identified. |
| display_names[] | The display names of the AnnotationSpecs that were identified. Order matches the IDs. |
| confidences[] | The model's confidence in the correctness of each predicted ID; a higher value means higher confidence. Order matches the IDs. |
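Because the three fields are parallel arrays, it is often convenient to zip them into one record per label. The following is a minimal sketch, assuming the arrays have already been read out of the prediction as plain Python lists; the `Label` type, the helper names, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Label(NamedTuple):
    id: str
    display_name: str
    confidence: float


def pair_labels(ids, display_names, confidences):
    """Zip the three parallel arrays into one record per predicted label."""
    if not (len(ids) == len(display_names) == len(confidences)):
        raise ValueError("ids, display_names and confidences must have equal length")
    return [Label(i, n, c) for i, n, c in zip(ids, display_names, confidences)]


def top_label(labels):
    """Return the label with the highest confidence, or None if there are none."""
    return max(labels, key=lambda label: label.confidence, default=None)


# Hypothetical values, for illustration only.
labels = pair_labels(["123", "456"], ["cat", "dog"], [0.2, 0.7])
best = top_label(labels)
```

The length check is a defensive choice: it fails loudly if the arrays ever get out of step instead of silently dropping entries.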
ImageObjectDetectionPredictionResult
Prediction output format for Image Object Detection.
| Field | Description |
|---|---|
| ids[] | The resource IDs of the AnnotationSpecs that were identified, ordered by confidence score, descending. |
| display_names[] | The display names of the AnnotationSpecs that were identified. Order matches the IDs. |
| confidences[] | The model's confidence in the correctness of each predicted ID; a higher value means higher confidence. Order matches the IDs. |
| bboxes[] | Bounding boxes, i.e. the rectangles over the image that pinpoint the found AnnotationSpecs, given in an order that matches the IDs. Each bounding box is an array of 4 numbers. |
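To draw a box on the image, the four numbers usually need to be turned into pixel edges. The sketch below assumes that each box is ordered xMin, xMax, yMin, yMax (the same four names the Frame message uses) and that the values are fractions of the image size with the point 0,0 at the top left. Both are assumptions to verify against real output; `bbox_to_pixels` is a hypothetical helper.

```python
def bbox_to_pixels(bbox, image_width, image_height):
    """Convert a relative box to integer pixel edges (left, top, right, bottom).

    Assumes bbox is [x_min, x_max, y_min, y_max] with values in [0, 1]
    and the origin at the top-left corner of the image.
    """
    x_min, x_max, y_min, y_max = bbox
    left = round(x_min * image_width)
    right = round(x_max * image_width)
    top = round(y_min * image_height)
    bottom = round(y_max * image_height)
    return left, top, right, bottom
```

For example, `bbox_to_pixels([0.25, 0.75, 0.5, 1.0], 640, 480)` returns `(160, 240, 480, 480)`.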
ImageSegmentationPredictionResult
Prediction output format for Image Segmentation.
| Field | Description |
|---|---|
| category_mask | A PNG image in which each pixel represents the category that the corresponding pixel in the original image was predicted to belong to. The image is the same size as the original image. The mapping between AnnotationSpecs and colors can be found in the model's metadata. The model chooses the most likely category; if no category reaches the confidence threshold, the pixel is marked as background. |
| confidence_mask | A one-channel image encoded as an 8-bit lossless PNG, the same size as the original image. For a given pixel, a darker color means less confidence in the correctness of the category in the categoryMask for that pixel. Black means no confidence and white means complete confidence. |
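Once the confidence mask has been decoded into rows of 8-bit grayscale values (the PNG decoding step is left out here), low-confidence pixels can be located as sketched below. The linear mapping from gray value to confidence is an assumption; the description above only fixes the two endpoints, black and white. Both helper names are hypothetical.

```python
def confidence_from_gray(gray_value):
    """Map an 8-bit grayscale value (0 = black, 255 = white) to [0, 1].

    Assumes confidence scales linearly between the two endpoints.
    """
    if not 0 <= gray_value <= 255:
        raise ValueError("expected an 8-bit value")
    return gray_value / 255.0


def low_confidence_pixels(confidence_mask, threshold):
    """Return (row, col) positions whose confidence falls below threshold."""
    return [
        (r, c)
        for r, row in enumerate(confidence_mask)
        for c, value in enumerate(row)
        if confidence_from_gray(value) < threshold
    ]
```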
TabularClassificationPredictionResult
Prediction output format for Tabular Classification.
| Field | Description |
|---|---|
| classes[] | The names of the classes being classified. Contains all possible values of the target column. |
| scores[] | The model's confidence in each class being correct; a higher value means higher confidence. The N-th score corresponds to the N-th class in classes. |
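Since the N-th score belongs to the N-th class, the predicted class is the one at the position of the highest score. A minimal sketch, assuming both arrays are available as Python lists (the function names are hypothetical):

```python
def class_scores(classes, scores):
    """Map each class name to its score, relying on positional correspondence."""
    if len(classes) != len(scores):
        raise ValueError("classes and scores must have equal length")
    return dict(zip(classes, scores))


def predicted_class(classes, scores):
    """Return the class name with the highest score."""
    by_class = class_scores(classes, scores)
    return max(by_class, key=by_class.get)
```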
TabularRegressionPredictionResult
Prediction output format for Tabular Regression.
| Field | Description |
|---|---|
| value | The regression value. |
| lower_bound | The lower bound of the prediction interval. |
| upper_bound | The upper bound of the prediction interval. |
| quantile_values[] | Quantile values. |
| quantile_predictions[] | Quantile predictions, in one-to-one correspondence with quantile_values. |
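The one-to-one correspondence between the two quantile arrays makes it straightforward to look up the prediction for a given quantile. The sketch below assumes both arrays are plain lists of floats; the helper names and the sample numbers are hypothetical.

```python
def quantile_table(quantile_values, quantile_predictions):
    """Map each quantile (e.g. 0.1) to the prediction at the same index."""
    if len(quantile_values) != len(quantile_predictions):
        raise ValueError("quantile arrays must have equal length")
    return dict(zip(quantile_values, quantile_predictions))


def interval_between(table, low_quantile, high_quantile):
    """Return the (low, high) predictions for two quantiles present in the table."""
    return table[low_quantile], table[high_quantile]


# Hypothetical values, for illustration only.
table = quantile_table([0.1, 0.5, 0.9], [8.0, 10.0, 13.0])
low, high = interval_between(table, 0.1, 0.9)
```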
TextEmbedding
An embedding is a vector (list) of floating-point numbers that represents the semantic meaning of text. Embeddings can be used to compare text for similarity, classify text, or cluster text. Text with similar meaning will have similar embedding vectors.
| Field | Description |
|---|---|
| values[] | The embedding vector. The size of the vector is fixed and determined by the model used for embedding generation. |
| statistics | Statistics about the input text. |
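One common way to compare two texts is the cosine similarity of their values[] vectors. The sketch below is a plain-Python illustration of that measure, not something this API prescribes; `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same size")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values near 1 indicate vectors pointing in nearly the same direction, which corresponds to texts with similar meaning.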
Statistics
Statistics about the input text.
| Field | Description |
|---|---|
| token_count | The number of tokens in the input text. |
| truncated | Whether the input text was truncated. If true, the embedding was generated from a truncated version of the input text. This can happen if the input text was longer than the model's input token limit. |
TextEmbeddingPredictionResult
Prediction output format for Text Embedding. Represents the prediction result for a text embedding request.
| Field | Description |
|---|---|
| embeddings | The embedding generated from the input text. |
TextExtractionPredictionResult
Prediction output format for Text Extraction.
| Field | Description |
|---|---|
| ids[] | The resource IDs of the AnnotationSpecs that were identified, ordered by confidence score, descending. |
| display_names[] | The display names of the AnnotationSpecs that were identified. Order matches the IDs. |
| text_segment_start_offsets[] | The start offsets, inclusive, of the text segments in which the AnnotationSpecs were identified. Expressed as a zero-based number of characters measured from the start of the text snippet. |
| text_segment_end_offsets[] | The end offsets, inclusive, of the text segments in which the AnnotationSpecs were identified. Expressed as a zero-based number of characters measured from the start of the text snippet. |
| confidences[] | The model's confidence in the correctness of each predicted ID; a higher value means higher confidence. Order matches the IDs. |
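Both offsets are inclusive, so recovering a segment with a Python slice requires adding one to the end offset. The sketch below assumes that one Python string index corresponds to one "character" as the service counts them, which may not hold for every encoding; `extract_segments` is a hypothetical helper.

```python
def extract_segments(text, start_offsets, end_offsets):
    """Return the substring for each (start, end) pair of inclusive offsets."""
    if len(start_offsets) != len(end_offsets):
        raise ValueError("offset arrays must have equal length")
    # The end offset is inclusive, so the slice must stop one past it.
    return [text[start:end + 1] for start, end in zip(start_offsets, end_offsets)]
```

For example, `extract_segments("Call Alice in Paris", [5, 14], [9, 18])` returns `["Alice", "Paris"]`.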
TextSentimentPredictionResult
Prediction output format for Text Sentiment.
| Field | Description |
|---|---|
| sentiment | The integer sentiment label, between 0 (inclusive) and sentimentMax (inclusive), where 0 maps to the least positive sentiment and sentimentMax maps to the most positive one. The higher the score, the more positive the sentiment in the text snippet. Note: sentimentMax is an integer value between 1 (inclusive) and 10 (inclusive). |
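Because the scale depends on sentimentMax, scores from models with different maxima are not directly comparable. One option is to rescale to the range 0 to 1, as in this sketch; the rescaling is an illustrative convention rather than part of the API, and `normalized_sentiment` is a hypothetical helper.

```python
def normalized_sentiment(sentiment, sentiment_max):
    """Rescale an integer sentiment label to [0, 1], where 1 is most positive."""
    if not 1 <= sentiment_max <= 10:
        raise ValueError("sentiment_max must be between 1 and 10 inclusive")
    if not 0 <= sentiment <= sentiment_max:
        raise ValueError("sentiment must be between 0 and sentiment_max inclusive")
    return sentiment / sentiment_max
```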
TftFeatureImportance
| Field | Description |
|---|---|
| context_weights[] | TFT feature importance values. Within each of the context, horizon, and attribute pairs, the weights and columns should have the same shape, because each weight corresponds to a column name. |
| context_columns[] | |
| horizon_weights[] | |
| horizon_columns[] | |
| attribute_weights[] | |
| attribute_columns[] | |
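Since each weight corresponds to the column name at the same position, a weights/columns pair can be combined into a ranked list. A minimal sketch, assuming both fields are flat lists of equal length (`rank_features` and the column names in the usage note are hypothetical):

```python
def rank_features(columns, weights):
    """Pair column names with weights and sort by weight, largest first."""
    if len(columns) != len(weights):
        raise ValueError("columns and weights must have the same shape")
    return sorted(zip(columns, weights), key=lambda pair: pair[1], reverse=True)
```

For example, `rank_features(["price", "promo", "weekday"], [0.2, 0.5, 0.3])` returns `[("promo", 0.5), ("weekday", 0.3), ("price", 0.2)]`. The same function applies to the context, horizon, and attribute pairs alike.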
TimeSeriesForecastingPredictionResult
Prediction output format for Time Series Forecasting.
| Field | Description |
|---|---|
| value | The regression value. |
| quantile_values[] | Quantile values. |
| quantile_predictions[] | Quantile predictions, in one-to-one correspondence with quantile_values. |
| tft_feature_importance | Only use these if TFT is enabled. |
VideoActionRecognitionPredictionResult
Prediction output format for Video Action Recognition.
| Field | Description |
|---|---|
| id | The resource ID of the AnnotationSpec that was identified. |
| display_name | The display name of the AnnotationSpec that was identified. |
| time_segment_start | The beginning, inclusive, of the video's time segment in which the AnnotationSpec was identified. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. |
| time_segment_end | The end, exclusive, of the video's time segment in which the AnnotationSpec was identified. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. |
| confidence | The model's confidence in the correctness of this prediction; a higher value means higher confidence. |
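The time fields arrive as strings such as "12.5s", so they need parsing before any arithmetic. The sketch below uses `Decimal` to keep microsecond fractions exact; the helper names are hypothetical and only the format described above is assumed.

```python
from decimal import Decimal


def parse_seconds(value):
    """Parse a string such as "12.5s" into an exact number of seconds."""
    if not value.endswith("s"):
        raise ValueError(f"expected a value ending in 's', got {value!r}")
    return Decimal(value[:-1])


def segment_length(time_segment_start, time_segment_end):
    """Length in seconds of the segment from start (inclusive) to end (exclusive)."""
    return parse_seconds(time_segment_end) - parse_seconds(time_segment_start)
```

Using `float` instead would also work for most purposes, at the cost of small rounding errors in the fractional part.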
VideoClassificationPredictionResult
Prediction output format for Video Classification.
| Field | Description |
|---|---|
| id | The resource ID of the AnnotationSpec that was identified. |
| display_name | The display name of the AnnotationSpec that was identified. |
| type | The type of the prediction. The requested types can be configured via parameters. This will be one of: segment-classification, shot-classification, one-sec-interval-classification. |
| time_segment_start | The beginning, inclusive, of the video's time segment in which the AnnotationSpec was identified. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. For the 'segment-classification' prediction type, this equals the original 'timeSegmentStart' from the input instance; for the other types, it is the start of a shot or of a 1-second interval, respectively. |
| time_segment_end | The end, exclusive, of the video's time segment in which the AnnotationSpec was identified. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. For the 'segment-classification' prediction type, this equals the original 'timeSegmentEnd' from the input instance; for the other types, it is the end of a shot or of a 1-second interval, respectively. |
| confidence | The model's confidence in the correctness of this prediction; a higher value means higher confidence. |
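When several prediction types are requested, the results for one video mix segment, shot, and one-second-interval predictions, and it can help to separate them by the type field. A minimal sketch, assuming each prediction has been loaded as a dict keyed by the field names in the table above (the dict representation is an assumption, and `group_by_type` is a hypothetical helper):

```python
from collections import defaultdict


def group_by_type(predictions):
    """Group prediction dicts by their "type" value, preserving input order."""
    grouped = defaultdict(list)
    for prediction in predictions:
        grouped[prediction["type"]].append(prediction)
    return dict(grouped)
```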
VideoGenerationModelResult
Prediction result from a video generation model. When you request a prediction from a video generation model, the model generates videos based on your input and returns URIs to these videos in Google Cloud Storage.
| Field | Description |
|---|---|
| gcs_uris[] | A list of Google Cloud Storage URIs for generated videos. For each input instance in your prediction request, the model may generate one or more videos. This field provides the Google Cloud Storage URIs for each of these videos. |
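To download a generated video, a storage client typically needs the bucket and object name separately. The sketch below splits a URI by plain string handling, assuming the usual `gs://bucket/object` form; `split_gcs_uri` is a hypothetical helper and the URI in the usage note is made up.

```python
def split_gcs_uri(uri):
    """Split "gs://bucket/path/to/object" into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return bucket, object_name
```

For example, `split_gcs_uri("gs://example-bucket/videos/sample_0.mp4")` returns `("example-bucket", "videos/sample_0.mp4")`.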
VideoObjectTrackingPredictionResult
Prediction output format for Video Object Tracking.
| Field | Description |
|---|---|
| id | The resource ID of the AnnotationSpec that was identified. |
| display_name | The display name of the AnnotationSpec that was identified. |
| time_segment_start | The beginning, inclusive, of the video's time segment in which the object instance was detected. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. |
| time_segment_end | The end, inclusive, of the video's time segment in which the object instance was detected. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. |
| confidence | The model's confidence in the correctness of this prediction; a higher value means higher confidence. |
| frames[] | All of the frames of the video in which a single object instance was detected. The bounding boxes in the frames identify the same object. |
Frame
The fields xMin, xMax, yMin, and yMax refer to a bounding box, i.e. the rectangle over the video frame pinpointing the found AnnotationSpec. The coordinates are relative to the frame size, and the point 0,0 is in the top left of the frame.
| Field | Description |
|---|---|
| time_offset | A time (frame) of a video in which the object has been detected. Expressed as a number of seconds measured from the start of the video, with fractions up to microsecond precision, and with "s" appended at the end. |
| x_min | The leftmost coordinate of the bounding box. |
| x_max | The rightmost coordinate of the bounding box. |
| y_min | The topmost coordinate of the bounding box. |
| y_max | The bottommost coordinate of the bounding box. |
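Because all frames of one result follow the same object, the box centers ordered by time give a rough trajectory of that object. The sketch below assumes each frame has been loaded as a dict keyed by the field names in the table above, with time_offset as a string such as "0.5s"; the dict representation and both helper names are assumptions.

```python
def box_center(frame):
    """Center of a frame's bounding box, in coordinates relative to the frame size."""
    center_x = (frame["x_min"] + frame["x_max"]) / 2
    center_y = (frame["y_min"] + frame["y_max"]) / 2
    return center_x, center_y


def trajectory(frames):
    """Return (time_offset, center) pairs ordered by time in the video."""
    ordered = sorted(frames, key=lambda f: float(f["time_offset"].rstrip("s")))
    return [(f["time_offset"], box_center(f)) for f in ordered]
```

The centers stay in relative coordinates with 0,0 at the top left, so they can be scaled to any output resolution afterwards.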
VisionEmbeddingModelResult
The prediction result for a large vision model embedding request. An embedding is a vectorized representation of data such as image, text or video. The embeddings produced by this model can be used for tasks such as image retrieval, similarity comparison, and classification. The embedding vectors have 1024 dimensions.
| Field | Description |
|---|---|
| image_embedding | The embedding generated from the input image. This field is populated if the prediction request contained an image. |
| text_embedding | The embedding generated from the input text. This field is populated if the prediction request contained text. |
| video_embeddings[] | The embeddings generated from the input video. This field is populated if the prediction request contained a video. The video is divided into 1-second segments, and an embedding is generated for each segment. |
VideoEmbedding
Contains embedding data for a specific time segment of a video.
| Field | Description |
|---|---|
| start_offset_sec | The start time of the video segment that this embedding represents, measured in seconds from the beginning of the video. |
| end_offset_sec | The end time of the video segment that this embedding represents, measured in seconds from the beginning of the video. |
| embedding | The 1024-dimension embedding vector for this video segment. |
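To find the embedding that covers a particular moment of the video, the segment offsets can be searched as sketched below. The sketch assumes each segment is a dict keyed by the field names above and treats a segment as covering its start but not its end; the reference does not state whether the end offset is inclusive, so that boundary handling is an assumption. `segment_at` is a hypothetical helper.

```python
def segment_at(video_embeddings, second):
    """Return the first segment whose time range contains the given second.

    Treats each segment as the half-open range [start_offset_sec, end_offset_sec).
    Returns None when no segment covers the requested time.
    """
    for segment in video_embeddings:
        if segment["start_offset_sec"] <= second < segment["end_offset_sec"]:
            return segment
    return None
```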