Index
- NamedBoundingBox (message)
- SafetyAttributes (message)
- SafetyAttributes.DetectedLabels (message)
- SafetyAttributes.DetectedLabels.BoundingBox (message)
- SafetyAttributes.DetectedLabels.Entity (message)
- SemanticFilterResponse (message)
- TextEmbedding (message)
- TextEmbedding.Statistics (message)
- TextEmbeddingPredictionResult (message)
- VideoGenerationModelResult (message)
- VirtualTryOnModelResultProto (message)
- VirtualTryOnModelResultProto.Image (message)
- VisionEmbeddingModelResult (message)
- VisionEmbeddingModelResult.VideoEmbedding (message)
- VisionGenerativeModelResult (message)
- VisionGenerativeModelResult.Image (message)
- VisionReasoningModelResult (message)
NamedBoundingBox
NamedBoundingBox tracks an annotated bounding box.
| Fields | |
|---|---|
| classes[] | Annotated classes. |
| entities[] | Annotated entities. |
| scores[] | Annotated scores, normalized to the range [0, 1]. |
| x1 | The unnormalized X coordinate of the top-left corner. |
| x2 | The unnormalized X coordinate of the bottom-right corner. |
| y1 | The unnormalized Y coordinate of the top-left corner. |
| y2 | The unnormalized Y coordinate of the bottom-right corner. |
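As a sketch of how these fields fit together, the snippet below reads a NamedBoundingBox decoded from a JSON prediction response. It assumes the classes[], entities[], and scores[] lists are parallel, with index i of each describing the same annotation, and that JSON keys match the field names above (a REST response may use camelCase instead); the values are invented for illustration.

```python
# Hypothetical NamedBoundingBox decoded from a JSON prediction response.
named_box = {
    "classes": ["person"],
    "entities": ["/m/01g317"],
    "scores": [0.92],
    "x1": 34.0, "y1": 51.0,    # top-left corner, unnormalized
    "x2": 412.0, "y2": 630.0,  # bottom-right corner, unnormalized
}

# Assumption: the three lists are parallel, so index i of each
# describes the same annotation.
for cls, entity, score in zip(named_box["classes"],
                              named_box["entities"],
                              named_box["scores"]):
    print(f"{cls} ({entity}): score={score:.2f}")

print(f"box size: {named_box['x2'] - named_box['x1']:.0f} x "
      f"{named_box['y2'] - named_box['y1']:.0f}")
```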
SafetyAttributes
| Fields | |
|---|---|
| categories[] | List of RAI categories. |
| scores[] | List of RAI scores. |
| detected_labels[] | List of detected labels. |
DetectedLabels
Filters that return labels with confidence scores.
| Fields | |
|---|---|
| entities[] | The list of detected entities for the RAI signal. |
| rai_category | The RAI category for the detected labels. |
BoundingBox
An integer bounding box in the original image's pixel coordinates for the detected labels.
| Fields | |
|---|---|
| x1 | The X coordinate of the top-left corner, in pixels. |
| y1 | The Y coordinate of the top-left corner, in pixels. |
| x2 | The X coordinate of the bottom-right corner, in pixels. |
| y2 | The Y coordinate of the bottom-right corner, in pixels. |
Entity
The properties of a detected entity from the RAI signal.
| Fields | |
|---|---|
| mid | MID of the label. |
| description | Description of the label. |
| score | Confidence score of the label. |
| bounding_box | Bounding box of the label. |
| iou_score | The intersection ratio between the detection bounding box and the mask. |
SemanticFilterResponse
SemanticFilterResponse tracks the semantic filtering results when the user turns on semantic filtering in the LVM image editing editConfig.
| Fields | |
|---|---|
| named_bounding_boxes[] | If the semantic filter is not passed, this list of named bounding boxes is populated to report the detected objects that failed semantic filtering. |
| passed_semantic_filter | Whether the semantic filter is passed. |
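A minimal sketch of consuming a SemanticFilterResponse, assuming it has been decoded from JSON with keys matching the field names above (a REST response may use camelCase); the payload is invented for illustration.

```python
def report_semantic_filter(response: dict) -> None:
    """Report objects that failed the semantic filter, if any."""
    if response.get("passed_semantic_filter", False):
        print("Semantic filter passed.")
        return
    for box in response.get("named_bounding_boxes", []):
        # Each entry is a NamedBoundingBox describing a failing object.
        labels = ", ".join(box.get("classes", []))
        print(f"failed: {labels} at ({box['x1']}, {box['y1']})-"
              f"({box['x2']}, {box['y2']})")

# Invented example payload for illustration.
report_semantic_filter({
    "passed_semantic_filter": False,
    "named_bounding_boxes": [
        {"classes": ["face"], "scores": [0.88],
         "x1": 10, "y1": 20, "x2": 60, "y2": 90},
    ],
})
```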
TextEmbedding
| Fields | |
|---|---|
| values[] | The embedding values. |
| statistics | The statistics computed from the input text. |
Statistics
| Fields | |
|---|---|
| token_count | Number of tokens in the input text. |
| truncated | Indicates whether the input text was longer than the maximum allowed number of tokens and was truncated. |
TextEmbeddingPredictionResult
Prediction output format for Text Embedding.
| Fields | |
|---|---|
| embeddings | The embeddings generated from the input text. |
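As an illustrative sketch, the snippet below unpacks a TextEmbeddingPredictionResult decoded from JSON, using the field names documented above (an actual REST response may use camelCase); the numbers are placeholders.

```python
# Hypothetical payload shaped like TextEmbeddingPredictionResult.
prediction = {
    "embeddings": {
        "values": [0.011, -0.204, 0.095],  # real vectors are much longer
        "statistics": {"token_count": 6, "truncated": False},
    }
}

embedding = prediction["embeddings"]
stats = embedding["statistics"]
if stats["truncated"]:
    print("warning: input exceeded the token limit and was truncated")
print(f"{stats['token_count']} tokens -> "
      f"{len(embedding['values'])}-dimensional embedding")
```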
VideoGenerationModelResult
Prediction result from a video generation model. When you request a prediction from a video generation model, the model generates videos based on your input and returns URIs to these videos in Google Cloud Storage.
| Fields | |
|---|---|
| gcs_uris[] | A list of Google Cloud Storage URIs for generated videos. For each input instance in your prediction request, the model may generate one or more videos. This field provides the Google Cloud Storage URIs for each of these videos. |
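A short sketch of collecting the generated-video URIs from such a result; the bucket and object names are hypothetical placeholders.

```python
# Hypothetical VideoGenerationModelResult decoded from JSON.
result = {
    "gcs_uris": [
        "gs://example-bucket/videos/sample_0.mp4",  # placeholder URIs
        "gs://example-bucket/videos/sample_1.mp4",
    ]
}

# Each URI points at a video object in Cloud Storage; download it with
# the Cloud Storage client library or the gcloud storage CLI.
for uri in result["gcs_uris"]:
    print("generated video:", uri)
```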
VirtualTryOnModelResultProto
Prediction format for the Virtual Try On model.
| Fields | |
|---|---|
| images[] | List of image bytes or Cloud Storage URIs of the generated images. |
Image
The generated image and metadata.
| Fields | |
|---|---|
| mime_type | The MIME type of the content of the image. Only the following MIME types are supported: image/jpeg, image/png. |
| Union field data. The image bytes or Cloud Storage URI to make the prediction on. data can be only one of the following: | |
| bytes_base64_encoded | Base64-encoded bytes string representing the image. |
| gcs_uri | The Cloud Storage URI of the image. |
| rai_filtered_reason | The reason the generated images were filtered. |
VisionEmbeddingModelResult
The prediction result for a large vision model embedding request. An embedding is a vectorized representation of data such as an image, text, or video. The embeddings produced by this model can be used for tasks such as image retrieval, similarity comparison, and classification. The embedding vectors have 1024 dimensions.
| Fields | |
|---|---|
| image_embedding | The embedding generated from the input image. This field is populated if the prediction request contained an image. |
| text_embedding | The embedding generated from the input text. This field is populated if the prediction request contained text. |
| video_embeddings[] | The embeddings generated from the input video. This field is populated if the prediction request contained a video. The video is divided into 1-second segments, and an embedding is generated for each segment. |
VideoEmbedding
Contains embedding data for a specific time segment of a video.
| Fields | |
|---|---|
| start_offset_sec | The start time of the video segment that this embedding represents, measured in seconds from the beginning of the video. |
| end_offset_sec | The end time of the video segment that this embedding represents, measured in seconds from the beginning of the video. |
| embedding | The 1024-dimension embedding vector for this video segment. |
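A common use of these embeddings is ranking video segments against a text query by cosine similarity. The sketch below assumes a result dict shaped like VisionEmbeddingModelResult, with keys matching the documented field names, and that the text and video embeddings live in a shared space suitable for cross-modal comparison; the vectors are shortened placeholders (real ones have 1024 dimensions).

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical result with 3-dimension placeholder vectors.
result = {
    "text_embedding": [0.12, 0.30, 0.21],
    "video_embeddings": [
        {"start_offset_sec": 0, "end_offset_sec": 1,
         "embedding": [0.10, 0.28, 0.25]},
        {"start_offset_sec": 1, "end_offset_sec": 2,
         "embedding": [0.40, 0.05, -0.10]},
    ],
}

# Rank the 1-second video segments by similarity to the text query.
query = result["text_embedding"]
for seg in sorted(result["video_embeddings"],
                  key=lambda s: cosine_similarity(query, s["embedding"]),
                  reverse=True):
    sim = cosine_similarity(query, seg["embedding"])
    print(f"[{seg['start_offset_sec']}, {seg['end_offset_sec']})s: {sim:.3f}")
```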
VisionGenerativeModelResult
| Fields | |
|---|---|
| images[] | List of image bytes or Cloud Storage URIs of the generated images. |
Image
| Fields | |
|---|---|
| mime_type | The MIME type of the content of the image. Only the following MIME types are supported: image/jpeg, image/gif, image/png, image/webp, image/bmp, image/tiff, image/vnd.microsoft.icon. |
| prompt | The rewritten prompt used for the image generation. |
| Union field data. The image bytes or Cloud Storage URI to make the prediction on. data can be only one of the following: | |
| bytes_base64_encoded | Base64-encoded bytes string representing the image. |
| gcs_uri | The Cloud Storage URI of the image. |
| rai_filtered_reason | The reason the generated images were filtered. |
| content_type | The content type of the input object. |
| semantic_filter_response | Semantic filter results, reported when the semantic filter is turned on in editConfig and used for image inpainting. |
| safety_attributes | Safety attribute scores of the content. |
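As a sketch, the snippet below saves each generated image, handling both members of the data union as well as RAI-filtered entries. Key names follow the fields above (a REST response may use camelCase), and the payload values are placeholders, not real image data.

```python
import base64

# Hypothetical VisionGenerativeModelResult decoded from JSON.
result = {
    "images": [
        {"mime_type": "image/png",
         "bytes_base64_encoded": "cGxhY2Vob2xkZXI="},  # placeholder bytes
        {"mime_type": "image/png",
         "gcs_uri": "gs://example-bucket/out/img_1.png"},
        {"rai_filtered_reason": "placeholder filter reason"},
    ]
}

for i, image in enumerate(result["images"]):
    if "rai_filtered_reason" in image:
        print(f"image {i} filtered: {image['rai_filtered_reason']}")
    elif "bytes_base64_encoded" in image:
        # Inline image bytes: decode and write to disk.
        ext = image["mime_type"].split("/")[-1]
        with open(f"image_{i}.{ext}", "wb") as f:
            f.write(base64.b64decode(image["bytes_base64_encoded"]))
    elif "gcs_uri" in image:
        # Image stored in Cloud Storage: fetch it separately.
        print(f"image {i} stored at {image['gcs_uri']}")
```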
VisionReasoningModelResult
The response format for LVM image and video captioning is as follows:

1. Image captioning: From the LVM image2text (PaLi) model, the responses are descriptions of the same image.
2. Video captioning: From the LVM video2text (Penguin) model, the responses are different segments within the same video. The response also contains the start and end offsets of the video segment. Video captioning response format: "[start_offset, end_offset) - text_response".
| Fields | |
|---|---|
| text_responses[] | List of text responses in the given text language. |
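The snippet below is a best-effort sketch of parsing video-captioning responses in the "[start_offset, end_offset) - text_response" format described above; the captions are invented.

```python
import re

# Matches "[start_offset, end_offset) - text_response".
SEGMENT_RE = re.compile(r"^\[([\d.]+), ([\d.]+)\) - (.*)$")

text_responses = [
    "[0, 5) - A dog runs across a field.",     # hypothetical captions
    "[5, 12) - The dog catches a frisbee.",
]

for response in text_responses:
    match = SEGMENT_RE.match(response)
    if match:
        start, end, caption = match.groups()
        print(f"{float(start):5.1f}s-{float(end):5.1f}s: {caption}")
    else:
        # Image-captioning responses carry no offsets; keep them as-is.
        print(response)
```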