Media embedding input format for the large vision model embedding API.
The image bytes or Cloud Storage URI to generate the image embedding.
text
string
The text for generating the text embedding.
The video bytes or Cloud Storage URI to generate the video embedding.
Image
The image bytes or Cloud Storage URI to make the prediction on.
mimeType
string
The MIME type of the image content. Only the following MIME types are supported:
- image/jpeg
- image/png
data
Union type
data can be only one of the following:
bytesBase64Encoded
string
Base64 encoded bytes string representing the image.
gcsUri
string
JSON representation

    {
      "mimeType": string,

      // Union type data: set only one of the following fields.
      "bytesBase64Encoded": string,
      "gcsUri": string
    }
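The following is a minimal sketch of how a client might assemble an Image object in Python. The helper name `make_image_payload` and its parameters are hypothetical and not part of the API; only the field names `mimeType`, `bytesBase64Encoded`, and `gcsUri` and the two supported MIME types come from the reference above. Treating `mimeType` as optional is an assumption made here, since the reference does not say whether it is required.

```python
import base64
from typing import Optional

# MIME types listed as supported in the reference above.
SUPPORTED_IMAGE_MIME_TYPES = {"image/jpeg", "image/png"}


def make_image_payload(
    raw_bytes: Optional[bytes] = None,
    gcs_uri: Optional[str] = None,
    mime_type: Optional[str] = None,
) -> dict:
    """Build an Image dict (hypothetical helper, not an official client call)."""
    # The `data` union allows exactly one of the two fields.
    if (raw_bytes is None) == (gcs_uri is None):
        raise ValueError("set exactly one of raw_bytes or gcs_uri")
    if mime_type is not None and mime_type not in SUPPORTED_IMAGE_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")

    payload: dict = {}
    if mime_type is not None:
        payload["mimeType"] = mime_type
    if raw_bytes is not None:
        # The API expects a Base64 string, not raw bytes.
        payload["bytesBase64Encoded"] = base64.b64encode(raw_bytes).decode("ascii")
    else:
        payload["gcsUri"] = gcs_uri
    return payload
```

Rejecting a call that sets both sources (or neither) on the client side keeps the union constraint visible before a request is sent.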
Video
The video bytes or Cloud Storage URI to make the prediction on.
videoSegmentConfig
object (VideoSegmentConfig)
Video configurations.
data
Union type
data can be only one of the following:
bytesBase64Encoded
string
Base64 encoded bytes string representing the video.
gcsUri
string
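JSON representation

    {
      "videoSegmentConfig": {
        object (VideoSegmentConfig)
      },

      // Union type data: set only one of the following fields.
      "bytesBase64Encoded": string,
      "gcsUri": string
    }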
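As a rough illustration, a Video object could be assembled in Python along these lines. The helper `make_video_payload` and its parameter names are hypothetical; the field names `videoSegmentConfig`, `bytesBase64Encoded`, and `gcsUri` are taken from the reference above. Treating `videoSegmentConfig` as optional is an assumption of this sketch.

```python
import base64
from typing import Optional


def make_video_payload(
    raw_bytes: Optional[bytes] = None,
    gcs_uri: Optional[str] = None,
    segment_config: Optional[dict] = None,
) -> dict:
    """Build a Video dict (hypothetical helper, not an official client call)."""
    # The `data` union allows exactly one of the two fields.
    if (raw_bytes is None) == (gcs_uri is None):
        raise ValueError("set exactly one of raw_bytes or gcs_uri")

    payload: dict = {}
    if segment_config is not None:
        # Copy so later changes to the caller's dict do not leak into the payload.
        payload["videoSegmentConfig"] = dict(segment_config)
    if raw_bytes is not None:
        payload["bytesBase64Encoded"] = base64.b64encode(raw_bytes).decode("ascii")
    else:
        payload["gcsUri"] = gcs_uri
    return payload
```

For example, `make_video_payload(gcs_uri="gs://bucket/clip.mp4")` yields a dict containing only the `gcsUri` field (the bucket path is a placeholder).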
VideoSegmentConfig
Video segment configurations.
startOffsetSec
integer
The start offset of the video segment in seconds.
endOffsetSec
integer
The end offset of the video segment in seconds.
intervalSec
integer
The interval of the video for which the embedding will be generated. The minimum value for intervalSec is 4. If the interval is less than 4, an InvalidArgumentError is returned. There is no limit on the maximum value of the interval. However, an interval larger than min(video length, 120s) affects the quality of the generated embeddings.
JSON representation

    {
      "startOffsetSec": integer,
      "endOffsetSec": integer,
      "intervalSec": integer
    }
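The constraints on `intervalSec` can be checked on the client before a request is sent. The sketch below is illustrative only: `make_segment_config` and `interval_windows` are hypothetical helpers, the offset checks are assumptions not stated in the reference, and the way `interval_windows` splits the segment (consecutive, non-overlapping windows with the last one clipped to `endOffsetSec`) is an assumed reading of how intervals map onto the segment, not documented behavior.

```python
from typing import List, Tuple

MIN_INTERVAL_SEC = 4  # minimum value stated in the reference above


def make_segment_config(start_offset_sec: int, end_offset_sec: int, interval_sec: int) -> dict:
    """Build a VideoSegmentConfig dict (hypothetical helper)."""
    if interval_sec < MIN_INTERVAL_SEC:
        # The service is documented to return InvalidArgumentError in this case;
        # raising locally just surfaces the problem earlier.
        raise ValueError(f"intervalSec must be at least {MIN_INTERVAL_SEC}")
    # Assumption: offsets are non-negative and the segment is non-empty.
    if start_offset_sec < 0 or end_offset_sec <= start_offset_sec:
        raise ValueError("offsets must satisfy 0 <= startOffsetSec < endOffsetSec")
    return {
        "startOffsetSec": start_offset_sec,
        "endOffsetSec": end_offset_sec,
        "intervalSec": interval_sec,
    }


def interval_windows(config: dict) -> List[Tuple[int, int]]:
    """List (start, end) windows, assuming consecutive non-overlapping intervals."""
    start = config["startOffsetSec"]
    end = config["endOffsetSec"]
    step = config["intervalSec"]
    return [(t, min(t + step, end)) for t in range(start, end, step)]


config = make_segment_config(0, 20, 8)
print(interval_windows(config))  # [(0, 8), (8, 16), (16, 20)]
```

Under that assumed reading, a 20-second segment with an 8-second interval would produce three windows, the last one shorter than the rest.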