VisionGenerativeModelInstance

Media generation input format for large vision models.

Fields
image object (Image)

The image bytes or Cloud Storage URI to make the prediction on. Required for editing; not needed for generation. The presence of this field determines whether the call is treated as editing or generation.

prompt string

The text prompt for generating the images. This is required for both editing and generation.

mask object (Mask)

The masked region is edited based on the text prompt provided. The mask can be either an image or a polygon list. Optional, and used only for editing; it must not be provided without an image.

referenceImages[] object (ReferenceImage)

The reference images used for editing and customization capabilities. Imagen 3 Capability adds support for multiple reference images, each of which can be a mask, control, style, or subject image. Depending on the reference type, the reference_config field is populated with the corresponding config.

JSON representation
{
  "image": {
    object (Image)
  },
  "prompt": string,
  "mask": {
    object (Mask)
  },
  "referenceImages": [
    {
      object (ReferenceImage)
    }
  ]
}
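
As a rough illustration of the field rules above, the following Python sketch assembles an instance as a plain dictionary and infers the call type from the presence of image. The helpers build_instance and call_kind are hypothetical names, not part of any SDK, and the gs:// path is a placeholder.

```python
from typing import Any, Dict, List, Optional


def build_instance(
    prompt: str,
    image: Optional[Dict[str, Any]] = None,
    mask: Optional[Dict[str, Any]] = None,
    reference_images: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a VisionGenerativeModelInstance-shaped dict (hypothetical helper)."""
    if not prompt:
        raise ValueError("prompt is required for both editing and generation")
    if mask is not None and image is None:
        raise ValueError("mask must not be provided without an image")
    instance: Dict[str, Any] = {"prompt": prompt}
    if image is not None:
        instance["image"] = image
    if mask is not None:
        instance["mask"] = mask
    if reference_images:
        instance["referenceImages"] = reference_images
    return instance


def call_kind(instance: Dict[str, Any]) -> str:
    """Mirror the documented rule: an image means editing, otherwise generation."""
    return "editing" if "image" in instance else "generation"


generation = build_instance("A watercolor lighthouse at dusk")
editing = build_instance(
    "Replace the sky with a sunset",
    image={"mimeType": "image/png", "gcsUri": "gs://example-bucket/input.png"},
)
print(call_kind(generation), call_kind(editing))
```

The checks are client-side conveniences only; the service remains the authority on what it accepts.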

Image

Fields
mimeType string

The MIME type of the image content. Only the following MIME types are supported:
- image/jpeg
- image/png

data Union type
The image bytes or Cloud Storage URI to make the prediction on. data can be only one of the following:
bytesBase64Encoded string

Base64 encoded bytes string representing the image.

gcsUri string

The Cloud Storage URI of the image.

JSON representation
{
  "mimeType": string,

  // data
  "bytesBase64Encoded": string,
  "gcsUri": string
  // Union type
}
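
Because data is a union, a request sets either bytesBase64Encoded or gcsUri, not both. A minimal Python sketch of that rule follows; make_image is a hypothetical helper, the byte string and bucket path are placeholders, and requiring exactly one member to be set is an assumption made for this example.

```python
import base64
from typing import Dict, Optional

# MIME types listed as supported in the mimeType field description.
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def make_image(
    mime_type: str,
    raw_bytes: Optional[bytes] = None,
    gcs_uri: Optional[str] = None,
) -> Dict[str, str]:
    """Build an Image-shaped dict with one member of the data union set."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    if (raw_bytes is None) == (gcs_uri is None):
        raise ValueError("set exactly one of raw_bytes or gcs_uri")
    if raw_bytes is not None:
        # JSON cannot carry raw bytes, so they are sent as a Base64 string.
        encoded = base64.b64encode(raw_bytes).decode("ascii")
        return {"mimeType": mime_type, "bytesBase64Encoded": encoded}
    return {"mimeType": mime_type, "gcsUri": gcs_uri}


inline = make_image("image/png", raw_bytes=b"\x89PNG")
remote = make_image("image/jpeg", gcs_uri="gs://example-bucket/photo.jpg")
```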

Mask

Fields
data Union type
data can be only one of the following:
image object (Image)
polygonList object (BoundingPolyList)
JSON representation
{

  // data
  "image": {
    object (Image)
  },
  "polygonList": {
    object (BoundingPolyList)
  }
  // Union type
}
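
The same one-of rule applies to Mask: it carries either an image or a polygonList. The Python sketch below shows one way to build either variant. make_mask is a hypothetical helper, and BoundingPoly entries are passed through as opaque dictionaries because their fields are not described in this section.

```python
from typing import Any, Dict, List, Optional


def make_mask(
    image: Optional[Dict[str, Any]] = None,
    polygons: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a Mask-shaped dict from an Image dict or a list of BoundingPoly dicts."""
    if (image is None) == (polygons is None):
        raise ValueError("set exactly one of image or polygons")
    if image is not None:
        return {"image": image}
    # Wrap the polygons in a BoundingPolyList; each entry is left untouched.
    return {"polygonList": {"polygons": list(polygons)}}


image_mask = make_mask(
    image={"mimeType": "image/png", "gcsUri": "gs://example-bucket/mask.png"}
)
```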

BoundingPolyList

Fields
polygons[] object (BoundingPoly)
JSON representation
{
  "polygons": [
    {
      object (BoundingPoly)
    }
  ]
}

ReferenceImage

A ReferenceImage is an image that provides additional context for image generation or editing.

Fields
referenceImage object (Image)

The actual image data of the reference image.

referenceId integer

The id of the reference image. This must be unique within the request.

referenceType enum (ReferenceType)

The type of the reference image.

reference_config Union type
A config describing the reference image. reference_config can be only one of the following:
maskImageConfig object (MaskImageConfig)

A config for a mask image.

controlImageConfig object (ControlImageConfig)

A config for a control image.

styleImageConfig object (StyleImageConfig)

A config for a style image.

subjectImageConfig object (SubjectImageConfig)

A config for a subject image.

JSON representation
{
  "referenceImage": {
    object (Image)
  },
  "referenceId": integer,
  "referenceType": enum (ReferenceType),

  // reference_config
  "maskImageConfig": {
    object (MaskImageConfig)
  },
  "controlImageConfig": {
    object (ControlImageConfig)
  },
  "styleImageConfig": {
    object (StyleImageConfig)
  },
  "subjectImageConfig": {
    object (SubjectImageConfig)
  }
  // Union type
}
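
Two constraints stated above lend themselves to a client-side check: referenceId must be unique within the request, and reference_config holds at most one of the four configs. The following Python sketch is one possible check, not an official validator. check_reference_images is a hypothetical helper, and the referenceType strings in the sample data are assumed for illustration; see ReferenceType for the actual values.

```python
from typing import Any, Dict, Iterable, List

# The four members of the reference_config union.
CONFIG_KEYS = (
    "maskImageConfig",
    "controlImageConfig",
    "styleImageConfig",
    "subjectImageConfig",
)


def check_reference_images(refs: Iterable[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems: List[str] = []
    seen_ids = set()
    for index, ref in enumerate(refs):
        ref_id = ref.get("referenceId")
        if ref_id in seen_ids:
            problems.append(f"referenceImages[{index}]: duplicate referenceId {ref_id}")
        seen_ids.add(ref_id)
        configs = [key for key in CONFIG_KEYS if key in ref]
        if len(configs) > 1:
            problems.append(f"referenceImages[{index}]: multiple configs set: {configs}")
    return problems


refs = [
    {
        "referenceId": 1,
        "referenceType": "REFERENCE_TYPE_STYLE",  # assumed enum string
        "referenceImage": {"mimeType": "image/png", "gcsUri": "gs://example-bucket/style.png"},
        "styleImageConfig": {"styleDescription": "flat pastel illustration"},
    },
    {
        "referenceId": 1,  # deliberately duplicated to trigger the check
        "referenceType": "REFERENCE_TYPE_SUBJECT",  # assumed enum string
        "referenceImage": {"mimeType": "image/png", "gcsUri": "gs://example-bucket/subject.png"},
        "subjectImageConfig": {"subjectDescription": "a red bicycle"},
    },
]
problems = check_reference_images(refs)
```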

MaskImageConfig

Config for masked image editing using Imagen 3 Capability.

Fields
maskMode enum (MaskMode)

Mode used to generate the mask if a mask is not provided.

dilation number

Dilation to apply to this mask. The mask is dilated by this value before the edit mode is applied.

maskClasses[] integer

The segmentation classes used in MASK_MODE_SEMANTIC mode.

JSON representation
{
  "maskMode": enum (MaskMode),
  "dilation": number,
  "maskClasses": [
    integer
  ]
}
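
Since maskClasses is described only in connection with MASK_MODE_SEMANTIC, a builder can guard against pairing it with another mode. The Python sketch below does so. make_mask_image_config is a hypothetical helper, treating the mismatch as an error is an assumption of this example rather than documented service behavior, and the dilation and class values are placeholders.

```python
from typing import Any, Dict, List, Optional


def make_mask_image_config(
    mask_mode: str,
    dilation: Optional[float] = None,
    mask_classes: Optional[List[int]] = None,
) -> Dict[str, Any]:
    """Build a MaskImageConfig-shaped dict, omitting fields that were not set."""
    if mask_classes and mask_mode != "MASK_MODE_SEMANTIC":
        raise ValueError("maskClasses is only used in MASK_MODE_SEMANTIC mode")
    config: Dict[str, Any] = {"maskMode": mask_mode}
    if dilation is not None:
        config["dilation"] = dilation
    if mask_classes:
        config["maskClasses"] = list(mask_classes)
    return config


semantic = make_mask_image_config(
    "MASK_MODE_SEMANTIC", dilation=0.01, mask_classes=[7, 12]
)
```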

ControlImageConfig

Config for a control image used for editing.

Fields
controlType enum (ControlType)

Type of control image.

enableControlImageComputation boolean

Whether to compute the control image for the request.

superpixelRegionSize integer

Region size of the superpixel control image.

superpixelRuler number

Ruler of the superpixel control image.

JSON representation
{
  "controlType": enum (ControlType),
  "enableControlImageComputation": boolean,
  "superpixelRegionSize": integer,
  "superpixelRuler": number
}
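
A ControlImageConfig can be assembled the same way as the other configs. In the Python sketch below, make_control_image_config is a hypothetical helper and "CONTROL_TYPE_EXAMPLE" is a placeholder string, not a documented ControlType value. Omitting unset fields is a design choice of this example, so the payload carries only what the caller specified.

```python
from typing import Any, Dict, Optional


def make_control_image_config(
    control_type: str,
    enable_computation: Optional[bool] = None,
    superpixel_region_size: Optional[int] = None,
    superpixel_ruler: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a ControlImageConfig-shaped dict, omitting fields left as None."""
    candidate = {
        "controlType": control_type,
        "enableControlImageComputation": enable_computation,
        "superpixelRegionSize": superpixel_region_size,
        "superpixelRuler": superpixel_ruler,
    }
    # An explicit False is kept; only None (unset) values are dropped.
    return {key: value for key, value in candidate.items() if value is not None}


control = make_control_image_config("CONTROL_TYPE_EXAMPLE", enable_computation=False)
```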

StyleImageConfig

Config for a style image used for editing.

Fields
styleDescription string

Description of the style image.

JSON representation
{
  "styleDescription": string
}

SubjectImageConfig

Config for a subject image used for editing.

Fields
subjectDescription string

Description of the subject image.

subjectType enum (SubjectType)

Type of subject image.

JSON representation
{
  "subjectDescription": string,
  "subjectType": enum (SubjectType)
}
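
To show how a SubjectImageConfig sits inside a ReferenceImage, the Python sketch below builds one subject reference and serializes it to JSON. make_subject_reference is a hypothetical helper, and the enum strings "REFERENCE_TYPE_SUBJECT" and "SUBJECT_TYPE_EXAMPLE" are assumed placeholders; consult ReferenceType and SubjectType for the actual values.

```python
import json
from typing import Any, Dict


def make_subject_reference(
    reference_id: int,
    image: Dict[str, Any],
    description: str,
    subject_type: str,
) -> Dict[str, Any]:
    """Build a ReferenceImage-shaped dict carrying a subjectImageConfig."""
    return {
        "referenceImage": image,
        "referenceId": reference_id,
        "referenceType": "REFERENCE_TYPE_SUBJECT",  # assumed enum string
        "subjectImageConfig": {
            "subjectDescription": description,
            "subjectType": subject_type,
        },
    }


subject_ref = make_subject_reference(
    reference_id=1,
    image={"mimeType": "image/jpeg", "gcsUri": "gs://example-bucket/subject.jpg"},
    description="a ceramic teapot",
    subject_type="SUBJECT_TYPE_EXAMPLE",  # placeholder, not a documented value
)
payload = json.dumps({"referenceImages": [subject_ref]}, indent=2)
```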