Format for Gemini examples used for Vertex Multimodal datasets.
model (string)
Optional. The fully qualified name of the publisher model or tuned model endpoint to use.
Publisher model format: projects/{project}/locations/{location}/publishers/*/models/*
Tuned model endpoint format: projects/{project}/locations/{location}/endpoints/{endpoint}
contents[]
Required. The content of the current conversation with the model.
For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains conversation history + latest request.
cachedContent (string)
Optional. The name of the cached content used as context to serve the prediction. Note: only used in explicit caching, where users can have control over caching (e.g. what content to cache) and enjoy guaranteed cost savings. Format: projects/{project}/locations/{location}/cachedContents/{cachedContent}
Optional. A list of Tools the model may use to generate the next response.
A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model.
Optional. Tool config. This config is shared for all tools provided in the request.
labels (map, key: string, value: string)
Optional. The labels with user-defined metadata for the request. It is used for billing and reporting only.
Label keys and values can be no longer than 63 characters (Unicode code points) and can only contain lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter.
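The label rules above can be expressed as a small validity check. This is a sketch: the regex below covers only the ASCII subset of the rules and deliberately ignores the "international characters are allowed" allowance, and the label keys and values are made up for illustration.

```python
import re

# ASCII-only approximation of the documented label-key rules:
# <= 63 chars, lowercase letters / digits / underscores / dashes,
# must start with a letter. (International characters are also allowed
# by the service, but are not handled by this sketch.)
KEY_RE = re.compile(r"^[a-z][a-z0-9_-]{0,62}$")

labels = {"team": "search", "cost-center": "ml_42"}  # example labels

# Values are optional but, like keys, capped at 63 characters.
assert all(KEY_RE.match(k) and len(v) <= 63 for k, v in labels.items())
```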
Optional. Per request settings for blocking unsafe content. Enforced on GenerateContentResponse.candidates.
Optional. Settings for prompt and response sanitization using the Model Armor service. If supplied, safetySettings must not be supplied.
Optional. Generation config.
Optional. The user provided system instructions for the model. Note: only text should be used in parts and content in each part will be in a separate paragraph.
| JSON representation |
|---|
| { "model": string, "contents": [ { object (Content) } ], "cachedContent": string, "tools": [ { object (Tool) } ], "toolConfig": { object (ToolConfig) }, "labels": { string: string, ... }, "safetySettings": [ { object (SafetySetting) } ], "modelArmorConfig": { object (ModelArmorConfig) }, "generationConfig": { object (GenerationConfig) }, "systemInstruction": { object (Content) } } |
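The request fields above can be sketched as a plain JSON-style body. The model path, project, and label values below are placeholders, and the Content shape (role, parts, text) is assumed from the wider API rather than defined in this section.

```python
# Minimal request body using the fields documented above (values are placeholders).
request = {
    "model": "projects/my-project/locations/us-central1/publishers/google/models/some-model",
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize this document."}]},
    ],
    "labels": {"team": "search", "env": "dev"},  # billing/reporting metadata only
}

# For multi-turn queries, contents is a repeated field carrying the
# conversation history plus the latest request.
request["contents"].append(
    {"role": "model", "parts": [{"text": "Sure, which document?"}]}
)
```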
Tool
Tool details that the model may use to generate response.
A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model. A Tool object should contain exactly one type of Tool (e.g FunctionDeclaration, Retrieval or GoogleSearchRetrieval).
Optional. Function tool type. One or more function declarations to be passed to the model along with the current user query. The model may decide to call a subset of these functions by populating FunctionCall in the response. The user should provide a FunctionResponse for each function call in the next turn. Based on the function responses, the model will generate the final response back to the user. A maximum of 512 function declarations can be provided.
Optional. Retrieval tool type. System will always execute the provided retrieval tool(s) to get external knowledge to answer the prompt. Retrieval results are presented to the model for generation.
Optional. GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google.
Optional. Specialized retrieval tool that is powered by Google Search.
Optional. GoogleMaps tool type. Tool to support Google Maps in Model.
Optional. Tool to support searching public web data, powered by Agent Platform Search and Sec4 compliance.
Optional. CodeExecution tool type. Enables the model to execute code as part of generation.
Optional. Tool to support URL context retrieval.
Optional. Tool to support the model interacting directly with the computer. If enabled, it automatically populates computer-use specific Function Declarations.
| JSON representation |
|---|
| { "functionDeclarations": [ { object (FunctionDeclaration) } ], "retrieval": { object (Retrieval) }, "googleSearch": { object (GoogleSearch) }, "googleSearchRetrieval": { object (GoogleSearchRetrieval) }, "googleMaps": { object (GoogleMaps) }, "enterpriseWebSearch": { object (EnterpriseWebSearch) }, "codeExecution": { object (CodeExecution) }, "urlContext": { object (UrlContext) }, "computerUse": { object (ComputerUse) } } |
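A Tool carrying a single function declaration can be sketched as below. The FunctionDeclaration shape (name, description, OpenAPI-style parameters) is assumed from the wider API, and the function and resource names are placeholders. Because a Tool object should contain exactly one tool type, a request that needs both functions and retrieval uses two Tool entries.

```python
# One Tool with a single (hypothetical) function declaration.
weather_tool = {
    "functionDeclarations": [
        {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ]
}

# A second Tool for retrieval: each Tool holds exactly one tool type.
tools = [
    weather_tool,
    {"retrieval": {"vertexAiSearch": {
        "datastore": "projects/p/locations/global/collections/c/dataStores/d"
    }}},
]

assert len(weather_tool["functionDeclarations"]) <= 512  # documented maximum
```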
Retrieval
Defines a retrieval tool that model can call to access external knowledge.
disableAttribution (boolean, deprecated)
Optional. Deprecated. This option is no longer supported.
source (union type)
source can be only one of the following:
Set to use the data source powered by Agent Platform Search.
Set to use the data source powered by Vertex RAG store. User data is uploaded via the VertexRagDataService.
| JSON representation |
|---|
| { "disableAttribution": boolean, // source "vertexAiSearch": { object (VertexAISearch) }, "vertexRagStore": { object (VertexRagStore) } // Union type } |
VertexAISearch
Retrieve from Agent Platform Search datastore or engine for grounding. datastore and engine are mutually exclusive. See https://cloud.google.com/products/agent-builder
datastore (string)
Optional. Fully-qualified Agent Platform Search data store resource ID. Format: projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}
engine (string)
Optional. Fully-qualified Agent Platform Search engine resource ID. Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}
maxResults (integer)
Optional. Number of search results to return per query. The default value is 10. The maximum allowed value is 10.
filter (string)
Optional. Filter strings to be passed to the search API.
dataStoreSpecs[]
Specifications that define the specific DataStores to be searched, along with configurations for those data stores. This is only considered for Engines with multiple data stores. It should only be set if engine is used.
| JSON representation |
|---|
| { "datastore": string, "engine": string, "maxResults": integer, "filter": string, "dataStoreSpecs": [ { object (DataStoreSpec) } ] } |
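Since datastore and engine are mutually exclusive, a small constructor can make the rule explicit. This is a sketch, not part of the API; the helper name and resource paths are made up, and the cap reflects the documented maximum of 10 results.

```python
# Build a VertexAISearch config, enforcing the datastore/engine exclusivity.
def make_vertex_ai_search(datastore=None, engine=None, max_results=10):
    if (datastore is None) == (engine is None):
        raise ValueError("set exactly one of datastore or engine")
    cfg = {"maxResults": min(max_results, 10)}  # maximum allowed value is 10
    if datastore is not None:
        cfg["datastore"] = datastore
    else:
        cfg["engine"] = engine
    return cfg

cfg = make_vertex_ai_search(
    datastore="projects/p/locations/global/collections/default_collection/dataStores/my-store"
)
```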
DataStoreSpec
Define data stores within engine to filter on in a search call and configurations for those data stores. For more information, see https://cloud.google.com/generative-ai-app-builder/docs/reference/rpc/google.cloud.discoveryengine.v1#datastorespec
dataStore (string)
Full resource name of the DataStore. Format: projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}
filter (string)
Optional. Filter specification to filter documents in the data store specified by dataStore field. For more information on filtering, see Filtering
| JSON representation |
|---|
{ "dataStore": string, "filter": string } |
VertexRagStore
Retrieve from Vertex RAG Store for grounding.
ragCorpora[] (string, deprecated)
Optional. Deprecated. Please use ragResources instead.
ragResources[]
Optional. The representation of the RAG source. It can be used to specify a corpus only, or RAG files. Currently only one corpus, or multiple files from one corpus, is supported. In the future we may open up multiple-corpora support.
Optional. The retrieval config for the Rag query.
storeContext (boolean)
Optional. Currently only supported for the Gemini Multimodal Live API.
In the Gemini Multimodal Live API, if storeContext is set, Gemini will use it to automatically memorize the interactions between the client and Gemini, and retrieve context when needed to augment response generation for the user's ongoing and future interactions.
similarityTopK (integer, deprecated)
Optional. Number of top k results to return from the selected corpora.
vectorDistanceThreshold (number, deprecated)
Optional. Only return results with vector distance smaller than the threshold.
| JSON representation |
|---|
| { "ragCorpora": [ string ], "ragResources": [ { object (RagResource) } ], "ragRetrievalConfig": { object (RagRetrievalConfig) }, "storeContext": boolean, "similarityTopK": integer, "vectorDistanceThreshold": number } |
RagResource
The definition of the Rag resource.
ragCorpus (string)
Optional. RagCorpus resource name. Format: projects/{project}/locations/{location}/ragCorpora/{ragCorpus}
ragFileIds[] (string)
Optional. A list of RagFile IDs. The files should be in the same RagCorpus set in the ragCorpus field.
| JSON representation |
|---|
{ "ragCorpus": string, "ragFileIds": [ string ] } |
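The "one corpus, or multiple files from one corpus" rule above can be sketched with two valid RagResource shapes. The corpus path and file IDs are placeholders.

```python
# A RagResource can name a corpus alone, or a corpus plus files from it.
corpus = "projects/p/locations/us-central1/ragCorpora/123"

corpus_only = {"ragCorpus": corpus}
files_from_corpus = {"ragCorpus": corpus, "ragFileIds": ["file-a", "file-b"]}

# The files must belong to the corpus named in ragCorpus; mixing corpora
# in one resource is not supported today.
```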
RagRetrievalConfig
Specifies the context retrieval config.
topK (integer)
Optional. The number of contexts to retrieve.
Optional. Config for Hybrid Search.
Optional. Config for filters.
Optional. Config for ranking and reranking.
| JSON representation |
|---|
| { "topK": integer, "hybridSearch": { object (HybridSearch) }, "filter": { object (Filter) }, "ranking": { object (Ranking) } } |
HybridSearch
Config for Hybrid Search.
alpha (number)
Optional. Alpha value controls the weight between dense and sparse vector search results. The range is [0, 1], while 0 means sparse vector search only and 1 means dense vector search only. The default value is 0.5 which balances sparse and dense vector search equally.
| JSON representation |
|---|
{ "alpha": number } |
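The alpha semantics above can be illustrated with a toy score combiner. This is only a sketch of the documented weighting (0 = sparse only, 1 = dense only, default 0.5 balances both); the actual fusion is done server-side by the RAG service.

```python
# Toy linear fusion mirroring the documented alpha semantics.
def combine(dense_score, sparse_score, alpha=0.5):
    assert 0.0 <= alpha <= 1.0  # documented range is [0, 1]
    return alpha * dense_score + (1 - alpha) * sparse_score

assert combine(1.0, 0.0, alpha=1.0) == 1.0   # dense vector search only
assert combine(1.0, 0.0, alpha=0.0) == 0.0   # sparse vector search only
assert combine(1.0, 0.0) == 0.5              # default balances both equally
```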
Filter
Config for filters.
metadataFilter (string)
Optional. String for metadata filtering.
vector_db_threshold (union type)
vector_db_threshold can be only one of the following:
vectorDistanceThreshold (number)
Optional. Only returns contexts with vector distance smaller than the threshold.
vectorSimilarityThreshold (number)
Optional. Only returns contexts with vector similarity larger than the threshold.
| JSON representation |
|---|
{ "metadataFilter": string, // vector_db_threshold "vectorDistanceThreshold": number, "vectorSimilarityThreshold": number // Union type } |
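Because vector_db_threshold is a union, only one of the two threshold fields may be set. A small constructor (hypothetical, not part of the API) can enforce that; the metadata filter string is a made-up example.

```python
# Build a Filter config, enforcing the vector_db_threshold union.
def make_filter(metadata_filter=None, distance=None, similarity=None):
    if distance is not None and similarity is not None:
        raise ValueError(
            "vectorDistanceThreshold and vectorSimilarityThreshold are mutually exclusive"
        )
    f = {}
    if metadata_filter is not None:
        f["metadataFilter"] = metadata_filter
    if distance is not None:
        f["vectorDistanceThreshold"] = distance
    if similarity is not None:
        f["vectorSimilarityThreshold"] = similarity
    return f

f = make_filter(metadata_filter='category = "faq"', similarity=0.7)
```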
Ranking
Config for ranking and reranking.
ranking_config (union type)
ranking_config can be only one of the following:
Optional. Config for Rank service.
Optional. Config for LlmRanker.
| JSON representation |
|---|
{ // ranking_config "rankService": { object ( |
RankService
Config for Rank service.
modelName (string)
Optional. The model name of the rank service. Format: semantic-ranker-512@latest
| JSON representation |
|---|
{ "modelName": string } |
LlmRanker
Config for LlmRanker.
modelName (string)
Optional. The model name used for ranking. See Supported models.
| JSON representation |
|---|
{ "modelName": string } |
GoogleSearch
GoogleSearch tool type. Tool to support Google Search in Model. Powered by Google.
excludeDomains[] (string)
Optional. List of domains to be excluded from the search results. The default limit is 2000 domains. Example: ["amazon.com", "facebook.com"].
blockingConfidence (enum)
Optional. Sites with a confidence level at or above this value will be blocked from the search results.
| JSON representation |
|---|
| { "excludeDomains": [ string ], "blockingConfidence": enum } |
GoogleSearchRetrieval
Tool to retrieve public web data for grounding, powered by Google.
dynamicRetrievalConfig
Specifies the dynamic retrieval configuration for the given source.
| JSON representation |
|---|
| { "dynamicRetrievalConfig": { object (DynamicRetrievalConfig) } } |
GoogleMaps
Tool to retrieve public maps data for grounding, powered by Google.
enableWidget (boolean)
Optional. If true, include the widget context token in the response.
| JSON representation |
|---|
{ "enableWidget": boolean } |
CodeExecution
This type has no fields.
Tool that executes code generated by the model, and automatically returns the result to the model.
See also ExecutableCode and CodeExecutionResult, which are input and output to this tool.
UrlContext
This type has no fields.
Tool to support URL context.
ComputerUse
Tool to support computer use.
environment (enum)
Required. The environment being operated in.
excludedPredefinedFunctions[] (string)
Optional. By default, predefined functions are included in the final model call. Some of them can be explicitly excluded from being automatically included. This can serve two purposes: 1. Using a more restricted or different action space. 2. Improving the definitions or instructions of predefined functions.
| JSON representation |
|---|
| { "environment": enum (Environment), "excludedPredefinedFunctions": [ string ] } |
ToolConfig
Tool config. This config is shared for all tools provided in the request.
Optional. Function calling config.
Optional. Retrieval config.
| JSON representation |
|---|
| { "functionCallingConfig": { object (FunctionCallingConfig) }, "retrievalConfig": { object (RetrievalConfig) } } |
RetrievalConfig
Retrieval config.
latLng
The location of the user.
languageCode (string)
The language code of the user.
| JSON representation |
|---|
| { "latLng": { object (LatLng) }, "languageCode": string } |
LatLng
An object that represents a latitude/longitude pair. This is expressed as a pair of doubles to represent degrees latitude and degrees longitude. Unless specified otherwise, this object must conform to the WGS84 standard. Values must be within normalized ranges.
latitude (number)
The latitude in degrees. It must be in the range [-90.0, +90.0].
longitude (number)
The longitude in degrees. It must be in the range [-180.0, +180.0].
| JSON representation |
|---|
{ "latitude": number, "longitude": number } |
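The normalized-range requirement can be sketched as a small validator. The helper and the coordinates below are illustrative, not part of the API.

```python
# Validate LatLng values against the documented WGS84 ranges.
def make_lat_lng(latitude, longitude):
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be in [-90.0, +90.0]")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be in [-180.0, +180.0]")
    return {"latitude": latitude, "longitude": longitude}

# Example RetrievalConfig using the validated pair (values are placeholders).
retrieval_config = {"latLng": make_lat_lng(48.1372, 11.5756), "languageCode": "de"}
```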
SafetySetting
A safety setting that affects the safety-blocking behavior.
A SafetySetting consists of a harm category and a threshold for that category.
Required. The harm category to be blocked.
Required. The threshold for blocking content. If the harm probability exceeds this threshold, the content will be blocked.
Optional. The method for blocking content. If not specified, the default behavior is to use the probability score.
| JSON representation |
|---|
| { "category": enum (HarmCategory), "threshold": enum (HarmBlockThreshold), "method": enum (HarmBlockMethod) } |
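A safety setting pairs a harm category with a blocking threshold, as described above. The enum string values below are assumptions drawn from the HarmCategory and HarmBlockThreshold enums, which are defined outside this section.

```python
# Example safetySettings list; enum values are assumed, not defined here.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_LOW_AND_ABOVE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]

# category and threshold are both required; method is optional and defaults
# to using the probability score.
assert all({"category", "threshold"} <= set(s) for s in safety_settings)
```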
ModelArmorConfig
Configuration for Model Armor.
Model Armor is a Google Cloud service that provides safety and security filtering for prompts and responses. It helps protect your AI applications from risks such as harmful content, sensitive data leakage, and prompt injection attacks.
promptTemplateName (string)
Optional. The resource name of the Model Armor template to use for prompt screening.
A Model Armor template is a set of customized filters and thresholds that define how Model Armor screens content. If specified, Model Armor will use this template to check the user's prompt for safety and security risks before it is sent to the model.
The name must be in the format projects/{project}/locations/{location}/templates/{template}.
responseTemplateName (string)
Optional. The resource name of the Model Armor template to use for response screening.
A Model Armor template is a set of customized filters and thresholds that define how Model Armor screens content. If specified, Model Armor will use this template to check the model's response for safety and security risks before it is returned to the user.
The name must be in the format projects/{project}/locations/{location}/templates/{template}.
| JSON representation |
|---|
{ "promptTemplateName": string, "responseTemplateName": string } |
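The template-name format above can be checked with a simple pattern. The template and project names are placeholders; the regex is a sketch of the documented format, not a service-side validation rule.

```python
import re

# Model Armor template names follow
# projects/{project}/locations/{location}/templates/{template}.
TEMPLATE_RE = re.compile(r"^projects/[^/]+/locations/[^/]+/templates/[^/]+$")

cfg = {
    "promptTemplateName": "projects/my-proj/locations/us-central1/templates/strict-screening",
    "responseTemplateName": "projects/my-proj/locations/us-central1/templates/strict-screening",
}
assert all(TEMPLATE_RE.match(v) for v in cfg.values())
```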
GenerationConfig
Configuration for content generation.
This message contains all the parameters that control how the model generates content. It allows you to influence the randomness, length, and structure of the output.
stopSequences[] (string)
Optional. A list of character sequences that will stop the model from generating further tokens. If a stop sequence is generated, the output will end at that point. This is useful for controlling the length and structure of the output. For example, you can use ["\n", "###"] to stop generation at a new line or a specific marker.
responseMimeType (string)
Optional. The IANA standard MIME type of the response. The model will generate output that conforms to this MIME type. Supported values include 'text/plain' (default) and 'application/json'. The model needs to be prompted to output the appropriate response type, otherwise the behavior is undefined.
Optional. The modalities of the response. The model will generate a response that includes all the specified modalities. For example, if this is set to [TEXT, IMAGE], the response will include both text and an image.
Optional. Configuration for thinking features. An error will be returned if this field is set for models that don't support thinking.
Optional. Config for model selection.
temperature (number)
Optional. Controls the randomness of the output. A higher temperature results in more creative and diverse responses, while a lower temperature makes the output more predictable and focused. The valid range is (0.0, 2.0].
topP (number)
Optional. Specifies the nucleus sampling threshold. The model considers only the smallest set of tokens whose cumulative probability is at least topP. This helps generate more diverse and less repetitive responses. For example, a topP of 0.9 means the model considers tokens until the cumulative probability of the tokens to select from reaches 0.9. It's recommended to adjust either temperature or topP, but not both.
topK (number)
Optional. Specifies the top-k sampling threshold. The model considers only the top k most probable tokens for the next token. This can be useful for generating more coherent and less random text. For example, a topK of 40 means the model will choose the next word from the 40 most likely words.
candidateCount (integer)
Optional. The number of candidate responses to generate.
A higher candidateCount can provide more options to choose from, but it also consumes more resources. This can be useful for generating a variety of responses and selecting the best one.
maxOutputTokens (integer)
Optional. The maximum number of tokens to generate in the response.
A token is approximately four characters. The default value varies by model. This parameter can be used to control the length of the generated text and prevent overly long responses.
responseLogprobs (boolean)
Optional. If set to true, the log probabilities of the output tokens are returned.
Log probabilities are the logarithm of the probability of a token appearing in the output. A higher log probability means the token is more likely to be generated. This can be useful for analyzing the model's confidence in its own output and for debugging.
logprobs (integer)
Optional. The number of top log probabilities to return for each token.
This can be used to see which other tokens were considered likely candidates for a given position. A higher value will return more options, but it will also increase the size of the response.
presencePenalty (number)
Optional. Penalizes tokens that have already appeared in the generated text. A positive value encourages the model to generate more diverse and less repetitive text. Valid values can range from [-2.0, 2.0].
frequencyPenalty (number)
Optional. Penalizes tokens based on their frequency in the generated text. A positive value helps to reduce the repetition of words and phrases. Valid values can range from [-2.0, 2.0].
seed (integer)
Optional. A seed for the random number generator.
By setting a seed, you can make the model's output mostly deterministic. For a given prompt and parameters (like temperature, topP, etc.), the model will produce the same response every time. However, absolute determinism is not guaranteed. This is different from parameters like temperature, which control the level of randomness; seed ensures that the "random" choices the model makes are the same on every run, making it useful for testing and for reproducible results.
Optional. Lets you specify a schema for the model's response, ensuring that the output conforms to a particular structure. This is useful for generating structured data such as JSON. The schema is a subset of the OpenAPI 3.0 schema object.
When this field is set, you must also set the responseMimeType to application/json.
Optional. When this field is set, responseSchema must be omitted and responseMimeType must be set to application/json.
Optional. Routing configuration.
audioTimestamp (boolean)
Optional. If enabled, audio timestamps will be included in the request to the model. This can be useful for synchronizing audio with other modalities in the response.
Optional. The token resolution at which input media content is sampled. This is used to control the trade-off between the quality of the response and the number of tokens used to represent the media. A higher resolution allows the model to perceive more detail, which can lead to a more nuanced response, but it will also use more tokens. This does not affect the image dimensions sent to the model.
Optional. The speech generation config.
enableAffectiveDialog (boolean)
Optional. If enabled, the model will detect emotions and adapt its responses accordingly. For example, if the model detects that the user is frustrated, it may provide a more empathetic response.
Optional. Config for image generation features.
| JSON representation |
|---|
| { "stopSequences": [ string ], "responseMimeType": string, "responseModalities": [ enum (Modality) ], "thinkingConfig": { object (ThinkingConfig) }, "modelConfig": { object (ModelConfig) }, "temperature": number, "topP": number, "topK": number, "candidateCount": integer, "maxOutputTokens": integer, "responseLogprobs": boolean, "logprobs": integer, "presencePenalty": number, "frequencyPenalty": number, "seed": integer, "responseSchema": { object (Schema) }, "routingConfig": { object (RoutingConfig) }, "audioTimestamp": boolean, "mediaResolution": enum (MediaResolution), "speechConfig": { object (SpeechConfig) }, "enableAffectiveDialog": boolean, "imageConfig": { object (ImageConfig) } } |
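A representative generationConfig combining the fields documented above can be sketched as follows. The schema body is a made-up example; per the field descriptions, responseMimeType must be application/json when a response schema is supplied, and it is recommended to adjust either temperature or topP, not both.

```python
# Example generationConfig (values are illustrative, not recommendations).
generation_config = {
    "temperature": 0.2,        # lower temperature -> more predictable output
    "maxOutputTokens": 1024,   # cap the response length
    "stopSequences": ["###"],  # stop generating at this marker
    "seed": 42,                # mostly deterministic across runs
    "responseMimeType": "application/json",
    "responseSchema": {        # subset of the OpenAPI 3.0 schema object
        "type": "object",
        "properties": {"title": {"type": "string"}},
    },
}
```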
RoutingConfig
The configuration for routing the request to a specific model. This can be used to control which model is used for the generation, either automatically or by specifying a model name.
routing_config (union type)
routing_config can be only one of the following:
In this mode, the model is selected automatically based on the content of the request.
In this mode, the model is specified manually.
| JSON representation |
|---|
| { // routing_config "autoMode": { object (AutoRoutingMode) }, "manualMode": { object (ManualRoutingMode) } // Union type } |
AutoRoutingMode
The configuration for automated routing.
When automated routing is specified, the routing will be determined by the pretrained routing model and customer provided model routing preference.
modelRoutingPreference (enum)
The model routing preference.
| JSON representation |
|---|
| { "modelRoutingPreference": enum (ModelRoutingPreference) } |
ManualRoutingMode
The configuration for manual routing.
When manual routing is specified, the model will be selected based on the model name provided.
modelName (string)
The name of the model to use. Only public LLM models are accepted.
| JSON representation |
|---|
{ "modelName": string } |
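The two routing modes are mutually exclusive members of the routing_config union. In the sketch below, the model name and the routing-preference enum value are assumptions (the preference enum is defined outside this section).

```python
# routing_config is a union: set autoMode or manualMode, never both.
manual = {"manualMode": {"modelName": "some-public-model"}}      # name is a placeholder
auto = {"autoMode": {"modelRoutingPreference": "BALANCED"}}      # enum value assumed

# The two configurations share no keys, reflecting the union constraint.
assert not (set(manual) & set(auto))
```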
SpeechConfig
Configuration for speech generation.
The configuration for the voice to use.
languageCode (string)
Optional. The language code (ISO 639-1) for the speech synthesis.
The configuration for a multi-speaker text-to-speech request. This field is mutually exclusive with voiceConfig.
| JSON representation |
|---|
| { "voiceConfig": { object (VoiceConfig) }, "languageCode": string, "multiSpeakerVoiceConfig": { object (MultiSpeakerVoiceConfig) } } |
VoiceConfig
Configuration for a voice.
voice_config (union type)
voice_config can be only one of the following:
The configuration for a prebuilt voice.
Optional. The configuration for a replicated voice. This enables users to replicate a voice from an audio sample.
| JSON representation |
|---|
| { // voice_config "prebuiltVoiceConfig": { object (PrebuiltVoiceConfig) }, "replicatedVoiceConfig": { object (ReplicatedVoiceConfig) } // Union type } |
PrebuiltVoiceConfig
Configuration for a prebuilt voice.
voiceName (string)
The name of the prebuilt voice to use.
| JSON representation |
|---|
{ "voiceName": string } |
ReplicatedVoiceConfig
The configuration for the replicated voice to use.
mimeType (string)
Optional. The MIME type of the voice sample. The only currently supported value is audio/wav, which represents 16-bit signed little-endian WAV data with a 24 kHz sampling rate. mimeType defaults to audio/wav if not set.
voiceSampleAudio (string)
Optional. The sample of the custom voice.
A base64-encoded string.
| JSON representation |
|---|
{ "mimeType": string, "voiceSampleAudio": string } |
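Since voiceSampleAudio carries the WAV sample as a base64-encoded string, encoding the raw bytes looks like the sketch below. The sample bytes are a placeholder, not a real WAV file.

```python
import base64

# Encode a (placeholder) WAV sample for the voiceSampleAudio field.
wav_bytes = b"RIFF....WAVEfmt "  # stand-in for real 16-bit LE PCM @ 24 kHz

replicated = {
    "mimeType": "audio/wav",  # the only currently supported value
    "voiceSampleAudio": base64.b64encode(wav_bytes).decode("ascii"),
}

# Decoding recovers the original sample bytes.
assert base64.b64decode(replicated["voiceSampleAudio"]) == wav_bytes
```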
MultiSpeakerVoiceConfig
Configuration for a multi-speaker text-to-speech request.
speakerVoiceConfigs[]
Required. A list of configurations for the voices of the speakers. Exactly two speaker voice configurations must be provided.
| JSON representation |
|---|
| { "speakerVoiceConfigs": [ { object (SpeakerVoiceConfig) } ] } |
SpeakerVoiceConfig
Configuration for a single speaker in a multi-speaker setup.
speaker (string)
Required. The name of the speaker. This should be the same as the speaker name used in the prompt.
voiceConfig
Required. The configuration for the voice of this speaker.
| JSON representation |
|---|
| { "speaker": string, "voiceConfig": { object (VoiceConfig) } } |
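A multi-speaker request must supply exactly two speaker voice configurations, with speaker names matching those used in the prompt. The speaker and voice names below are placeholders.

```python
# Example MultiSpeakerVoiceConfig with the required two speakers.
multi_speaker = {
    "speakerVoiceConfigs": [
        {"speaker": "Alice",
         "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "voice-a"}}},
        {"speaker": "Bob",
         "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "voice-b"}}},
    ]
}
assert len(multi_speaker["speakerVoiceConfigs"]) == 2  # exactly two required
```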
ThinkingConfig
Configuration for the model's thinking features.
"Thinking" is a process where the model breaks down a complex task into smaller, manageable steps. This allows the model to reason about the task, plan its approach, and execute the plan to generate a high-quality response.
includeThoughts (boolean)
Optional. If true, the model will include its thoughts in the response. "Thoughts" are the intermediate steps the model takes to arrive at the final response. They can provide insights into the model's reasoning process and help with debugging. If this is true, thoughts are returned only when available.
thinkingBudget (integer)
Optional. The token budget for the model's thinking process. The model will make a best effort to stay within this budget. This can be used to control the trade-off between response quality and latency.
thinkingLevel (enum)
Optional. The number of thought tokens that the model should generate.
| JSON representation |
|---|
| { "includeThoughts": boolean, "thinkingBudget": integer, "thinkingLevel": enum (ThinkingLevel) } |
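A thinking config that exposes intermediate thoughts and caps the thinking token budget can be sketched as below. The generationConfig field name carrying it (thinkingConfig) is taken from the GenerationConfig JSON representation; the budget value is illustrative.

```python
# Expose thoughts and cap the thinking budget (a best-effort limit).
thinking_config = {
    "includeThoughts": True,   # return intermediate reasoning when available
    "thinkingBudget": 2048,    # token budget for the thinking process
}

generation_config = {"thinkingConfig": thinking_config}
```

Note that setting this field for a model that does not support thinking returns an error.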
ModelConfig
Config for model selection.
featureSelectionPreference (enum)
Required. The feature selection preference.
| JSON representation |
|---|
| { "featureSelectionPreference": enum (FeatureSelectionPreference) } |
ImageConfig
Configuration for image generation.
This message allows you to control various aspects of image generation, such as the output format, aspect ratio, and whether the model can generate images of people.
Optional. The image output format for generated images.
aspectRatio (string)
Optional. The desired aspect ratio for the generated images. The following aspect ratios are supported:
"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"
Optional. Controls whether the model can generate people.
imageSize (string)
Optional. Specifies the size of generated images. Supported values are 1K, 2K, and 4K. If not specified, the model uses the default value of 1K.
| JSON representation |
|---|
| { "imageOutputOptions": { object (ImageOutputOptions) }, "aspectRatio": string, "personGeneration": enum (PersonGeneration), "imageSize": string } |
ImageOutputOptions
The image output format for generated images.
mimeType (string)
Optional. The image format that the output should be saved as.
compressionQuality (integer)
Optional. The compression quality of the output image.
| JSON representation |
|---|
{ "mimeType": string, "compressionQuality": integer } |
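The image-generation knobs above can be combined as in the sketch below. The MIME type and quality values are illustrative assumptions; aspectRatio must be one of the documented strings, and imageSize defaults to 1K when omitted.

```python
# Supported aspect ratios, as listed in the aspectRatio field description.
SUPPORTED_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}

image_config = {
    "aspectRatio": "16:9",
    "imageSize": "2K",  # one of 1K, 2K, 4K; defaults to 1K if omitted
    "imageOutputOptions": {"mimeType": "image/jpeg", "compressionQuality": 85},
}
assert image_config["aspectRatio"] in SUPPORTED_RATIOS
```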