Tool: search
Perform a search on ingested data in Google-owned data stores.
The following sample demonstrates how to use curl to invoke the search MCP tool.
| Curl Request |
|---|
curl --location 'https://discoveryengine.googleapis.com/mcp' \
--header 'content-type: application/json' \
--header 'accept: application/json, text/event-stream' \
--data '{
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      // provide these details according to the tool's MCP specification
    }
  },
  "jsonrpc": "2.0",
  "id": 1
}' |
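The same request body can be assembled programmatically. The following is a minimal Python sketch, assuming the JSON-RPC envelope shown in the curl sample; the helper name `build_search_call` and the placeholder argument values are hypothetical, and no request is actually sent.

```python
import json

# Endpoint taken from the curl sample; it is not contacted in this sketch.
MCP_ENDPOINT = "https://discoveryengine.googleapis.com/mcp"


def build_search_call(arguments, request_id=1):
    """Wrap search arguments in the JSON-RPC envelope used by the curl sample."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search", "arguments": arguments},
    }


# Placeholder values (assumptions): substitute your own serving config and query.
payload = build_search_call(
    {"servingConfig": "YOUR_SERVING_CONFIG_RESOURCE_NAME", "query": "example query"}
)
body = json.dumps(payload)  # string suitable for curl's --data or an HTTP client
```

The resulting `body` string can be passed to any HTTP client together with the two headers from the curl sample.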
Input Schema
Request message for SearchService.Search method.
SearchRequest
| JSON representation |
|---|
{ "servingConfig": string, "branch": string, "query": string, "pageCategories": [ string ], "imageQuery": { object ( |
| Fields | |
|---|---|
servingConfig |
Required. The resource name of the Search serving config, such as |
branch |
The branch resource name, such as Use |
query |
Raw search query. |
pageCategories[] |
Optional. The categories associated with a category page. Must be set for category navigation queries to achieve good search quality. The format should be the same as If the field is empty, it will not be used by the browse model. If the field contains more than one element, only the first element will be used. To represent full path of a category, use '>' character to separate different hierarchies. If '>' is part of the category name, replace it with other character(s). For example, |
imageQuery |
Raw image query. |
pageSize |
Maximum number of
If this field is negative, an |
pageToken |
A page token received from a previous When paginating, all other parameters provided to |
offset |
A 0-indexed integer that specifies the current offset (that is, starting result location, amongst the If this field is negative, an A large offset may be capped to a reasonable threshold. |
oneBoxPageSize |
The maximum number of results to return for OneBox. This applies to each OneBox type individually. Default number is 10. |
dataStoreSpecs[] |
Specifications that define the specific |
filter |
The filter syntax consists of an expression language for constructing a predicate from one or more fields of the documents being filtered. Filter expression is case-sensitive. If this field is unrecognizable, an Filtering in Vertex AI Search is done by mapping the LHS filter key to a key property defined in the Vertex AI Search backend -- this mapping is defined by the customer in their schema. For example a media customer might have a field 'name' in their schema. In this case the filter would look like this: filter --> name:'ANY("king kong")' For more information about filtering including syntax and filter operators, see Filter |
canonicalFilter |
The default filter that is applied when a user performs a search without checking any filters on the search page. The filter applied to every search request when quality improvement such as query expansion is needed. In the case a query does not have a sufficient amount of results this filter will be used to determine whether or not to enable the query expansion flow. The original filter will still be used for the query expanded search. This field is strongly recommended to achieve high search quality. For more information about filter syntax, see |
orderBy |
The order in which documents are returned. Documents can be ordered by a field in an For more information on ordering the website search results, see Order web search results. For more information on ordering the healthcare search results, see Order healthcare search results. If this field is unrecognizable, an |
userInfo |
Information about the end user. Highly recommended for analytics and personalization. |
languageCode |
The BCP-47 language code, such as "en-US" or "sr-Latn". For more information, see Standard fields. This field helps to better interpret the query. If a value isn't specified, the query language code is automatically detected, which may not be accurate. |
regionCode |
The Unicode country/region code (CLDR) of a location, such as "US" and "419". For more information, see Standard fields. If set, then results will be boosted based on the region_code provided. |
facetSpecs[] |
Facet specifications for faceted search. If empty, no facets are returned. A maximum of 100 values are allowed. Otherwise, an |
boostSpec |
Boost specification to boost certain documents. For more information on boosting, see Boosting |
params |
Additional search parameters. For public website search only, supported values are:
For available codes see Country Codes
An object containing a list of |
queryExpansionSpec |
The query expansion specification that specifies the conditions under which query expansion occurs. |
spellCorrectionSpec |
The spell correction specification that specifies the mode under which spell correction takes effect. |
userPseudoId |
Optional. A unique identifier for tracking visitors. For example, this could be implemented with an HTTP cookie, which should be able to uniquely identify a visitor on a single device. This unique identifier should not change if the visitor logs in or out of the website. This field should NOT have a fixed value such as This should be the same identifier as The field must be a UTF-8 encoded string with a length limit of 128 characters. Otherwise, an |
useLatestData |
Uses the Engine, ServingConfig and Control freshly read from the database. Note: this skips the config cache and introduces a dependency on databases, which could significantly increase the API latency. It should be used only for testing, not for serving end users. |
contentSearchSpec |
A specification for configuring the behavior of content search. |
embeddingSpec |
Uses the provided embedding to do additional semantic document retrieval. The retrieval is based on the dot product of If |
rankingExpression |
Optional. The ranking expression controls the customized ranking on retrieval documents. This overrides If
Supported functions:
Function variables:
Example ranking expression: If document has an embedding field doc_embedding, the ranking expression could be If
Here are a few examples of ranking formulas that use the supported ranking expression types:
The following signals are supported:
|
rankingExpressionBackend |
Optional. The backend to use for the ranking expression evaluation. |
safeSearch |
Whether to turn on safe search. This is only supported for website search. |
userLabels |
The user labels applied to a resource must meet the following requirements:
See Google Cloud Document for more details. An object containing a list of |
naturalLanguageQueryUnderstandingSpec |
Optional. Config for natural language query understanding capabilities, such as extracting structured field filters from the query. Refer to this documentation for more information. If |
searchAsYouTypeSpec |
Search as you type configuration. Only supported for the |
customFineTuningSpec |
Custom fine tuning configs. If set, it has higher priority than the configs set in |
displaySpec |
Optional. Config for display feature, like match highlighting on search results. |
crowdingSpecs[] |
Optional. Crowding specifications for improving result diversity. If multiple CrowdingSpecs are specified, crowding will be evaluated on each unique combination of the |
session |
The session resource name. Optional. A session lets users make multi-turn /search API calls or coordinate /search API calls with /answer API calls. Example #1 (multi-turn /search API calls): Call the /search API with the session ID generated in the first call. The previous search query is then considered during query understanding. That is, if the first query is "How did Alphabet do in 2022?" and the current query is "How about 2023?", the current query is interpreted as "How did Alphabet do in 2023?". Example #2 (coordination between /search API calls and /answer API calls): Call the /answer API with the session ID generated in the first call. The answer is then generated in the context of the search results from the first search call. The multi-turn search feature is currently at the private GA stage. Until it launches to public GA, use the v1alpha or v1beta version instead, or ask for allowlisting through the Google Support team. |
sessionSpec |
Session specification. Can be used only when |
relevanceThreshold |
The global relevance threshold of the search results. Defaults to Google defined threshold, leveraging a balance of precision and recall to deliver both highly accurate results and comprehensive coverage of relevant information. If more granular relevance filtering is required, use the This feature is not supported for healthcare search. |
relevanceFilterSpec |
Optional. The granular relevance filtering specification. If not specified, the global This feature is currently supported only for custom and site search. |
personalizationSpec |
The specification for personalization. Notice that if both |
relevanceScoreSpec |
Optional. The specification for returning the relevance score. |
searchAddonSpec |
Optional. SearchAddonSpec is used to disable add-ons for search as per new repricing model. This field is only supported for search requests. |
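For the `pageCategories` field described above, a category path could be assembled as in the following sketch. The helper name `build_page_category`, the spaces around `>`, and the `-` replacement character are all assumptions made for illustration; the field description only requires `>` as the hierarchy separator and some other character wherever `>` occurs inside a category name.

```python
def build_page_category(levels, replacement="-"):
    """Join category levels with '>' after replacing any '>' inside a level name."""
    cleaned = [level.replace(">", replacement).strip() for level in levels]
    return " > ".join(cleaned)


# Only the first element of pageCategories is used, so a single path is supplied.
page_categories = [build_page_category(["Movies", "Action", "2022 > Remastered"])]
```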
ImageQuery
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field
|
|
imageBytes |
Base64 encoded image bytes. Supported image formats: JPEG, PNG, and BMP. |
DataStoreSpec
| JSON representation |
|---|
{
"dataStore": string,
"filter": string,
"boostSpec": {
object ( |
| Fields | |
|---|---|
dataStore |
Required. Full resource name of |
filter |
Optional. Filter specification to filter documents in the data store specified by data_store field. For more information on filtering, see Filtering |
boostSpec |
Optional. Boost specification to boost certain documents. For more information on boosting, see Boosting |
customSearchOperators |
Optional. Custom search operators which if specified will be used to filter results from workspace data stores. For more information on custom search operators, see SearchOperators. |
BoostSpec
| JSON representation |
|---|
{
"conditionBoostSpecs": [
{
object ( |
| Fields | |
|---|---|
conditionBoostSpecs[] |
Condition boost specifications. If a document matches multiple conditions in the specifications, boost scores from these specifications are all applied and combined in a non-linear way. Maximum number of specifications is 20. |
ConditionBoostSpec
| JSON representation |
|---|
{
"condition": string,
"boost": number,
"boostControlSpec": {
object ( |
| Fields | |
|---|---|
condition |
An expression which specifies a boost condition. The syntax and supported fields are the same as a filter expression. See Examples:
|
boost |
Strength of the condition boost, which should be in [-1, 1]. Negative boost means demotion. Default is 0.0. Setting to 1.0 gives the document a big promotion. However, it does not necessarily mean that the boosted document will be the top result at all times, nor that other documents will be excluded. Results could still be shown even when none of them matches the condition, and results that are significantly more relevant to the search query can still outrank heavily favored but irrelevant documents. Setting to -1.0 gives the document a big demotion. However, results that are deeply relevant might still be shown. The document will have an uphill battle to get a fairly high ranking, but it is not blocked out completely. Setting to 0.0 means no boost is applied and the boosting condition is ignored. Only one of the (condition, boost) combination or the boost_control_spec below is set. If both are set, then the global boost is ignored and the more fine-grained boost_control_spec is applied. |
boostControlSpec |
Complex specification for custom ranking based on customer defined attribute value. |
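The limits stated for `BoostSpec` and `ConditionBoostSpec` (at most 20 specifications, boost in [-1, 1]) can be checked client-side before a request is sent. This is a minimal sketch; the helper names are hypothetical, and the condition strings reuse the `color:ANY("Red")` filter form quoted elsewhere in this reference purely as illustrative values.

```python
def condition_boost(condition, boost):
    """Build one ConditionBoostSpec, rejecting boosts outside [-1, 1]."""
    if not -1.0 <= boost <= 1.0:
        raise ValueError("boost must be in [-1, 1]")
    return {"condition": condition, "boost": boost}


def boost_spec(*specs):
    """Build a BoostSpec, enforcing the documented maximum of 20 specifications."""
    if len(specs) > 20:
        raise ValueError("at most 20 condition boost specifications are allowed")
    return {"conditionBoostSpecs": list(specs)}


spec = boost_spec(
    condition_boost('color:ANY("Red")', 0.5),    # promote matching documents
    condition_boost('color:ANY("Blue")', -0.5),  # demote matching documents
)
```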
BoostControlSpec
| JSON representation |
|---|
{ "fieldName": string, "attributeType": enum ( |
| Fields | |
|---|---|
fieldName |
The name of the field whose value will be used to determine the boost amount. |
attributeType |
The attribute type to be used to determine the boost amount. The attribute value can be derived from the field value of the specified field_name. For a numerical attribute, this is straightforward: attribute_value = numerical_field_value. For freshness, however, attribute_value = (time.now() - datetime_field_value). |
interpolationType |
The interpolation type to be applied to connect the control points listed below. |
controlPoints[] |
The control points used to define the curve. The monotonic function (defined through the interpolation_type above) passes through the control points listed here. |
ControlPoint
| JSON representation |
|---|
{ "attributeValue": string, "boostAmount": number } |
| Fields | |
|---|---|
attributeValue |
Can be one of: 1. The numerical field value. 2. The duration spec for freshness: The value must be formatted as an XSD |
boostAmount |
A value between -1 and 1 by which to boost the score if the attribute_value evaluates to the value specified above. |
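To build intuition for how control points define a boost curve, the sketch below interpolates a boost amount for a numerical attribute value. It assumes straight-line interpolation between neighbouring points and clamping outside their range; both are assumptions made for illustration, and the service's actual behavior depends on the configured interpolation type. The function name `interpolate_boost` is hypothetical.

```python
from bisect import bisect_right


def interpolate_boost(control_points, attribute_value):
    """Piecewise-linear boost through the control points (assumed behavior)."""
    # attributeValue is a string in the JSON representation, so parse it first.
    points = sorted((float(p["attributeValue"]), p["boostAmount"]) for p in control_points)
    xs = [x for x, _ in points]
    if attribute_value <= xs[0]:
        return points[0][1]   # clamp below the first point
    if attribute_value >= xs[-1]:
        return points[-1][1]  # clamp above the last point
    i = bisect_right(xs, attribute_value)  # first point strictly to the right
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (attribute_value - x0) / (x1 - x0)


control_points = [
    {"attributeValue": "0", "boostAmount": -0.5},
    {"attributeValue": "10", "boostAmount": 0.0},
    {"attributeValue": "20", "boostAmount": 1.0},
]
```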
UserInfo
| JSON representation |
|---|
{ "userId": string, "userAgent": string, "timeZone": string } |
| Fields | |
|---|---|
userId |
Highly recommended for logged-in users. Unique identifier for logged-in user, such as a user name. Don't set for anonymous users. Always use a hashed value for this ID. Don't set the field to the same fixed ID for different users. This mixes the event history of those users together, which results in degraded model quality. The field must be a UTF-8 encoded string with a length limit of 128 characters. Otherwise, an |
userAgent |
User agent as included in the HTTP header. The field must be a UTF-8 encoded string with a length limit of 1,000 characters. Otherwise, an This should not be set when using the client side event reporting with GTM or JavaScript tag in |
timeZone |
Optional. IANA time zone, e.g. Europe/Budapest. |
FacetSpec
| JSON representation |
|---|
{
"facetKey": {
object ( |
| Fields | |
|---|---|
facetKey |
Required. The facet key specification. |
limit |
Maximum facet values that are returned for this facet. If unspecified, defaults to 20. The maximum allowed value is 300. Values above 300 are coerced to 300. For aggregation in healthcare search, when the [FacetKey.key] is "healthcare_aggregation_key", the limit will be overridden to 10,000 internally, regardless of the value set here. If this field is negative, an |
excludedFilterKeys[] |
List of keys to exclude when faceting. By default, Listing a facet key in this field allows its values to appear as facet results, even when they are filtered out of search results. Using this field does not affect what search results are returned. For example, suppose there are 100 documents with the color facet "Red" and 200 documents with the color facet "Blue". A query containing the filter "color:ANY("Red")" and having "color" as If "color" is listed in "excludedFilterKeys", then the query returns the facet values "Red" with count 100 and "Blue" with count 200, because the "color" key is now excluded from the filter. Because this field doesn't affect search results, the search results are still correctly filtered to return only "Red" documents. A maximum of 100 values are allowed. Otherwise, an |
enableDynamicPosition |
Enables dynamic position for this facet. If set to true, the position of this facet among all facets in the response is determined automatically. If dynamic facets are enabled, it is ordered together with them. If set to false, the position of this facet in the response is the same as in the request, and it is ranked before the facets with dynamic position enabled and all dynamic facets. For example, you may always want to have the rating facet returned in the response, but it's not necessary to always display the rating facet at the top. In that case, you can set enable_dynamic_position to true so that the position of the rating facet in the response is determined automatically. As another example, assume you have the following facets in the request:
You also have dynamic facets enabled, which generates a facet |
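The `excludedFilterKeys` example above (100 "Red" documents, 200 "Blue" documents) can be reproduced with a small in-memory simulation. This sketch only models the counting behavior described in the field documentation; the function `facet_counts` and the dictionary-based filter representation are hypothetical and are not part of the API.

```python
from collections import Counter

# 100 documents with color "Red" and 200 with color "Blue", as in the example.
docs = [{"color": "Red"}] * 100 + [{"color": "Blue"}] * 200


def facet_counts(docs, facet_key, filters, excluded_filter_keys=()):
    """Count facet values, ignoring filters on keys listed in excluded_filter_keys."""
    active = {k: v for k, v in filters.items() if k not in excluded_filter_keys}
    matching = [d for d in docs if all(d.get(k) in allowed for k, allowed in active.items())]
    return dict(Counter(d[facet_key] for d in matching))


filters = {"color": {"Red"}}  # stands in for the filter color:ANY("Red")
default_counts = facet_counts(docs, "color", filters)
excluded_counts = facet_counts(docs, "color", filters, excluded_filter_keys=("color",))
```

With the key excluded, the facet counts cover both colors, while the search results themselves would still be filtered to "Red" documents.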
FacetKey
| JSON representation |
|---|
{
"key": string,
"intervals": [
{
object ( |
| Fields | |
|---|---|
key |
Required. Supported textual and numerical facet keys in |
intervals[] |
Set only if values should be bucketed into intervals. Must be set for facets with numerical values. Must not be set for facet with text values. Maximum number of intervals is 30. |
restrictedValues[] |
Only get facet values for the given restricted values. For example, suppose "category" has three values "Action > 2022", "Action > 2021" and "Sci-Fi > 2022". If you set "restricted_values" to "Action > 2022", the "category" facet only contains "Action > 2022". Only supported on textual fields. Maximum is 10. |
prefixes[] |
Only get facet values that start with the given string prefix. For example, suppose "category" has three values "Action > 2022", "Action > 2021" and "Sci-Fi > 2022". If you set "prefixes" to "Action", the "category" facet only contains "Action > 2022" and "Action > 2021". Only supported on textual fields. Maximum is 10. |
contains[] |
Only get facet values that contain the given strings. For example, suppose "category" has three values "Action > 2022", "Action > 2021" and "Sci-Fi > 2022". If you set "contains" to "2022", the "category" facet only contains "Action > 2022" and "Sci-Fi > 2022". Only supported on textual fields. Maximum is 10. |
caseInsensitive |
True to make facet keys case insensitive when getting faceting values with prefixes or contains; false otherwise. |
orderBy |
The order in which documents are returned. Allowed values are:
If not set, textual values are sorted in natural order; numerical intervals are sorted in the order given by |
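The effect of `restrictedValues`, `prefixes`, `contains`, and `caseInsensitive` on the "category" example can be sketched locally as follows. The function `select_facet_values` is hypothetical, and combining several options with AND semantics is an assumption; the field descriptions only specify each option on its own.

```python
def select_facet_values(values, restricted_values=None, prefixes=None,
                        contains=None, case_insensitive=False):
    """Keep the facet values that pass every option that was supplied."""
    def norm(text):
        # Case folding applies only to prefixes and contains, per the field docs.
        return text.lower() if case_insensitive else text

    selected = []
    for value in values:
        folded = norm(value)
        if restricted_values is not None and value not in restricted_values:
            continue
        if prefixes is not None and not any(folded.startswith(norm(p)) for p in prefixes):
            continue
        if contains is not None and not any(norm(c) in folded for c in contains):
            continue
        selected.append(value)
    return selected


categories = ["Action > 2022", "Action > 2021", "Sci-Fi > 2022"]
by_prefix = select_facet_values(categories, prefixes=["Action"])
by_contains = select_facet_values(categories, contains=["2022"])
```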
Interval
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field This field must not be larger than max. Otherwise, an |
|
minimum |
Inclusive lower bound. |
exclusiveMinimum |
Exclusive lower bound. |
Union field This field must not be smaller than min. Otherwise, an |
|
maximum |
Inclusive upper bound. |
exclusiveMaximum |
Exclusive upper bound. |
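The difference between the inclusive and exclusive bounds of an `Interval` can be made concrete with a small membership check. This is an illustrative sketch only; `in_interval` is a hypothetical helper, and a missing bound is treated as unbounded on that side, which is an assumption.

```python
def in_interval(value, interval):
    """Return True if value lies inside the interval's inclusive/exclusive bounds."""
    if "minimum" in interval and value < interval["minimum"]:
        return False
    if "exclusiveMinimum" in interval and value <= interval["exclusiveMinimum"]:
        return False
    if "maximum" in interval and value > interval["maximum"]:
        return False
    if "exclusiveMaximum" in interval and value >= interval["exclusiveMaximum"]:
        return False
    return True


# Half-open bucket: includes 0, excludes 10.
bucket = {"minimum": 0, "exclusiveMaximum": 10}
```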
ParamsEntry
| JSON representation |
|---|
{ "key": string, "value": value } |
| Fields | |
|---|---|
key |
|
value |
|
Value
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field kind. The kind of value. kind can be only one of the following: |
|
nullValue |
Represents a null value. |
numberValue |
Represents a double value. |
stringValue |
Represents a string value. |
boolValue |
Represents a boolean value. |
structValue |
Represents a structured value. |
listValue |
Represents a repeated |
Struct
| JSON representation |
|---|
{ "fields": { string: value, ... } } |
| Fields | |
|---|---|
fields |
Unordered map of dynamically typed values. An object containing a list of |
FieldsEntry
| JSON representation |
|---|
{ "key": string, "value": value } |
| Fields | |
|---|---|
key |
|
value |
|
ListValue
| JSON representation |
|---|
{ "values": [ value ] } |
| Fields | |
|---|---|
values[] |
Repeated field of dynamically typed values. |
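As a quick reference for the `Value` union above, the sketch below maps a Python object to the name of the `kind` it would correspond to. The helper `value_kind` is hypothetical and only returns the kind name; it does not construct a request payload.

```python
def value_kind(obj):
    """Name the Value kind that a Python object corresponds to."""
    if obj is None:
        return "nullValue"
    if isinstance(obj, bool):  # checked before numbers: bool is a subclass of int
        return "boolValue"
    if isinstance(obj, (int, float)):
        return "numberValue"
    if isinstance(obj, str):
        return "stringValue"
    if isinstance(obj, dict):
        return "structValue"
    if isinstance(obj, (list, tuple)):
        return "listValue"
    raise TypeError(f"unsupported type: {type(obj).__name__}")
```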
QueryExpansionSpec
| JSON representation |
|---|
{
"condition": enum ( |
| Fields | |
|---|---|
condition |
The condition under which query expansion should occur. Defaults to |
pinUnexpandedResults |
Whether to pin unexpanded results. If this field is set to true, unexpanded products are always at the top of the search results, followed by the expanded results. |
SpellCorrectionSpec
| JSON representation |
|---|
{
"mode": enum ( |
| Fields | |
|---|---|
mode |
The mode under which spell correction replaces the original search query. Defaults to |
ContentSearchSpec
| JSON representation |
|---|
{ "snippetSpec": { object ( |
| Fields | |
|---|---|
snippetSpec |
If |
summarySpec |
If |
extractiveContentSpec |
If there is no extractive_content_spec provided, there will be no extractive answer in the search response. |
searchResultMode |
Specifies the search result mode. If unspecified, the search result mode defaults to |
chunkSpec |
Specifies the chunk spec to be returned from the search response. Only available if the |
SnippetSpec
| JSON representation |
|---|
{ "maxSnippetCount": integer, "referenceOnly": boolean, "returnSnippet": boolean } |
| Fields | |
|---|---|
maxSnippetCount |
[DEPRECATED] This field is deprecated. To control snippet return, use |
referenceOnly |
[DEPRECATED] This field is deprecated and will have no effect on the snippet. |
returnSnippet |
If |
SummarySpec
| JSON representation |
|---|
{ "summaryResultCount": integer, "includeCitations": boolean, "ignoreAdversarialQuery": boolean, "ignoreNonSummarySeekingQuery": boolean, "ignoreLowRelevantContent": boolean, "ignoreJailBreakingQuery": boolean, "multimodalSpec": { object ( |
| Fields | |
|---|---|
summaryResultCount |
The number of top results to generate the summary from. If the number of results returned is less than At most 10 results for documents mode, or 50 for chunks mode, can be used to generate a summary. The chunks mode is used when |
includeCitations |
Specifies whether to include citations in the summary. The default value is When this field is set to Example summary including citations: BigQuery is Google Cloud's fully managed and completely serverless enterprise data warehouse [1]. BigQuery supports all data types, works across clouds, and has built-in machine learning and business intelligence, all within a unified platform [2, 3]. The citation numbers refer to the returned search results and are 1-indexed. For example, [1] means that the sentence is attributed to the first search result. [2, 3] means that the sentence is attributed to both the second and third search results. |
ignoreAdversarialQuery |
Specifies whether to filter out adversarial queries. The default value is Google employs search-query classification to detect adversarial queries. No summary is returned if the search query is classified as an adversarial query. For example, a user might ask a question regarding negative comments about the company or submit a query designed to generate unsafe, policy-violating output. If this field is set to |
ignoreNonSummarySeekingQuery |
Specifies whether to filter out queries that are not summary-seeking. The default value is Google employs search-query classification to detect summary-seeking queries. No summary is returned if the search query is classified as a non-summary seeking query. For example, |
ignoreLowRelevantContent |
Specifies whether to filter out queries that have low relevance. The default value is If this field is set to |
ignoreJailBreakingQuery |
Optional. Specifies whether to filter out jail-breaking queries. The default value is Google employs search-query classification to detect jail-breaking queries. No summary is returned if the search query is classified as a jail-breaking query. A user might add instructions to the query to change the tone, style, language, content of the answer, or ask the model to act as a different entity, e.g. "Reply in the tone of a competing company's CEO". If this field is set to |
multimodalSpec |
Optional. Multimodal specification. |
modelPromptSpec |
If specified, the spec will be used to modify the prompt provided to the LLM. |
languageCode |
Language code for Summary. Use language tags defined by BCP47. Note: This is an experimental feature. |
modelSpec |
If specified, the spec will be used to modify the model specification provided to the LLM. |
useSemanticChunks |
If true, answer will be generated from most relevant chunks from top search results. This feature will improve summary quality. Note that with this feature enabled, not all top search results will be referenced and included in the reference list, so the citation source index only points to the search results listed in the reference list. |
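The 1-indexed citation markers described under `includeCitations` can be resolved back to search results with a short parser. This is a sketch under the assumption that citations index directly into the returned results list, which the `useSemanticChunks` description suggests may not hold when that option is enabled; the function `cited_results` and the result identifiers are hypothetical.

```python
import re

# Matches markers such as [1] or [2, 3].
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def cited_results(summary_text, results):
    """For each citation marker, return the results it refers to (1-indexed)."""
    cited = []
    for match in CITATION.finditer(summary_text):
        numbers = [int(n) for n in match.group(1).split(",")]
        cited.append([results[n - 1] for n in numbers])
    return cited


summary = (
    "BigQuery is Google Cloud's fully managed and completely serverless "
    "enterprise data warehouse [1]. BigQuery supports all data types, works "
    "across clouds, and has built-in machine learning and business "
    "intelligence, all within a unified platform [2, 3]."
)
sources = cited_results(summary, ["doc-a", "doc-b", "doc-c"])
```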
MultiModalSpec
| JSON representation |
|---|
{
"imageSource": enum ( |
| Fields | |
|---|---|
imageSource |
Optional. Source of image returned in the answer. |
ModelPromptSpec
| JSON representation |
|---|
{ "preamble": string } |
| Fields | |
|---|---|
preamble |
Text at the beginning of the prompt that instructs the assistant. Examples are available in the user guide. |
ModelSpec
| JSON representation |
|---|
{ "version": string } |
| Fields | |
|---|---|
version |
The model version used to generate the summary. Supported values are:
|
ExtractiveContentSpec
| JSON representation |
|---|
{ "maxExtractiveAnswerCount": integer, "maxExtractiveSegmentCount": integer, "returnExtractiveSegmentScore": boolean, "numPreviousSegments": integer, "numNextSegments": integer } |
| Fields | |
|---|---|
maxExtractiveAnswerCount |
The maximum number of extractive answers returned in each search result. An extractive answer is a verbatim answer extracted from the original document, which provides a precise and contextually relevant answer to the search query. If the number of matching answers is less than the At most five answers are returned for each |
maxExtractiveSegmentCount |
The max number of extractive segments returned in each search result. Only applied if the An extractive segment is a text segment extracted from the original document that is relevant to the search query, and, in general, more verbose than an extractive answer. The segment could then be used as input for LLMs to generate summaries and answers. If the number of matching segments is less than |
returnExtractiveSegmentScore |
Specifies whether to return the confidence score from the extractive segments in each search result. This feature is available only for new or allowlisted data stores. To allowlist your data store, contact your Customer Engineer. The default value is |
numPreviousSegments |
Specifies whether to also include the segments adjacent to each selected segment. Return at most |
numNextSegments |
Return at most |
ChunkSpec
| JSON representation |
|---|
{ "numPreviousChunks": integer, "numNextChunks": integer } |
| Fields | |
|---|---|
numPreviousChunks |
The number of chunks preceding the current chunk to return. The maximum allowed value is 3. If not specified, no previous chunks will be returned. |
numNextChunks |
The number of chunks following the current chunk to return. The maximum allowed value is 3. If not specified, no next chunks will be returned. |
EmbeddingSpec
| JSON representation |
|---|
{
"embeddingVectors": [
{
object ( |
| Fields | |
|---|---|
embeddingVectors[] |
The embedding vector used for retrieval. Limit to 1. |
EmbeddingVector
| JSON representation |
|---|
{ "fieldPath": string, "vector": [ number ] } |
| Fields | |
|---|---|
fieldPath |
Embedding field path in schema. |
vector[] |
Query embedding vector. |
UserLabelsEntry
| JSON representation |
|---|
{ "key": string, "value": string } |
| Fields | |
|---|---|
key |
|
value |
|
NaturalLanguageQueryUnderstandingSpec
| JSON representation |
|---|
{ "filterExtractionCondition": enum ( |
| Fields | |
|---|---|
filterExtractionCondition |
The condition under which filter extraction should occur. Server behavior defaults to |
geoSearchQueryDetectionFieldNames[] |
Field names used for location-based filtering, where geolocation filters are detected in natural language search queries. Only valid when the FilterExtractionCondition is set to If this field is set, it overrides the field names set in |
extractedFilterBehavior |
Optional. Controls behavior of how extracted filters are applied to the search. The default behavior depends on the request. For single datastore structured search, the default is |
allowedFieldNames[] |
Optional. Allowlist of fields that can be used for natural language filter extraction. By default, if this is unspecified, all indexable fields are eligible for natural language filter extraction (but are not guaranteed to be used). If any fields are specified in allowed_field_names, only the fields that are both marked as indexable in the schema and specified in the allowlist will be eligible for natural language filter extraction. Note: for multi-datastore search, this is not yet supported, and will be ignored. |
SearchAsYouTypeSpec
| JSON representation |
|---|
{
"condition": enum ( |
| Fields | |
|---|---|
condition |
The condition under which search as you type should occur. Defaults to |
CustomFineTuningSpec
| JSON representation |
|---|
{ "enableSearchAdaptor": boolean } |
| Fields | |
|---|---|
enableSearchAdaptor |
Whether or not to enable and include custom fine tuned search adaptor model. |
DisplaySpec
| JSON representation |
|---|
{
"matchHighlightingCondition": enum ( |
| Fields | |
|---|---|
matchHighlightingCondition |
The condition under which match highlighting should occur. |
CrowdingSpec
| JSON representation |
|---|
{
"field": string,
"maxCount": integer,
"mode": enum ( |
| Fields | |
|---|---|
field |
The field to use for crowding. Documents can be crowded by a field in the |
maxCount |
The maximum number of documents to keep per value of the field. Once there are at least max_count previous results which contain the same value for the given field (according to the order specified in |
mode |
Mode to use for documents that are crowded away. |
SessionSpec
| JSON representation |
|---|
{ "queryId": string, // Union field |
| Fields | |
|---|---|
queryId |
If set, the search result gets stored to the "turn" specified by this query ID. Example: Let's say the session looks like this: session { name: ".../sessions/xxx" turns { query { text: "What is foo?" query_id: ".../questions/yyy" } answer: "Foo is ..." } turns { query { text: "How about bar then?" query_id: ".../questions/zzz" } } } The user can call /search API with a request like this: session: ".../sessions/xxx" session_spec { query_id: ".../questions/zzz" } Then, the API stores the search result, associated with the last turn. The stored search result can be used by a subsequent /answer API call (with the session ID and the query ID specified). Also, it is possible to call /search and /answer in parallel with the same session ID & query ID. |
Union field
|
|
searchResultPersistenceCount |
The number of top search results to persist. The persisted search results can be used for the subsequent /answer api call. This field is similar to the At most 10 results for documents mode, or 50 for chunks mode. |
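A follow-up search within an existing session could attach the `session` and `sessionSpec.queryId` fields as sketched below. The helper `with_session` is hypothetical, and the resource names are the elided placeholders from the `queryId` example above, not real identifiers.

```python
def with_session(arguments, session, query_id=None):
    """Return a copy of the search arguments with session fields attached."""
    request = dict(arguments, session=session)
    if query_id is not None:
        request["sessionSpec"] = {"queryId": query_id}
    return request


# Placeholder names copied from the example in the queryId description.
follow_up = with_session(
    {"query": "How about bar then?"},
    session=".../sessions/xxx",
    query_id=".../questions/zzz",
)
```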
RelevanceFilterSpec
| JSON representation |
|---|
{ "keywordSearchThreshold": { object ( |
| Fields | |
|---|---|
keywordSearchThreshold |
Optional. Relevance filtering threshold specification for keyword search. |
semanticSearchThreshold |
Optional. Relevance filtering threshold specification for semantic search. |
RelevanceThresholdSpec
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field relevance_threshold_spec. Configures how the relevance threshold is determined. relevance_threshold_spec can be only one of the following: |
|
relevanceThreshold |
Pre-defined relevance threshold for the sub-search. |
semanticRelevanceThreshold |
Custom relevance threshold for the sub-search. The value must be in [0.0, 1.0]. |
PersonalizationSpec
| JSON representation |
|---|
{
"mode": enum ( |
| Fields | |
|---|---|
mode |
The personalization mode of the search request. Defaults to |
RelevanceScoreSpec
| JSON representation |
|---|
{ "returnRelevanceScore": boolean } |
| Fields | |
|---|---|
returnRelevanceScore |
Optional. Whether to return the relevance score for search results. The higher the score, the more relevant the document is to the query. |
SearchAddonSpec
| JSON representation |
|---|
{ "disableSemanticAddOn": boolean, "disableKpiPersonalizationAddOn": boolean, "disableGenerativeAnswerAddOn": boolean } |
| Fields | |
|---|---|
disableSemanticAddOn |
Optional. If true, semantic add-on is disabled. Semantic add-on includes embeddings and jetstream. |
disableKpiPersonalizationAddOn |
Optional. If true, disables event re-ranking and personalization to optimize KPIs & personalize results. |
disableGenerativeAnswerAddOn |
Optional. If true, generative answer add-on is disabled. Generative answer add-on includes natural language to filters and simple answers. |
Output Schema
Response message for SearchService.Search method.
SearchResponse
| JSON representation |
|---|
{ "results": [ { object ( |
| Fields | |
|---|---|
results[] |
A list of matched documents. The order represents the ranking. |
facets[] |
Results of facets requested by user. |
guidedSearchResult |
Guided search result. |
totalSize |
The estimated total count of matched items irrespective of pagination. The count of |
attributionToken |
A unique search token. This should be included in the |
redirectUri |
The URI of a customer-defined redirect page. If redirect action is triggered, no search is performed, and only |
nextPageToken |
A token that can be sent as |
correctedQuery |
Contains the spell corrected query, if found. If the spell correction type is AUTOMATIC, then the search results are based on corrected_query. Otherwise the original query is used for search. |
suggestedQuery |
Corrected query with low confidence, also known as a "did you mean" query. Compared with corrected_query, this field is set when SpellCorrector returned a response but FPR (full page replacement) is not triggered because the correction is of low confidence (for example, reversed because there are matches of the original query in the document corpus). |
summary |
A summary as part of the search results. This field is only returned if |
appliedControls[] |
Controls applied as part of the Control service. |
geoSearchDebugInfo[] |
|
queryExpansionInfo |
Query expansion information for the returned results. |
naturalLanguageQueryUnderstandingInfo |
Output only. Natural language query understanding information for the returned results. |
sessionInfo |
Session information. Only set if |
oneBoxResults[] |
A list of One Box results. There can be multiple One Box results of different types. |
searchLinkPromotions[] |
Promotions for site search. |
semanticState |
Output only. Indicates the semantic state of the search response. |
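A client typically follows `nextPageToken` until it is absent and stops early when `redirectUri` is set. The sketch below assumes a caller-supplied `search` callable (hypothetical) that takes a SearchRequest dict and returns a SearchResponse dict; how that callable reaches the MCP endpoint is outside the scope of this example.

```python
from typing import Callable, Iterator


def iter_results(search: Callable[[dict], dict], request: dict) -> Iterator[dict]:
    """Yield SearchResult dicts across pages by following nextPageToken."""
    req = dict(request)  # copy so the caller's dict is not mutated
    while True:
        response = search(req)
        if response.get("redirectUri"):
            # Per the field description, a redirect means no search was performed.
            return
        yield from response.get("results", [])
        token = response.get("nextPageToken")
        if not token:
            return
        # All other request parameters stay the same while paginating.
        req["pageToken"] = token
```

Keeping every other request field unchanged between pages matches the `pageToken` requirement described in the input schema.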
SearchResult
| JSON representation |
|---|
{ "id": string, "document": { object ( |
| Fields | |
|---|---|
id |
|
document |
The document data snippet in the search response. Only fields that are marked as |
chunk |
The chunk data in the search response if the |
modelScores |
Output only. Google provided available scores. An object containing a list of |
rankSignals |
Optional. A set of ranking signals associated with the result. |
Document
| JSON representation |
|---|
{ "name": string, "id": string, "schemaId": string, "content": { object ( |
| Fields | |
|---|---|
name |
Immutable. The full resource name of the document. Format: This field must be a UTF-8 encoded string with a length limit of 1024 characters. |
id |
Immutable. The identifier of the document. Id should conform to RFC-1034 standard with a length limit of 128 characters. |
schemaId |
The identifier of the schema located in the same data store. |
content |
The unstructured data linked to this document. Content can only be set and must be set if this document is under a |
parentDocumentId |
The identifier of the parent document. Currently supports at most two level document hierarchy. Id should conform to RFC-1034 standard with a length limit of 63 characters. |
derivedStructData |
Output only. Contains derived data that is not in the original input document. |
aclInfo |
Access control information for the document. |
indexTime |
Output only. The last time the document was indexed. If this field is set, the document could be returned in search results. This field is OUTPUT_ONLY. If this field is not populated, it means the document has never been indexed. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
indexStatus |
Output only. The index status of the document.
|
Union field data. Data representation. One of struct_data or json_data should be provided otherwise an INVALID_ARGUMENT error is thrown. data can be only one of the following: |
|
structData |
The structured JSON data for the document. It should conform to the registered |
jsonData |
The JSON string representation of the document. It should conform to the registered |
Struct
| JSON representation |
|---|
{ "fields": { string: value, ... } } |
| Fields | |
|---|---|
fields |
Unordered map of dynamically typed values. An object containing a list of |
FieldsEntry
| JSON representation |
|---|
{ "key": string, "value": value } |
| Fields | |
|---|---|
key |
|
value |
|
Value
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field kind. The kind of value. kind can be only one of the following: |
|
nullValue |
Represents a null value. |
numberValue |
Represents a double value. |
stringValue |
Represents a string value. |
boolValue |
Represents a boolean value. |
structValue |
Represents a structured value. |
listValue |
Represents a repeated |
ListValue
| JSON representation |
|---|
{ "values": [ value ] } |
| Fields | |
|---|---|
values[] |
Repeated field of dynamically typed values. |
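The Struct, Value, and ListValue messages above nest recursively. As an illustration, the sketch below converts a Value written in the expanded union form shown in these tables into plain Python objects. It is an assumption that a payload uses this expanded form; some JSON encodings render a Struct as an ordinary JSON object, in which case no decoding is needed. The function names are hypothetical.

```python
from typing import Any


def decode_value(value: dict) -> Any:
    """Convert a Value in the expanded union form into a plain Python object."""
    if "nullValue" in value:
        return None
    for key in ("numberValue", "stringValue", "boolValue"):
        if key in value:
            return value[key]
    if "structValue" in value:
        return decode_struct(value["structValue"])
    if "listValue" in value:
        return [decode_value(v) for v in value["listValue"].get("values", [])]
    raise ValueError("Value has no recognized kind")


def decode_struct(struct: dict) -> dict:
    """Convert a Struct ({"fields": {...}}) into a plain dict."""
    return {k: decode_value(v) for k, v in struct.get("fields", {}).items()}
```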
Content
| JSON representation |
|---|
{ "mimeType": string, // Union field |
| Fields | |
|---|---|
mimeType |
The MIME type of the content. Supported types:
The following types are supported only if layout parser is enabled in the data store:
See https://www.iana.org/assignments/media-types/media-types.xhtml. |
Union field content. The content of the unstructured document. content can be only one of the following: |
|
rawBytes |
The content represented as a stream of bytes. The maximum length is 1,000,000 bytes (1 MB / ~0.95 MiB). Note: As with all A base64-encoded string. |
uri |
The URI of the content. Only Cloud Storage URIs (e.g. |
AclInfo
| JSON representation |
|---|
{
"readers": [
{
object ( |
| Fields | |
|---|---|
readers[] |
Readers of the document. |
AccessRestriction
| JSON representation |
|---|
{
"principals": [
{
object ( |
| Fields | |
|---|---|
principals[] |
List of principals. |
idpWide |
All users within the Identity Provider. |
Principal
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field principal. Principal can be a user or a group. principal can be only one of the following: |
|
userId |
User identifier. For a Google Workspace user account, user_id should be the Google Workspace user email. For a non-Google identity provider user account, user_id is the mapped user identifier configured during the workforce pool configuration. |
groupId |
Group identifier. For a Google Workspace user account, group_id should be the Google Workspace group email. For a non-Google identity provider user account, group_id is the mapped group identifier configured during the workforce pool configuration. |
externalEntityId |
For 3P application identities which are not present in the customer identity provider. |
Timestamp
| JSON representation |
|---|
{ "seconds": string, "nanos": integer } |
| Fields | |
|---|---|
seconds |
Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. Must be between -62135596800 and 253402300799 inclusive (which corresponds to 0001-01-01T00:00:00Z to 9999-12-31T23:59:59Z). |
nanos |
Non-negative fractions of a second at nanosecond resolution. This field is the nanosecond portion of the duration, not an alternative to seconds. Negative second values with fractions must still have non-negative nanos values that count forward in time. Must be between 0 and 999,999,999 inclusive. |
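The `seconds` and `nanos` rules above, including the forward-counting `nanos` for negative `seconds`, can be illustrated with a small conversion. This is a sketch under the assumption that a timestamp arrives in the two-field form shown here; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(ts: dict) -> datetime:
    """Convert {"seconds": string, "nanos": integer} to an aware UTC datetime."""
    seconds = int(ts.get("seconds", 0))  # 64-bit integers are serialized as strings
    nanos = int(ts.get("nanos", 0))
    if not -62135596800 <= seconds <= 253402300799:
        raise ValueError("seconds out of range")
    if not 0 <= nanos <= 999_999_999:
        raise ValueError("nanos out of range")
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return EPOCH + timedelta(seconds=seconds, microseconds=nanos // 1000)
```

Adding a `timedelta` to the epoch, rather than calling `datetime.fromtimestamp`, keeps the conversion working for negative values on every platform.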
IndexStatus
| JSON representation |
|---|
{
"indexTime": string,
"errorSamples": [
{
object ( |
| Fields | |
|---|---|
indexTime |
The time when the document was indexed. If this field is populated, it means the document has been indexed. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
errorSamples[] |
A sample of errors encountered while indexing the document. If this field is populated, the document is not indexed due to errors. |
pendingMessage |
Immutable. The message indicates the document index is in progress. If this field is populated, the document index is pending. |
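The three IndexStatus fields above each imply a state. One possible way to collapse them into a single label is sketched below; the state names and the precedence order (errors first, then pending, then indexed) are assumptions made for illustration, not values defined by the API.

```python
def index_state(status: dict) -> str:
    """Derive a coarse state label from an IndexStatus dict."""
    if status.get("errorSamples"):
        return "failed"    # populated errorSamples: document not indexed due to errors
    if status.get("pendingMessage"):
        return "pending"   # populated pendingMessage: indexing is in progress
    if status.get("indexTime"):
        return "indexed"   # populated indexTime: document has been indexed
    return "unknown"
```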
Status
| JSON representation |
|---|
{ "code": integer, "message": string, "details": [ { "@type": string, field1: ..., ... } ] } |
| Fields | |
|---|---|
code |
The status code, which should be an enum value of |
message |
A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the |
details[] |
A list of messages that carry the error details. There is a common set of message types for APIs to use. An object containing fields of an arbitrary type. An additional field |
Any
| JSON representation |
|---|
{ "typeUrl": string, "value": string } |
| Fields | |
|---|---|
typeUrl |
A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one. Schemes other than |
value |
Must be a valid serialized protocol buffer of the above specified type. A base64-encoded string. |
Chunk
| JSON representation |
|---|
{ "name": string, "id": string, "content": string, "documentMetadata": { object ( |
| Fields | |
|---|---|
name |
The full resource name of the chunk. Format: This field must be a UTF-8 encoded string with a length limit of 1024 characters. |
id |
Unique chunk ID of the current chunk. |
content |
Content is a string from a document (parsed content). |
documentMetadata |
Metadata of the document from the current chunk. |
derivedStructData |
Output only. Contains derived data that is not in the original input document. |
pageSpan |
Page span of the chunk. |
chunkMetadata |
Output only. Metadata of the current chunk. |
dataUrls[] |
Output only. Image Data URLs if the current chunk contains images. Data URLs are composed of four parts: a prefix (data:), a MIME type indicating the type of data, an optional base64 token if non-textual, and the data itself: data:[ |
annotationContents[] |
Output only. Annotation contents if the current chunk contains annotations. |
annotationMetadata[] |
Output only. The annotation metadata includes structured content in the current chunk. |
relevanceScore |
Output only. Represents the relevance score based on similarity. Higher score indicates higher chunk relevance. The score is in range [-1.0, 1.0]. Only populated on |
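Entries in `dataUrls[]` follow the four-part data URL layout described above. A minimal parser is sketched below, assuming the standard `data:[<mediatype>][;base64],<data>` form from RFC 2397; the function name is hypothetical.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a data URL into (MIME type, decoded bytes)."""
    if not url.startswith("data:"):
        raise ValueError("not a data URL")
    header, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL is missing the ',' separator")
    is_base64 = header.endswith(";base64")
    mime = header[:-len(";base64")] if is_base64 else header
    # Non-base64 payloads are percent-encoded text.
    data = base64.b64decode(payload) if is_base64 else unquote_to_bytes(payload)
    # RFC 2397 default when the MIME type is omitted.
    return mime or "text/plain;charset=US-ASCII", data
```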
DocumentMetadata
| JSON representation |
|---|
{ "uri": string, "title": string, "mimeType": string, "structData": { object } } |
| Fields | |
|---|---|
uri |
Uri of the document. |
title |
Title of the document. |
mimeType |
The MIME type of the document. See https://www.iana.org/assignments/media-types/media-types.xhtml. |
structData |
Data representation. The structured JSON data for the document. It should conform to the registered |
PageSpan
| JSON representation |
|---|
{ "pageStart": integer, "pageEnd": integer } |
| Fields | |
|---|---|
pageStart |
The start page of the chunk. |
pageEnd |
The end page of the chunk. |
ChunkMetadata
| JSON representation |
|---|
{ "previousChunks": [ { object ( |
| Fields | |
|---|---|
previousChunks[] |
The previous chunks of the current chunk. The number is controlled by |
nextChunks[] |
The next chunks of the current chunk. The number is controlled by |
AnnotationMetadata
| JSON representation |
|---|
{
"structuredContent": {
object ( |
| Fields | |
|---|---|
structuredContent |
Output only. The structured content information. |
imageId |
Output only. Image id is provided if the structured content is based on an image. |
StructuredContent
| JSON representation |
|---|
{
"structureType": enum ( |
| Fields | |
|---|---|
structureType |
Output only. The structure type of the structured content. |
content |
Output only. The content of the structured content. |
ModelScoresEntry
| JSON representation |
|---|
{
"key": string,
"value": {
object ( |
| Fields | |
|---|---|
key |
|
value |
|
DoubleList
| JSON representation |
|---|
{ "values": [ number ] } |
| Fields | |
|---|---|
values[] |
Double values. |
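`modelScores` maps a score name to a DoubleList. A small accessor is sketched below; the score key used by a caller is not specified in this section, so any key passed in is an assumption, and the helper name is hypothetical.

```python
from typing import Optional


def first_score(result: dict, name: str) -> Optional[float]:
    """Return the first value of the named model score, or None if absent."""
    values = result.get("modelScores", {}).get(name, {}).get("values", [])
    return values[0] if values else None
```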
RankSignals
| JSON representation |
|---|
{ "defaultRank": number, "customSignals": [ { object ( |
| Fields | |
|---|---|
defaultRank |
Optional. The default rank of the result. |
customSignals[] |
Optional. A list of custom clearbox signals. |
keywordSimilarityScore |
Optional. Keyword matching adjustment. |
relevanceScore |
Optional. Semantic relevance adjustment. |
semanticSimilarityScore |
Optional. Semantic similarity adjustment. |
pctrRank |
Optional. Predicted conversion rate adjustment as a rank. |
topicalityRank |
Optional. Topicality adjustment as a rank. |
documentAge |
Optional. Age of the document in hours. |
boostingFactor |
Optional. Combined custom boosts for a doc. |
CustomSignal
| JSON representation |
|---|
{ "name": string, "value": number } |
| Fields | |
|---|---|
name |
Optional. Name of the signal. |
value |
Optional. Float value representing the ranking signal (e.g. 1.25 for BM25). |
Facet
| JSON representation |
|---|
{
"key": string,
"values": [
{
object ( |
| Fields | |
|---|---|
key |
The key for this facet. For example, |
values[] |
The facet values for this field. |
dynamicFacet |
Whether the facet is dynamically generated. |
FacetValue
| JSON representation |
|---|
{ "count": string, // Union field |
| Fields | |
|---|---|
count |
Number of items that have this facet value. |
Union field facet_value. A facet value which contains values. facet_value can be only one of the following: |
|
value |
Text value of a facet, such as "Black" for facet "colors". |
interval |
Interval value for a facet, such as [10, 20) for facet "price". It matches |
Interval
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field This field must not be larger than max. Otherwise, an |
|
minimum |
Inclusive lower bound. |
exclusiveMinimum |
Exclusive lower bound. |
Union field This field must not be smaller than min. Otherwise, an |
|
maximum |
Inclusive upper bound. |
exclusiveMaximum |
Exclusive upper bound. |
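The four bounds above combine into a membership test. The sketch below treats a missing bound as unbounded on that side, which is an assumption; the function name is hypothetical.

```python
def in_interval(x: float, interval: dict) -> bool:
    """Check whether x lies inside an Interval dict; missing bounds are open-ended."""
    if "minimum" in interval and x < interval["minimum"]:
        return False
    if "exclusiveMinimum" in interval and x <= interval["exclusiveMinimum"]:
        return False
    if "maximum" in interval and x > interval["maximum"]:
        return False
    if "exclusiveMaximum" in interval and x >= interval["exclusiveMaximum"]:
        return False
    return True
```

For the `[10, 20)` price example, `{"minimum": 10, "exclusiveMaximum": 20}` accepts 10 and rejects 20.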
GuidedSearchResult
| JSON representation |
|---|
{
"refinementAttributes": [
{
object ( |
| Fields | |
|---|---|
refinementAttributes[] |
A list of ranked refinement attributes. |
followUpQuestions[] |
Suggested follow-up questions. |
RefinementAttribute
| JSON representation |
|---|
{ "attributeKey": string, "attributeValue": string } |
| Fields | |
|---|---|
attributeKey |
Attribute key used to refine the results. For example, |
attributeValue |
Attribute value used to refine the results. For example, |
Summary
| JSON representation |
|---|
{ "summaryText": string, "summarySkippedReasons": [ enum ( |
| Fields | |
|---|---|
summaryText |
The summary content. |
summarySkippedReasons[] |
Additional summary-skipped reasons. This provides the reason for ignored cases. If nothing is skipped, this field is not set. |
safetyAttributes |
A collection of Safety Attribute categories and their associated confidence scores. |
summaryWithMetadata |
Summary with metadata information. |
SafetyAttributes
| JSON representation |
|---|
{ "categories": [ string ], "scores": [ number ] } |
| Fields | |
|---|---|
categories[] |
The display names of Safety Attribute categories associated with the generated content. Order matches the Scores. |
scores[] |
The confidence score of each category; a higher value means higher confidence. Order matches the Categories. |
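Because `categories[]` and `scores[]` are parallel arrays, pairing them by position gives a lookup table. A minimal sketch follows; the helper name is hypothetical, and treating a length mismatch as an error is a design choice of this example rather than documented behavior.

```python
def safety_scores(attrs: dict) -> dict:
    """Pair each safety category with its confidence score by position."""
    categories = attrs.get("categories", [])
    scores = attrs.get("scores", [])
    if len(categories) != len(scores):
        raise ValueError("categories and scores must have the same length")
    return dict(zip(categories, scores))
```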
SummaryWithMetadata
| JSON representation |
|---|
{ "summary": string, "citationMetadata": { object ( |
| Fields | |
|---|---|
summary |
Summary text with no citation information. |
citationMetadata |
Citation metadata for given summary. |
references[] |
Document References. |
blobAttachments[] |
Output only. Store multimodal data for answer enhancement. |
CitationMetadata
| JSON representation |
|---|
{
"citations": [
{
object ( |
| Fields | |
|---|---|
citations[] |
Citations for segments. |
Citation
| JSON representation |
|---|
{
"startIndex": string,
"endIndex": string,
"sources": [
{
object ( |
| Fields | |
|---|---|
startIndex |
Index indicates the start of the segment, measured in bytes/unicode. |
endIndex |
End of the attributed segment, exclusive. |
sources[] |
Citation sources for the attributed segment. |
CitationSource
| JSON representation |
|---|
{ "referenceIndex": string } |
| Fields | |
|---|---|
referenceIndex |
Document reference index from SummaryWithMetadata.references. It is 0-indexed and the value will be zero if the reference_index is not set explicitly. |
Reference
| JSON representation |
|---|
{
"title": string,
"document": string,
"uri": string,
"chunkContents": [
{
object ( |
| Fields | |
|---|---|
title |
Title of the document. |
document |
Required. |
uri |
Cloud Storage or HTTP uri for the document. |
chunkContents[] |
List of cited chunk contents derived from document content. |
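Citations link a segment of the summary to entries in `references[]` through `referenceIndex`. The sketch below resolves each citation to its segment text and reference titles. Two assumptions are made here: the indices are treated as character offsets into the summary string (the field description says "bytes/unicode", so byte offsets are also possible), and a missing `referenceIndex` is read as 0, as stated above. The function name is hypothetical.

```python
from typing import List, Tuple


def cited_segments(swm: dict) -> List[Tuple[str, List[str]]]:
    """Return (segment text, reference titles) for each citation in a SummaryWithMetadata."""
    text = swm.get("summary", "")
    refs = swm.get("references", [])
    out = []
    for citation in swm.get("citationMetadata", {}).get("citations", []):
        # 64-bit integers are serialized as strings, so convert before slicing.
        start = int(citation.get("startIndex", 0))
        end = int(citation.get("endIndex", len(text)))  # endIndex is exclusive
        titles = [refs[int(src.get("referenceIndex", 0))].get("title", "")
                  for src in citation.get("sources", [])]
        out.append((text[start:end], titles))
    return out
```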
ChunkContent
| JSON representation |
|---|
{ "content": string, "pageIdentifier": string, "blobAttachmentIndexes": [ string ] } |
| Fields | |
|---|---|
content |
Chunk textual content. |
pageIdentifier |
Page identifier. |
blobAttachmentIndexes[] |
Output only. Stores indexes of blob attachments linked to this chunk. |
BlobAttachment
| JSON representation |
|---|
{ "data": { object ( |
| Fields | |
|---|---|
data |
Output only. The blob data. |
attributionType |
Output only. The attribution type of the blob. |
Blob
| JSON representation |
|---|
{ "mimeType": string, "data": string } |
| Fields | |
|---|---|
mimeType |
Output only. The media type (MIME type) of the generated data. |
data |
Output only. Raw bytes. A base64-encoded string. |
GeoSearchDebugInfo
| JSON representation |
|---|
{ "originalAddressQuery": string, "errorMessage": string } |
| Fields | |
|---|---|
originalAddressQuery |
The address from which forward geocoding ingestion produced issues. |
errorMessage |
The error produced. |
QueryExpansionInfo
| JSON representation |
|---|
{ "expandedQuery": boolean, "pinnedResultCount": string } |
| Fields | |
|---|---|
expandedQuery |
Bool describing whether query expansion has occurred. |
pinnedResultCount |
Number of pinned results. This field will only be set when expansion happens and |
NaturalLanguageQueryUnderstandingInfo
| JSON representation |
|---|
{
"extractedFilters": string,
"rewrittenQuery": string,
"classifiedIntents": [
string
],
"structuredExtractedFilter": {
object ( |
| Fields | |
|---|---|
extractedFilters |
The filters that were extracted from the input query. |
rewrittenQuery |
Rewritten input query minus the extracted filters. |
classifiedIntents[] |
The classified intents from the input query. |
structuredExtractedFilter |
The filters that were extracted from the input query represented in a structured form. |
StructuredExtractedFilter
| JSON representation |
|---|
{
"expression": {
object ( |
| Fields | |
|---|---|
expression |
The expression denoting the filter that was extracted from the input query in a structured form. It can be a simple expression denoting a single string, numerical, or geolocation constraint, or a compound expression that combines multiple expressions using logical (OR and AND) operators. |
Expression
| JSON representation |
|---|
{ // Union field |
| Fields | |
|---|---|
Union field expr. The expression type. expr can be only one of the following: |
|
stringConstraint |
String constraint expression. |
numberConstraint |
Numerical constraint expression. |
geolocationConstraint |
Geolocation constraint expression. |
andExpr |
Logical "And" compound operator connecting multiple expressions. |
orExpr |
Logical "Or" compound operator connecting multiple expressions. |
StringConstraint
| JSON representation |
|---|
{ "fieldName": string, "values": [ string ], "querySegment": string } |
| Fields | |
|---|---|
fieldName |
Name of the string field as defined in the schema. |
values[] |
Values of the string field. The record will only be returned if the field value matches one of the values specified here. |
querySegment |
Identifies the keywords within the search query that match a filter. |
NumberConstraint
| JSON representation |
|---|
{
"fieldName": string,
"comparison": enum ( |
| Fields | |
|---|---|
fieldName |
Name of the numerical field as defined in the schema. |
comparison |
The comparison operation performed between the field value and the value specified in the constraint. |
value |
The value specified in the numerical constraint. |
querySegment |
Identifies the keywords within the search query that match a filter. |
GeolocationConstraint
| JSON representation |
|---|
{ "fieldName": string, "address": string, "latitude": number, "longitude": number, "radiusInMeters": number } |
| Fields | |
|---|---|
fieldName |
The name of the geolocation field as defined in the schema. |
address |
The reference address that was inferred from the input query. The proximity of the reference address to the geolocation field will be used to filter the results. |
latitude |
The latitude of the geolocation inferred from the input query. |
longitude |
The longitude of the geolocation inferred from the input query. |
radiusInMeters |
The radius in meters around the address. The record is returned if the location of the geolocation field is within the radius. |
AndExpression
| JSON representation |
|---|
{
"expressions": [
{
object ( |
| Fields | |
|---|---|
expressions[] |
The expressions that were ANDed together. |
OrExpression
| JSON representation |
|---|
{
"expressions": [
{
object ( |
| Fields | |
|---|---|
expressions[] |
The expressions that were ORed together. |
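Since AndExpression and OrExpression contain further Expression objects, a structured filter is a tree. The sketch below walks that tree recursively and renders it as a readable string for logging or debugging. The output syntax is invented for this example and is not the filter syntax accepted by the API; the `comparison` enum names are not listed in this section and are printed as received.

```python
def render(expr: dict) -> str:
    """Render an Expression tree as a human-readable string (illustrative syntax only)."""
    if "stringConstraint" in expr:
        c = expr["stringConstraint"]
        values = ", ".join(repr(v) for v in c.get("values", []))
        return f"{c['fieldName']} in ({values})"
    if "numberConstraint" in expr:
        c = expr["numberConstraint"]
        return f"{c['fieldName']} {c.get('comparison', '?')} {c.get('value')}"
    if "geolocationConstraint" in expr:
        c = expr["geolocationConstraint"]
        return f"{c['fieldName']} within {c.get('radiusInMeters')}m of {c.get('address')!r}"
    for key, word in (("andExpr", " AND "), ("orExpr", " OR ")):
        if key in expr:
            # Compound nodes recurse into their child expressions.
            parts = [render(e) for e in expr[key].get("expressions", [])]
            return "(" + word.join(parts) + ")"
    raise ValueError("Expression has no recognized kind")
```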
SessionInfo
| JSON representation |
|---|
{ "name": string, "queryId": string } |
| Fields | |
|---|---|
name |
Name of the session. If the auto-session mode is used (when |
queryId |
Query ID that corresponds to this search API call. One session can have multiple turns, each with a unique query ID. By specifying the session name and this query ID in the Answer API call, the answer generation happens in the context of the search results from this search call. |
OneBoxResult
| JSON representation |
|---|
{ "oneBoxType": enum ( |
| Fields | |
|---|---|
oneBoxType |
The type of One Box result. |
searchResults[] |
The search results for this One Box. |
SearchLinkPromotion
| JSON representation |
|---|
{ "title": string, "uri": string, "document": string, "imageUri": string, "description": string, "enabled": boolean } |
| Fields | |
|---|---|
title |
Required. The title of the promotion. Maximum length: 160 characters. |
uri |
Optional. The URL for the page the user wants to promote. Must be set for site search. For other verticals, this is optional. |
document |
Optional. The |
imageUri |
Optional. The promotion thumbnail image url. |
description |
Optional. The Promotion description. Maximum length: 200 characters. |
enabled |
Optional. The enabled promotion will be returned for any serving configs associated with the parent of the control this promotion is attached to. This flag is used for basic site search only. |
Tool Annotations
Destructive Hint: ❌ | Idempotent Hint: ✅ | Read Only Hint: ✅ | Open World Hint: ❌