VertexRanker is a semantic reranking layer powered by the Ranking API.
After Vector Search 2.0 retrieves candidate results and fuses them with
Reciprocal Rank Fusion (RRF), VertexRanker rescores the merged
candidates against your natural-language query using a dedicated semantic model.
This improves the relevance of the top-k results, especially for queries where
retrieval alone misses nuance. Reranking happens server-side as part of the
BatchSearchDataObjects call, so no additional client-side wiring or
round trips are needed.
VertexRanker
VertexRanker is configured as the reranker for the combine.ranker field of the
BatchSearchDataObjectsRequest object. You must configure a primary ranker
(only RRF is supported). The reranker runs after the RRF fusion step and
replaces the output with the semantically rescored list.
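To make the fusion step concrete, the following is a minimal sketch of weighted Reciprocal Rank Fusion as it is commonly defined. It is not the service's implementation: the function name `rrf_fuse`, the constant `k=60`, and the way weights scale each list's contribution are assumptions for illustration only.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, weights, k=60):
    """Fuse ranked ID lists with weighted Reciprocal Rank Fusion.

    k=60 is the constant commonly used for RRF; the value the service
    uses is not stated in this document, so treat it as an assumption.
    """
    if len(ranked_lists) != len(weights):
        raise ValueError("expected one weight per ranked list")
    scores = defaultdict(float)
    for ids, weight in zip(ranked_lists, weights):
        # Ranks are 1-based: the first hit contributes weight / (k + 1).
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic_hits = ["a", "b", "c"]
text_hits = ["b", "d", "a"]
fused = rrf_fuse([semantic_hits, text_hits], weights=[1.0, 1.0])
```

In this sketch, records that appear near the top of several lists accumulate the highest fused score. The reranker then rescores that fused list semantically.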
The following table lists the ranker configuration fields.
| Field | Required | Description |
|---|---|---|
| combine.ranker.rrf.weights | Yes | The weights for the RRF fusion of the underlying search results. |
| combine.ranker.vertex_ranker.model | Yes | The ranking model name. Only semantic-ranker-fast@latest is supported. |
| combine.ranker.vertex_ranker.top_n | Yes | The maximum number of candidates from the fused list to send to the ranker. Valid values are from 0 to 200. |
| combine.ranker.vertex_ranker.text_record_spec.query | Yes | The natural language query used by the ranker to score records. |
| combine.ranker.vertex_ranker.text_record_spec.title_template | Yes | The template string, for example, {title}, specifying how to construct the title for each record. Use dot-paths to specify data fields. |
| combine.ranker.vertex_ranker.text_record_spec.content_template | Yes | The template string, for example, {body.text}, specifying how to construct the main content for each record. Use dot-paths to specify data fields. |
| combine.top_k | No | The final number of results to return after reranking. This must be less than or equal to vertex_ranker.top_n and less than or equal to 200. |
The following example demonstrates the body of a request.
```json
{
  "searches": [
    { "semantic_search": { "search_text": "running shoes", "search_field": "embedding", "task_type": "RETRIEVAL_QUERY", "top_k": 50 } },
    { "text_search": { "search_text": "running shoes", "data_field_names": ["title"], "top_k": 50 } }
  ],
  "combine": {
    "top_k": 10,
    "ranker": {
      "rrf": { "weights": [1.0, 1.0] },
      "vertex_ranker": {
        "model": "semantic-ranker-fast@latest",
        "top_n": 50,
        "text_record_spec": {
          "query": "running shoes",
          "title_template": "{title}",
          "content_template": "{body.text}"
        }
      }
    }
  }
}
```
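If you assemble this body programmatically, a small helper keeps the nesting in one place. The following sketch uses only the field names shown in the example request. The helper name `build_rerank_request` is hypothetical, and the check that there is one RRF weight per search is an assumption inferred from the example, not a documented rule.

```python
import json


def build_rerank_request(query, searches, weights, top_n, top_k,
                         title_template="{title}",
                         content_template="{body.text}"):
    """Return a request body dict shaped like the example request."""
    if len(weights) != len(searches):
        # Assumption: one RRF weight per entry in "searches".
        raise ValueError("expected one RRF weight per search")
    return {
        "searches": searches,
        "combine": {
            "top_k": top_k,
            "ranker": {
                "rrf": {"weights": list(weights)},
                "vertex_ranker": {
                    "model": "semantic-ranker-fast@latest",
                    "top_n": top_n,
                    "text_record_spec": {
                        "query": query,
                        "title_template": title_template,
                        "content_template": content_template,
                    },
                },
            },
        },
    }


searches = [
    {"semantic_search": {"search_text": "running shoes",
                         "search_field": "embedding",
                         "task_type": "RETRIEVAL_QUERY",
                         "top_k": 50}},
    {"text_search": {"search_text": "running shoes",
                     "data_field_names": ["title"],
                     "top_k": 50}},
]
body = build_rerank_request("running shoes", searches, [1.0, 1.0],
                            top_n=50, top_k=10)
print(json.dumps(body, indent=2))
```

Sending the body (transport, authentication, endpoint) is left out on purpose, because this section does not describe it.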
Templates use dot-paths to Data Object data fields, for example, {title}
or {nested.field}. These fields must exist in the Collection schema;
otherwise, the request is rejected during validation. If a retrieved Data Object is
missing a referenced field, the server transparently refetches the full record
from storage so the template can still be populated.
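To preview what a template would produce for a given record, you can resolve the dot-paths locally. The following sketch is illustrative only: `resolve_path` and `render_template` are hypothetical helpers, and the placeholder grammar (a dot-path in curly braces) is inferred from the {title} and {body.text} examples rather than from a documented specification.

```python
import re

# Matches placeholders such as {title} or {body.text}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def resolve_path(data, path):
    """Walk a dot-path such as 'body.text' through nested dicts."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(path)
        current = current[key]
    return current


def render_template(template, data):
    """Replace each {dot.path} placeholder with the value found in data."""
    return _PLACEHOLDER.sub(
        lambda match: str(resolve_path(data, match.group(1))), template
    )


record = {"title": "Trail runner", "body": {"text": "Lightweight shoe."}}
title = render_template("{title}", record)
content = render_template("{body.text}", record)
```

A local `KeyError` here only signals that the sample record lacks the field. On the server, validation is against the Collection schema, as described above.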
Quota
Free during preview: VertexRanker traffic is provided at no extra cost up to
300 queries per minute (QPM) for each consumer project. Requests
above the 300 QPM ceiling are rejected by the ranker, and the rejection is reported as a warning.
The call still returns the fused (RRF) results, so search remains available.
The maximum number of records sent to the ranker per request is 200
(top_n must be in the range [1, 200], and top_k must be less than or equal to
top_n).
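A client can check these limits before sending a request and throttle itself to stay under the free quota. The sketch below is one possible approach, not part of the API: `check_limits` and `QpmLimiter` are hypothetical names, and the sliding 60-second window is an assumption about how to approximate a per-minute quota on the client side.

```python
import time
from collections import deque

MAX_TOP_N = 200   # documented maximum records per request
FREE_QPM = 300    # documented free quota during preview


def check_limits(top_n, top_k):
    """Raise ValueError if top_n or top_k violates the documented limits."""
    if not 1 <= top_n <= MAX_TOP_N:
        raise ValueError(f"top_n must be in [1, {MAX_TOP_N}], got {top_n}")
    if top_k > top_n:
        raise ValueError(f"top_k ({top_k}) must not exceed top_n ({top_n})")


class QpmLimiter:
    """Sliding-window limiter: at most `limit` sends per `window` seconds."""

    def __init__(self, limit=FREE_QPM, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of sends inside the window

    def try_acquire(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

Because an over-quota request still returns fused (RRF) results, a limiter like this is optional. It mainly helps avoid silently losing reranking under bursty traffic.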
Failure handling and warnings
VertexRanker is a best-effort reranker. If the ranking call fails, the
BatchSearchDataObjects RPC still succeeds and returns the RRF-fused results
truncated to top_k. The failure is reported in the
search_response_metadata.warnings field in the response. The full status code
from the Ranking API is preserved. The warning code for unexpected statuses is
UNAVAILABLE with the warning message:
"Reranking is temporarily unavailable. Returning fused (RRF) results without
semantic reranking."
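Because a failed rerank still produces a successful response, callers that care about ranking quality need to inspect the warnings explicitly. The following sketch assumes the response has been parsed into a dict, that search_response_metadata sits at its top level, and that each warning carries `code` and `message` keys. Those shapes are assumptions for illustration and are not confirmed by this section.

```python
# Warning codes listed in this document's warning table.
RERANK_WARNING_CODES = {
    "RESOURCE_EXHAUSTED",
    "DEADLINE_EXCEEDED",
    "UNAVAILABLE",
    "FAILED_PRECONDITION",
    "CANCELLED",
}


def extract_warnings(response):
    """Return the list of warnings, or an empty list if none are present."""
    metadata = response.get("search_response_metadata") or {}
    return list(metadata.get("warnings") or [])


def rerank_fell_back(response):
    """True if any warning code suggests the results are RRF-only."""
    return any(
        warning.get("code") in RERANK_WARNING_CODES
        for warning in extract_warnings(response)
    )


sample = {
    "search_response_metadata": {
        "warnings": [
            {
                "code": "UNAVAILABLE",
                "message": "Reranking is temporarily unavailable. Returning "
                           "fused (RRF) results without semantic reranking.",
            }
        ]
    }
}
fell_back = rerank_fell_back(sample)
```

A caller might log the warning message and serve the fused results as-is, or flag the response so downstream consumers know it was not semantically reranked.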
Failure conditions
The entire BatchSearchDataObjects RPC fails with FAILED_PRECONDITION
(no fallback to RRF) in the following case.
- Discovery Engine API not enabled: the Ranking API returns
FAILED_PRECONDITION with the message "Discovery Engine API is not enabled for the consumer project <N>. Please enable the API and try again." Fix: enable the Discovery Engine API on the consumer project. Reranking cannot occur until the API is enabled.
The following table lists the warning codes, their typical causes, and the accompanying warning messages.
| Warning Code | Typical Cause | Warning Message |
|---|---|---|
| RESOURCE_EXHAUSTED | Exceeded the 300 QPM free quota on the consumer project. | <Quota exceeded message from SuperQuota>; RankService.Rank call failed for consumer project <N> with request query: <query> and <model>: semantic-ranker-fast@latest number of records: <N> |
| DEADLINE_EXCEEDED | Ranker.Rank did not complete within the request deadline. | <deadline-exceeded message from Ranker.Rank>; RankService.Rank call failed for consumer project <N> with request query: <query> and <model>: semantic-ranker-fast@latest number of records: <N> |
| UNAVAILABLE | Either Ranker.Rank returned UNAVAILABLE, or it returned a non-preserved code (for example, INTERNAL), which is collapsed to UNAVAILABLE. | If the underlying code is UNAVAILABLE: "RankService.Rank call failed for consumer project..." Otherwise: "Reranking is temporarily unavailable. Returning fused (RRF) results without semantic reranking." |
| FAILED_PRECONDITION | Any FAILED_PRECONDITION returned by the Ranking API or SuperQuota, excluding "Discovery Engine API is not enabled". | RankService.Rank call failed for consumer project <N> with request query: <query> and <model>: semantic-ranker-fast@latest number of records: <N> |
| CANCELLED | The caller cancelled the BatchSearchDataObjects RPC before reranking completed. | <cancellation message from the cancelled RankService.Rank RPC>; RankService.Rank call failed for consumer project <N> with request query: <query> and <model>: semantic-ranker-fast@latest number of records: <N> |
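One way to act on these codes is to separate those that are likely transient from those that need a configuration change. The policy below is a suggestion only, not documented behavior: which codes are worth retrying, the helper names, and the backoff parameters are all assumptions.

```python
# Assumption: these codes are usually transient, so a later retry may rerank.
RETRYABLE_CODES = {"RESOURCE_EXHAUSTED", "DEADLINE_EXCEEDED", "UNAVAILABLE"}


def should_retry(code):
    """Suggest whether retrying the search might yield reranked results."""
    return code in RETRYABLE_CODES


def backoff_seconds(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base * 2**attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


# Delays for the first four retry attempts.
delays = [backoff_seconds(attempt) for attempt in range(4)]
```

Under this policy, FAILED_PRECONDITION is not retried because it points at a setup problem, and CANCELLED is not retried because the caller ended the request. For RESOURCE_EXHAUSTED, waiting for the per-minute quota to recover is usually more useful than retrying immediately.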