Vertex AI RAG Engine is a component of the Vertex AI platform that facilitates Retrieval-Augmented Generation (RAG). RAG Engine enables large language models (LLMs) to access and incorporate data from external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative responses.
Parameters list
This section lists the following:
| Parameters | Examples |
|---|---|
| See Corpus management parameters. | See Corpus management examples. |
| See File management parameters. | See File management examples. |
| See Retrieval and prediction parameters. | See Retrieval query example. |
| See Project management parameters. | See Project management examples. |
Corpus management parameters
For information about a RAG corpus, see Corpus management.
Create a RAG corpus
This table lists the parameters used to create a RAG corpus.
Body Request
| Parameters | |
|---|---|
|
Optional: Immutable.
The configuration to specify the corpus type. |
|
Required: The display name of the RAG corpus. |
|
Optional: The description of the RAG corpus. |
|
Optional: Immutable: The CMEK key name is used to encrypt at-rest data that's related to the RAG corpus. The key name is only applicable to the Format: |
|
Optional: Immutable: The configuration for the vector databases. |
|
Optional: The configuration for the Vertex AI Search. Format: |
CorpusTypeConfig
| Parameters | |
|---|---|
|
The default value of |
|
If you set this type, the RAG corpus is a For more information, see Use Vertex AI RAG Engine as the memory store. |
|
The LLM parser that's used to parse and store session contexts from the Gemini Live API. You can build memories for indexing. |
RagVectorDbConfig
| Parameters | |
|---|---|
|
If no vector database is specified, |
|
Default. Finds the exact nearest neighbors by comparing all data points in your RAG corpus. If you don't specify a strategy during the creation of your RAG corpus, KNN is the default retrieval strategy used. |
|
Determines the number of layers or levels in the tree. If you have O(10K) RAG files in the RAG corpus, set this value to 2.
Determines the number of leaf nodes in the tree-based structure.
|
|
Specifies your Weaviate instance. |
|
The Weaviate instance's HTTP endpoint. This value can't be changed after it's set. You can leave it empty in
the |
|
The Weaviate collection that the RAG corpus maps to. This value can't be changed after it's set. You can leave it empty in
the |
|
Specifies your Pinecone instance. |
|
This is the name used to create the Pinecone index that's used with the RAG corpus. This value can't be changed after it's set. You can leave it empty in
the |
|
Specifies your Vertex AI Feature Store instance. |
|
The Vertex AI Feature Store Format: This value can't be changed after it's set. You can leave it empty in
the |
|
Specifies your Vertex Vector Search instance. |
|
This is the resource name of the Vector Search index that's used with the RAG corpus. Format: This value can't be changed after it's set. You can leave it empty in
the |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: This value can't be changed after it's set. You can leave it empty in
the |
|
This is the full resource name of the secret that is stored in Secret Manager, which contains your Weaviate or Pinecone API key, depending on your choice of vector database. Format: You can leave it empty in the |
|
Optional: Immutable: The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, we use text-embedding-005 as the embedding model. |
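The exhaustive KNN strategy described in the table above compares the query against every data point in the corpus. The following is a minimal sketch of that idea, not RAG Engine's implementation. The function names, the choice of cosine distance, and the toy vectors are assumptions made for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def exact_knn(query, vectors, k):
    """Compare the query against every stored vector and return the k closest IDs."""
    ranked = sorted(vectors, key=lambda vid: cosine_distance(query, vectors[vid]))
    return ranked[:k]

# Hypothetical embeddings keyed by chunk ID.
corpus_vectors = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.0, 1.0],
    "chunk-c": [0.9, 0.1],
}
nearest = exact_knn([1.0, 0.0], corpus_vectors, k=2)
```

Because every vector is scanned, the result is exact, but the cost grows linearly with the number of data points. A tree-based strategy trades some of that exactness for speed on larger corpora.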
Update a RAG corpus
This table lists the parameters used to update a RAG corpus.
Body Request
| Parameters | |
|---|---|
|
Optional: The display name of the RAG corpus. |
|
Optional: The description of the RAG corpus. |
|
The Weaviate instance's HTTP endpoint. If your |
|
The Weaviate collection that the RAG corpus maps to. If your |
|
This is the name used to create the Pinecone index that's used with the RAG corpus. If your |
|
The Vertex AI Feature Store Format: If your |
|
This is the resource name of the Vector Search index that's used with the RAG corpus. Format: If your |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: If your |
|
The full resource name of the secret that is stored in Secret Manager, which contains your Weaviate or Pinecone API key, depending on your choice of vector database. Format: |
List RAG corpora
This table lists the parameters used to list RAG corpora.
| Parameters | |
|---|---|
|
Optional: The standard list page size. |
|
Optional: The standard list page token. Typically obtained from |
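The page size and page token parameters above are typically used together in a loop. The following is a minimal sketch of that loop under stated assumptions: `fake_fetch_page` is a hypothetical stand-in for the real list call, and the `rag_corpora` response key is an assumption. Only the `next_page_token` name comes from this page.

```python
def list_all(fetch_page, page_size=2):
    """Collect every item by following page tokens until none is returned."""
    items, token = [], None
    while True:
        response = fetch_page(page_size=page_size, page_token=token)
        items.extend(response.get("rag_corpora", []))
        token = response.get("next_page_token")
        if not token:
            return items

# Hypothetical stand-in for the list call: serves three fake corpora.
FAKE_CORPORA = ["corpus-1", "corpus-2", "corpus-3"]

def fake_fetch_page(page_size, page_token=None):
    start = int(page_token or 0)
    end = start + page_size
    response = {"rag_corpora": FAKE_CORPORA[start:end]}
    if end < len(FAKE_CORPORA):
        # A token is only returned when more results remain.
        response["next_page_token"] = str(end)
    return response

all_corpora = list_all(fake_fetch_page)
```

In a real client, `fetch_page` would issue the GET request shown in the list example and pass the token from the previous response unchanged.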
Get a RAG corpus
This table lists parameters used to get a RAG corpus.
| Parameters | |
|---|---|
|
The name of the |
Delete a RAG corpus
This table lists parameters used to delete a RAG corpus.
| Parameters | |
|---|---|
|
The name of the |
Batch create metadata schemas
This table lists the parameters used to batch create metadata schemas for a RAG corpus.
Body Request
| Parameters | |
|---|---|
|
Required: list of The request messages for |
CreateRagDataSchemaRequest
| Parameters | |
|---|---|
|
Required: The metadata schema to create. |
RagDataSchema
| Parameters | |
|---|---|
|
Required: The key of the metadata schema. |
|
The details of the metadata schema. |
RagMetadataSchemaDetails
| Parameters | |
|---|---|
|
The data type of the metadata schema. Options: |
List metadata schemas
This table lists the parameters used to list metadata schemas.
| Parameters | |
|---|---|
|
Required: The resource name of the |
Batch delete metadata schemas
This table lists the parameters used to batch delete metadata schemas.
| Parameters | |
|---|---|
|
Required: list of The resource names of the |
File management parameters
For information about a RAG file and its metadata, see File management.
Upload a RAG file
This table lists parameters used to upload a RAG file.
Body Request
| Parameters | |
|---|---|
|
The name of the |
|
Required: The file to upload. |
|
Required: The configuration for the |
RagFile |
|
|---|---|
|
Required: The display name of the RAG file. |
|
Optional: The description of the RAG file. |
UploadRagFileConfig |
|
|---|---|
|
Number of tokens each chunk has. |
|
The overlap between chunks. |
Import RAG files
This table lists parameters used to import a RAG file.
| Parameters | |
|---|---|
|
Required: The name of the Format: |
|
Cloud Storage location. Supports importing individual files as well as entire Cloud Storage directories. |
|
Cloud Storage URI that contains the upload file. |
|
Google Drive location. Supports importing individual files as well as Google Drive folders. |
|
The Slack channel where the file is uploaded. |
|
The Jira query where the file is uploaded. |
|
The SharePoint sources where the file is uploaded. |
|
Number of tokens each chunk has. |
|
The overlap between chunks. |
|
Optional: Specifies the parsing configuration for If this field isn't set, RAG uses the default parser. |
|
Optional: The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value. If unspecified, a default value of 1,000 QPM is used. |
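The chunk size and chunk overlap parameters in the table above interact: each chunk starts a fixed number of tokens after the previous one, and consecutive chunks share the overlapping tokens. The following is a minimal sketch of that behavior, assuming a simple sliding window over a pre-tokenized list. It is an illustration only, and RAG Engine's actual chunker and tokenizer may behave differently.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end of the token list.
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Hypothetical input: ten whitespace-separated "tokens".
tokens = "a b c d e f g h i j".split()
chunks = chunk_tokens(tokens, chunk_size=4, chunk_overlap=1)
```

With these values, the last token of each chunk is repeated as the first token of the next one, which helps preserve context that would otherwise be cut at a chunk boundary.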
GoogleDriveSource |
|
|---|---|
|
Required: The ID of the Google Drive resource. |
|
Required: The type of the Google Drive resource. |
SlackSource |
|
|---|---|
|
Repeated: Slack channel information, including the ID and the time range to import. |
|
Required: The Slack channel ID. |
|
Optional: The starting timestamp for messages to import. |
|
Optional: The ending timestamp for messages to import. |
|
Required: The full resource name of the secret that is stored in Secret Manager,
which contains a Slack channel access token that has access to the Slack channel IDs.
Format: |
JiraSource |
|
|---|---|
|
Repeated: A list of Jira projects to import in their entirety. |
|
Repeated: A list of custom Jira queries to import. For information about JQL (Jira Query Language), see
|
|
Required: The Jira email address. |
|
Required: The Jira server URI. |
|
Required: The full resource name of the secret that is stored in Secret Manager,
which contains the Jira API key.
Format: |
SharePointSources |
|
|---|---|
|
The path of the SharePoint folder to download from. |
|
The ID of the SharePoint folder to download from. |
|
The name of the drive to download from. |
|
The ID of the drive to download from. |
|
The Application ID for the app registered in Microsoft Azure Portal.
|
|
Required: The full resource name of the secret that is stored in Secret Manager, which contains the application secret for the app registered in Azure. Format: |
|
Unique identifier of the Azure Active Directory Instance. |
|
The name of the SharePoint site to download from. This can be the site name or the site ID. |
RagFileParsingConfig |
|
|---|---|
|
The Layout Parser to use for |
|
The full resource name of a Document AI processor or processor version. Format:
|
|
The maximum number of requests the job is allowed to make to the Document AI processor per minute. Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used. |
|
The LLM parser to use for |
|
The resource name of an LLM model. Format:
|
|
The maximum number of requests the job is allowed to make to the LLM model per minute. To set an appropriate value for your project, see the model quota section and the Quota page for your project. If unspecified, a default value of 5000 QPM is used. |
Get a RAG file
This table lists parameters used to get a RAG file.
| Parameters | |
|---|---|
|
The name of the |
Delete a RAG file
This table lists parameters used to delete a RAG file.
| Parameters | |
|---|---|
|
The name of the |
Batch create metadata
This table lists the parameters used to batch create metadata for a RAG file.
Body Request
| Parameters | |
|---|---|
|
Required: list of The request messages for |
CreateRagMetadataRequest
| Parameters | |
|---|---|
|
Required: The metadata to create. |
|
Optional: The ID to use for the metadata, which will become the final component of the metadata's resource name. |
RagMetadata
| Parameters | |
|---|---|
|
The metadata provided by users. |
UserSpecifiedMetadata
| Parameters | |
|---|---|
|
Required: The key of the metadata. The key must correspond to a key defined in a |
|
The value of the metadata. |
MetadataValue
| Parameters | |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
List metadata
This table lists the parameters used to list metadata for a RAG file.
| Parameters | |
|---|---|
|
Required: The resource name of the |
Update metadata
This table lists the parameters used to update metadata.
| Parameters | |
|---|---|
|
Required: The |
Batch delete metadata
This table lists the parameters used to batch delete metadata.
| Parameters | |
|---|---|
|
Required: list of The resource names of the |
Retrieval and prediction parameters
This section lists the retrieval and prediction parameters.
Retrieval parameters
This table lists the parameters for the retrieveContexts API.
| Parameters | |
|---|---|
|
Required: The resource name of the Location to retrieve Format: |
|
The data source for Vertex RagStore. |
|
Required: Single RAG retrieve query. |
VertexRagStore
VertexRagStore |
|
|---|---|
|
list: The representation of the RAG source. It can be used to specify the corpus
only or |
|
Optional:
Format: |
|
list: A list of Format: |
RagQuery |
|
|---|---|
|
The query in text format to get relevant contexts. |
|
Optional: The retrieval configuration for the query. |
RagRetrievalConfig |
|
|---|---|
|
Optional: The number of contexts to retrieve. |
|
Optional: Alpha value controls the weight between dense and sparse vector search results. The range is [0, 1], where 0 means sparse vector search only and 1 means dense vector search only. The default value is 0.5, which balances sparse and dense vector search equally. Hybrid Search is only available for Weaviate. |
|
Only returns contexts with a vector distance smaller than the threshold. |
|
Optional: The metadata filter to apply during retrieval, using Common Expression Language (CEL). For more information, see [Metadata search](/vertex-ai/generative-ai/docs/rag-engine/use-metadata-search). Example: |
|
Only returns contexts with vector similarity larger than the threshold. |
|
Optional: The model name of the rank service. Example: |
|
Optional: The model name used for ranking. Example: |
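The alpha weighting and the distance threshold described in the table above can be illustrated with a short sketch. The linear blend below is one common way to realize such a weighting and is an assumption, not a statement of how RAG Engine or Weaviate fuses scores. The function names and sample contexts are hypothetical.

```python
def hybrid_score(dense_score, sparse_score, alpha=0.5):
    """Blend dense and sparse relevance scores with a weight alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in the range [0, 1]")
    # alpha = 1 keeps only the dense score; alpha = 0 keeps only the sparse score.
    return alpha * dense_score + (1.0 - alpha) * sparse_score

def filter_by_distance(contexts, threshold):
    """Keep only contexts whose vector distance is smaller than the threshold."""
    return [c for c in contexts if c["distance"] < threshold]

# Hypothetical retrieved contexts.
contexts = [
    {"text": "close match", "distance": 0.2},
    {"text": "weak match", "distance": 0.7},
]
kept = filter_by_distance(contexts, threshold=0.5)
balanced = hybrid_score(0.8, 0.2)
```

The default alpha of 0.5 weights both scores equally, which matches the default described in the table.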
Prediction parameters
This table lists prediction parameters.
GenerateContentRequest |
|
|---|---|
|
Set to use a data source powered by Vertex AI RAG store. |
See VertexRagStore for details.
Project management parameters
This table lists project-level parameters.
RagEngineConfig
| Parameters | |
|---|---|
| RagManagedDbConfig.serverless | Sets/Switches the deployment mode to Serverless, providing a fully-managed and highly scalable database to back your RAG Engine resources. |
| RagManagedDbConfig.spanner | Sets/Switches the deployment mode to Spanner, backed by a production-ready Spanner instance. |
| RagManagedDbConfig.spanner.scaled | This tier offers production-scale performance along with autoscaling functionality under Spanner mode. |
| RagManagedDbConfig.spanner.basic | This tier offers a cost-effective and low-compute tier under Spanner mode. |
| RagManagedDbConfig.spanner.unprovisioned | This tier deletes the RagManagedDb and its underlying Spanner instance. |
Corpus management examples
This section provides examples of how to use the API to manage your RAG corpus.
Create a RAG corpus example
This code sample demonstrates how to create a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
- CORPUS_DESCRIPTION: The description of the RagCorpus.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora
Request JSON body:
{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION"
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content
The following example demonstrates how to create a RAG corpus by using the REST API.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
# CreateRagCorpus
# Input: LOCATION, PROJECT_ID, CORPUS_DISPLAY_NAME
# Output: CreateRagCorpusOperationMetadata
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
-d '{
"display_name" : "CORPUS_DISPLAY_NAME"
}'
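The same create request can also be assembled programmatically. The following is a minimal sketch that builds, but does not send, the POST request by using only the Python standard library. The helper name `build_create_corpus_request`, the sample project values, and the `ACCESS_TOKEN` placeholder are assumptions for illustration. The URL and body fields mirror the REST example above.

```python
import json
import urllib.request

def build_create_corpus_request(project_id, location, display_name,
                                description=None, access_token="ACCESS_TOKEN"):
    """Build (without sending) the POST request that creates a RAG corpus."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    body = {"display_name": display_name}
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )

# Hypothetical values; replace them with your own project settings.
request = build_create_corpus_request(
    "my-project", "us-central1", "my-corpus", "Demo corpus"
)
```

To send the request, pass it to `urllib.request.urlopen` after replacing the placeholder with a real access token, for example the output of `gcloud auth print-access-token`.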
Update a RAG corpus example
You can update your RAG corpus with a new display name, description, and vector database configuration. However, the following restrictions apply:
- You can't change the vector database type. For example, you can't change the vector database from Weaviate to Vertex AI Feature Store.
- If you're using the managed database option, you can't update the vector database configuration.
These examples demonstrate how to update a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_ID: The corpus ID of your RAG corpus.
- CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
- CORPUS_DESCRIPTION: The description of the RagCorpus.
- INDEX_NAME: The resource name of the Vector Search Index. Format: projects/{project}/locations/{location}/indexes/{index}
- INDEX_ENDPOINT_NAME: The resource name of the Vector Search Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID
Request JSON body:
{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "rag_vector_db_config": {
    "vertex_vector_search": {
      "index": "INDEX_NAME",
      "index_endpoint": "INDEX_ENDPOINT_NAME"
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID" | Select-Object -Expand Content
List RAG corpora example
This code sample demonstrates how to list all of the RAG corpora.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- PAGE_SIZE: The standard list page size. You may adjust the number of RagCorpora to return per page by updating the page_size parameter.
- PAGE_TOKEN: The standard list page token. Obtained typically using ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
A successful response returns a list of RagCorpora under the given PROJECT_ID.
Get a RAG corpus example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the RagCorpus resource.
The following example uses the get and list commands to demonstrate how
RagCorpus uses the rag_embedding_model_config field within the vector_db_config, which points to the
embedding model that you chose.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
# GetRagCorpus
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID
# Output: RagCorpus
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
# ListRagCorpora
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/
Delete a RAG corpus example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the DeleteOperationMetadata.
Batch create metadata schemas example
This code sample demonstrates how to batch create metadata schemas for a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- SCHEMA_KEY_1: The key for the first metadata schema.
- SCHEMA_TYPE_1: The data type for the first metadata schema (for example, INTEGER).
- SCHEMA_KEY_2: The key for the second metadata schema.
- SCHEMA_TYPE_2: The data type for the second metadata schema (for example, STRING).
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchCreate
Request JSON body:
{
"requests": [
{
"rag_data_schema": {
"key": "SCHEMA_KEY_1",
"schema_details": {"type": "SCHEMA_TYPE_1"}
}
},
{
"rag_data_schema": {
"key": "SCHEMA_KEY_2",
"schema_details": {"type": "SCHEMA_TYPE_2"}
}
}
]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchCreate"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchCreate" | Select-Object -Expand Content
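When there are many schemas to create, the request body can be generated from a simple mapping instead of being written by hand. The following is a minimal sketch of that approach. The helper name and the sample keys `year` and `author` are hypothetical. The body layout mirrors the request JSON shown above.

```python
import json

def build_batch_create_schemas_body(schemas):
    """Turn a {key: type} mapping into the batchCreate request body."""
    return {
        "requests": [
            {"rag_data_schema": {"key": key, "schema_details": {"type": data_type}}}
            for key, data_type in schemas.items()
        ]
    }

# Hypothetical metadata keys, using the data types from the example above.
body = build_batch_create_schemas_body({"year": "INTEGER", "author": "STRING"})
request_json = json.dumps(body, indent=2)
```

The resulting `request_json` string can be saved as request.json and sent with the curl or PowerShell command above.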
List metadata schemas example
This code sample demonstrates how to list metadata schemas for a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas" | Select-Object -Expand Content
A successful response returns a list of RagDataSchema resources.
Batch delete metadata schemas example
This code sample demonstrates how to batch delete metadata schemas.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- SCHEMA_ID_1: The ID of the first metadata schema to delete.
- SCHEMA_ID_2: The ID of the second metadata schema to delete.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchDelete
Request JSON body:
{
"names": [
"projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas/SCHEMA_ID_1",
"projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas/SCHEMA_ID_2"
]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchDelete"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragDataSchemas:batchDelete" | Select-Object -Expand Content
File management examples
This section provides examples of how to use the API to manage RAG files.
Upload a RAG file example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- LOCAL_FILE_PATH: The local path to the file to be uploaded.
- DISPLAY_NAME: The display name of the RAG file.
- DESCRIPTION: The description of the RAG file.
To send your request, use the following command:
curl -X POST \
-H "X-Goog-Upload-Protocol: multipart" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-F metadata="{'rag_file': {'display_name':'DISPLAY_NAME', 'description':'DESCRIPTION'}}" \
-F file=@LOCAL_FILE_PATH \
"https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"
Import RAG files example
Files and folders can be imported from Drive or Cloud Storage.
The response.skipped_rag_files_count refers to the number of files that
were skipped during import. A file is skipped when the following conditions are
met:
- The file has already been imported.
- The file hasn't changed.
- The chunking configuration for the file hasn't changed.
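The three skip conditions above can be expressed as a single check. The following is a minimal sketch of that decision, not RAG Engine's implementation. The `ImportRecord` type, the use of a content hash to detect changes, and the sample values are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ImportRecord:
    """Hypothetical snapshot of a file at import time."""
    content_hash: str
    chunk_size: int
    chunk_overlap: int

def should_skip(previous: Optional[ImportRecord], candidate: ImportRecord) -> bool:
    """Skip only if the file was imported before and nothing relevant changed."""
    if previous is None:
        return False  # Never imported, so it must be processed.
    # Dataclass equality compares content hash and both chunking settings.
    return previous == candidate

first_import = ImportRecord("abc123", chunk_size=512, chunk_overlap=100)
unchanged = ImportRecord("abc123", chunk_size=512, chunk_overlap=100)
rechunked = ImportRecord("abc123", chunk_size=256, chunk_overlap=100)
```

Here, `should_skip(first_import, unchanged)` is true, while a file with a new chunking configuration such as `rechunked` is imported again.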
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
- CHUNK_SIZE: Optional: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Optional: Number of tokens that overlap between chunks.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import
Request JSON body:
{
"import_rag_files_config": {
"gcs_source": {
"uris": "GCS_URIS"
},
"rag_file_chunking_config": {
"chunk_size": CHUNK_SIZE,
"chunk_overlap": CHUNK_OVERLAP
}
}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content
A successful response returns the ImportRagFilesOperationMetadata resource.
The following sample demonstrates how to import a file from
Cloud Storage. Use the max_embedding_requests_per_min control field
to limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles indexing process. The field has a default value of 1000 calls
per minute.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
CHUNK_SIZE: Number of tokens each chunk should have.
CHUNK_OVERLAP: Number of tokens that overlap between chunks.
EMBEDDING_MODEL_QPM_RATE: The QPM rate that limits RAG Engine's access to your embedding model. Example: 1000.
# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
"import_rag_files_config": {
"gcs_source": {
"uris": "GCS_URIS"
},
"rag_file_chunking_config": {
"chunk_size": CHUNK_SIZE,
"chunk_overlap": CHUNK_OVERLAP
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
# Poll the operation status.
# The response contains the number of files imported.
# OPERATION_ID: The operation ID you get from the response of the previous command.
poll_op_wait OPERATION_ID
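The `poll_op_wait` helper used above isn't defined in this section. The following is a minimal sketch of what such a polling loop could look like, assuming the standard long-running operation shape with `done`, `error`, and `response` fields. The function name `poll_operation`, the fake operation states, and the sample response values are hypothetical.

```python
import time

def poll_operation(get_operation, interval_seconds=1.0, max_attempts=10,
                   sleep=time.sleep):
    """Call get_operation until the operation is done, then return its response."""
    for _ in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(operation["error"])
            return operation.get("response", {})
        sleep(interval_seconds)
    raise TimeoutError("operation did not finish within max_attempts polls")

# Hypothetical operation states standing in for repeated GET calls.
fake_states = iter([
    {"name": "operations/123"},
    {"name": "operations/123", "done": True,
     "response": {"skipped_rag_files_count": 0}},
])
result = poll_operation(lambda: next(fake_states), sleep=lambda seconds: None)
```

In practice, `get_operation` would fetch the operation resource by using OPERATION_ID, and the polling interval would be chosen to stay within your request quotas.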
The following sample demonstrates how to import a file from
Drive. Use the max_embedding_requests_per_min control field to
limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles indexing process. The field has a default value of 1000 calls
per minute.
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The corpus ID of your RAG corpus.
FOLDER_RESOURCE_ID: The resource ID of your Google Drive folder.
CHUNK_SIZE: Number of tokens each chunk should have.
CHUNK_OVERLAP: Number of tokens that overlap between adjacent chunks.
EMBEDDING_MODEL_QPM_RATE: The queries-per-minute (QPM) rate that limits RAG Engine's access to your embedding model. Example: 1000.
// ImportRagFiles
// Import all files in a Google Drive folder.
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
"import_rag_files_config": {
"google_drive_source": {
"resource_ids": {
"resource_id": "FOLDER_RESOURCE_ID",
"resource_type": "RESOURCE_TYPE_FOLDER"
}
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
// Poll the operation status.
// The response contains the number of files imported.
OPERATION_ID: The operation ID you get from the response of the previous command.
poll_op_wait OPERATION_ID
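The poll_op_wait step is not defined on this page. As a rough sketch of what such a polling step might look like, the following loop repeatedly fetches an operation until its done field is true. The function name poll_until_done, its parameters, and the injected get_operation callable are all hypothetical; in practice get_operation would wrap an authenticated GET on the operation resource.

```python
import time


def poll_until_done(get_operation, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call get_operation() until the returned dict reports "done": True.

    get_operation is a caller-supplied function (an assumption of this sketch)
    that returns the decoded JSON of the long-running operation.
    """
    waited = 0.0
    while True:
        operation = get_operation()
        if operation.get("done"):
            return operation
        if waited >= timeout_s:
            raise TimeoutError("operation did not finish within the timeout")
        sleep(interval_s)
        waited += interval_s


# Stand-in for real HTTP calls: the operation reports done on the third poll.
_responses = iter([{"done": False}, {"done": False}, {"done": True, "response": {}}])
result = poll_until_done(lambda: next(_responses), interval_s=0.0)
print(result["done"])
```

Injecting the fetch and sleep functions keeps the loop testable without network access.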
List RAG files example
This code sample demonstrates how to list RAG files.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- PAGE_SIZE: The standard list page size. You may adjust the number of RagFiles to return per page by updating the page_size parameter.
- PAGE_TOKEN: The standard list page token. Obtained typically using ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
A successful response returns a list of RagFiles under the given RAG_CORPUS_ID.
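Because PAGE_TOKEN comes from the previous response, listing every file means looping until no further token is returned. The following is a minimal sketch of that loop. The function iter_rag_files and the fetch_page callable are hypothetical, and the response keys ragFiles and nextPageToken are assumed JSON renderings of the rag_files and next_page_token fields.

```python
def iter_rag_files(fetch_page, page_size=100):
    """Yield every RagFile by following page tokens (hypothetical helper).

    fetch_page(page_size, page_token) stands in for the GET request and
    returns the decoded JSON response as a dict.
    """
    page_token = ""
    while True:
        response = fetch_page(page_size, page_token)
        # Assumed response keys: "ragFiles" and "nextPageToken".
        yield from response.get("ragFiles", [])
        page_token = response.get("nextPageToken", "")
        if not page_token:
            break


# Fake two-page response used only to exercise the loop.
_pages = {
    "": {"ragFiles": [{"name": "file-1"}, {"name": "file-2"}], "nextPageToken": "t1"},
    "t1": {"ragFiles": [{"name": "file-3"}]},
}
names = [f["name"] for f in iter_rag_files(lambda size, token: _pages[token])]
print(names)
```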
Get a RAG file example
This code sample demonstrates how to get a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
A successful response returns the RagFile resource.
Delete a RAG file example
This code sample demonstrates how to delete a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
A successful response returns the DeleteOperationMetadata resource.
Batch create metadata example
This code sample demonstrates how to batch create metadata for a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
- METADATA_KEY_1: The key for the first metadata entry.
- VALUE_TYPE_1: The value type field for the first metadata entry (e.g., int_value).
- METADATA_VALUE_1: The value for the first metadata entry.
- METADATA_KEY_2: The key for the second metadata entry.
- VALUE_TYPE_2: The value type field for the second metadata entry (e.g., str_value).
- METADATA_VALUE_2: The value for the second metadata entry.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchCreate
Request JSON body:
{
"requests": [
{
"rag_metadata": {
"user_specified_metadata": {
"key": "METADATA_KEY_1",
"value": { "VALUE_TYPE_1": METADATA_VALUE_1 }
}
}
},
{
"rag_metadata": {
"user_specified_metadata": {
"key": "METADATA_KEY_2",
"value": { "VALUE_TYPE_2": "METADATA_VALUE_2" }
}
}
}
]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchCreate"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchCreate" | Select-Object -Expand Content
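When there are many metadata entries, the batch request body can be generated instead of written by hand. The following is a minimal sketch that mirrors the JSON structure shown above. The helper name build_batch_create_body and the sample keys page_count and department are hypothetical.

```python
import json


def build_batch_create_body(entries):
    """Build a ragMetadata:batchCreate body (hypothetical helper).

    entries is an iterable of (key, value_type, value) tuples, where
    value_type is a value type field name such as "int_value" or "str_value".
    """
    return {
        "requests": [
            {
                "rag_metadata": {
                    "user_specified_metadata": {
                        "key": key,
                        "value": {value_type: value},
                    }
                }
            }
            for key, value_type, value in entries
        ]
    }


# Illustrative entries only.
body = build_batch_create_body([
    ("page_count", "int_value", 12),
    ("department", "str_value", "finance"),
])
print(json.dumps(body, indent=2))
```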
List metadata example
This code sample demonstrates how to list metadata for a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata"
PowerShell
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata" | Select-Object -Expand Content
A successful response returns a list of RagMetadata resources.
Update metadata example
This code sample demonstrates how to update metadata for a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
- METADATA_ID: The ID of the metadata entry to update.
- METADATA_KEY: The key for the metadata entry.
- VALUE_TYPE: The value type field (e.g., int_value).
- METADATA_VALUE: The new value for the metadata entry.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata/METADATA_ID
Request JSON body:
{
"user_specified_metadata": {
"key": "METADATA_KEY",
"value": { "VALUE_TYPE": METADATA_VALUE }
}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata/METADATA_ID"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata/METADATA_ID" | Select-Object -Expand Content
Batch delete metadata example
This code sample demonstrates how to batch delete metadata entries for a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
- METADATA_ID_1: The ID of the first metadata entry to delete.
- METADATA_ID_2: The ID of the second metadata entry to delete.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchDelete
Request JSON body:
{
"names": [
"projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata/METADATA_ID_1",
"projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata/METADATA_ID_2"
]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchDelete"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID/ragMetadata:batchDelete" | Select-Object -Expand Content
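Each entry in names is a full resource name, which is easy to mistype by hand. The following is a minimal sketch that derives the names from the individual IDs, following the format used in the request body above. The helper name build_batch_delete_body and the sample IDs are hypothetical.

```python
def build_batch_delete_body(project_id, location, rag_corpus_id, rag_file_id, metadata_ids):
    """Build a ragMetadata:batchDelete body (hypothetical helper)."""
    parent = (
        f"projects/{project_id}/locations/{location}"
        f"/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"
    )
    # One full resource name per metadata entry to delete.
    return {"names": [f"{parent}/ragMetadata/{metadata_id}" for metadata_id in metadata_ids]}


# Illustrative IDs only.
body = build_batch_delete_body("my-project", "us-central1", "123", "456", ["m1", "m2"])
print(body["names"])
```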
Retrieval query example
When a user asks a question or provides a prompt, the retrieval component in RAG searches through its knowledge base to find information that is relevant to the query.
REST
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request.
- PROJECT_ID: Your project ID.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- TOP_K: The number of top contexts to retrieve.
- VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
- METADATA_FILTER: Optional: The metadata filter to apply during retrieval.
- TEXT: The query text to get relevant contexts.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts
Request JSON body:
{
"vertex_rag_store": {
"rag_resources": [
{
"rag_corpus": "RAG_CORPUS_RESOURCE"
}
]
},
"query": {
"text": "TEXT",
"rag_retrieval_config": {
"top_k": TOP_K,
"filter": {
"vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD,
"metadata_filter": "METADATA_FILTER"
}
}
}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content
A successful response returns a list of related RagFiles.
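Because METADATA_FILTER is optional, the request body differs depending on whether a filter is supplied. The following is a minimal sketch of one way to build the body and leave the filter out when it is not set. The helper name build_retrieve_contexts_body and the sample values are hypothetical, and omitting the key (rather than sending an empty string) is an assumption of this sketch.

```python
import json


def build_retrieve_contexts_body(rag_corpus, text, top_k,
                                 vector_distance_threshold, metadata_filter=None):
    """Build a retrieveContexts request body (hypothetical helper)."""
    filter_config = {"vector_distance_threshold": vector_distance_threshold}
    if metadata_filter:
        # Assumption: an unset optional filter is omitted from the body.
        filter_config["metadata_filter"] = metadata_filter
    return {
        "vertex_rag_store": {"rag_resources": [{"rag_corpus": rag_corpus}]},
        "query": {
            "text": text,
            "rag_retrieval_config": {"top_k": top_k, "filter": filter_config},
        },
    }


# Illustrative values only.
body = build_retrieve_contexts_body(
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/1234",
    text="What is RAG?",
    top_k=5,
    vector_distance_threshold=0.5,
)
print(json.dumps(body, indent=2))
```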
Generation example
The LLM generates a grounded response using the retrieved contexts.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. Example: gemini-2.5-flash
- GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent
- INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded RAG files.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- TOP_K: Optional: The number of top contexts to retrieve.
- VECTOR_DISTANCE_THRESHOLD: Optional: Contexts with a vector distance smaller than the threshold are returned.
- METADATA_FILTER: Optional: The metadata filter to apply during retrieval.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD
Request JSON body:
{
"contents": {
"role": "user",
"parts": {
"text": "INPUT_PROMPT"
}
},
"tools": {
"retrieval": {
"disable_attribution": false,
"vertex_rag_store": {
"rag_resources": [
{
"rag_corpus": "RAG_CORPUS_RESOURCE"
}
],
"rag_retrieval_config": {
"top_k": TOP_K,
"filter": {
"vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD,
"metadata_filter": "METADATA_FILTER"
}
}
}
}
}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json,
and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"
PowerShell
Save the request body in a file named request.json,
and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content
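The endpoint URL changes with both MODEL_ID and GENERATION_METHOD. The following is a minimal sketch that assembles the URL from the pattern shown above and rejects any method other than the two listed options. The helper name build_generation_url and the sample project and region are hypothetical.

```python
ALLOWED_METHODS = ("generateContent", "streamGenerateContent")


def build_generation_url(project_id, location, model_id, method="generateContent"):
    """Assemble the generation endpoint URL (hypothetical helper)."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {ALLOWED_METHODS}")
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:{method}"
    )


# Illustrative values only.
url = build_generation_url("my-project", "us-central1", "gemini-2.5-flash")
print(url)
```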
Project management examples
The deployment mode and tier are project-level settings available under the RagEngineConfig
resource, and they affect RAG corpora that use RagManagedDb. To get the current
configuration, use GetRagEngineConfig. To update the configuration,
use UpdateRagEngineConfig.
For more information on managing your mode and tier configuration, see Deployment modes in RAG Engine.
Read your current RagEngineConfig
The following code samples demonstrate how to read your RagEngineConfig to see which mode and tier are currently selected:
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
- Click Configure RAG Engine. The Configure RAG Engine pane appears. You can see the tier that's selected for your RAG Engine.
- Click Cancel.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config = rag.rag_data.get_rag_engine_config(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
)
print(rag_engine_config)
Switch to Serverless mode
The following code samples demonstrate how to switch your RagEngineConfig to the Serverless mode:
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your Vertex AI RAG Engine is running.
- Click the Switch to Serverless button. This button might not be visible if you are already in Serverless mode. You can verify your current mode from the mode label in the top-right section of the page.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig \
-d '{"ragManagedDbConfig": {"serverless": {}}}'
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(mode=rag.Serverless()),
)
updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)
print(updated_rag_engine_config)
Switch to Spanner mode
The following code samples demonstrate how to switch your RagEngineConfig to the Spanner mode. If you have previously used Spanner mode and chosen a tier, you don't need to provide the tier again when switching. Otherwise, see the examples later in this section that switch to Spanner mode while providing a tier.
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your Vertex AI RAG Engine is running.
- Click the Switch to Spanner button. This button might not be visible if you are already in Spanner mode. You can verify your current mode from the mode label in the top-right section of the page.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig \
-d '{"ragManagedDbConfig": {"spanner": {}}}'
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(mode=rag.Spanner()),
)
updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)
print(updated_rag_engine_config)
Update your RagEngineConfig to Spanner mode with Scaled tier
The following code samples demonstrate how to set the RagEngineConfig to the Spanner mode with Scaled tier:
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your Vertex AI RAG Engine is running.
- If you aren't already in Spanner mode, click the Switch to Spanner button.
- Click Configure RAG Engine. The Configure RAG Engine pane appears.
- Select the tier on which you want to run your RAG Engine.
- Click Save.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig \
-d '{"ragManagedDbConfig": {"spanner": {"scaled": {}}}}'
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(mode=rag.Spanner(tier=rag.Scaled())),
)
updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)
print(updated_rag_engine_config)
Update your RagEngineConfig to Spanner mode with Basic tier
The following code samples demonstrate how to set the RagEngineConfig to the Spanner mode with Basic tier:
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your Vertex AI RAG Engine is running.
- If you aren't already in Spanner mode, click the Switch to Spanner button.
- Click Configure RAG Engine. The Configure RAG Engine pane appears.
- Select the tier on which you want to run your RAG Engine.
- Click Save.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig \
-d '{"ragManagedDbConfig": {"spanner": {"basic": {}}}}'
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(mode=rag.Spanner(tier=rag.Basic())),
)
updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)
print(updated_rag_engine_config)
Update your RagEngineConfig to Unprovisioned tier
The following code samples demonstrate how to set the RagEngineConfig to the Spanner mode with Unprovisioned tier. This permanently deletes all data from your Spanner deployment and stops the billing charges that arise from it.
Console
- In the Google Cloud console, go to the RAG Engine page.
- Select the region in which your Vertex AI RAG Engine is running.
- If you aren't already in Spanner mode, click the Switch to Spanner button.
- Click Delete RAG Engine. A confirmation dialog appears.
- Verify that you're about to delete your data in Vertex AI RAG Engine by typing delete, then click Confirm.
- Click Save.
REST
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragEngineConfig \
-d '{"ragManagedDbConfig": {"spanner": {"unprovisioned": {}}}}'
Python
from vertexai.preview import rag
import vertexai
# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(mode=rag.Spanner(tier=rag.Unprovisioned())),
)
updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)
print(updated_rag_engine_config)
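The REST bodies in the preceding sections differ only in the mode and tier keys. The following is a minimal sketch, using only the Python standard library, that generates the JSON body for any of the combinations shown on this page. The helper name build_rag_engine_config_body is hypothetical, and rejecting a tier for Serverless mode is an assumption based on the examples, which show tiers only under Spanner mode.

```python
import json

SPANNER_TIERS = ("scaled", "basic", "unprovisioned")


def build_rag_engine_config_body(mode, tier=None):
    """Return the PATCH body for ragEngineConfig as a dict (hypothetical helper)."""
    if mode == "serverless":
        if tier is not None:
            # Assumption: tiers are only shown under Spanner mode on this page.
            raise ValueError("a tier can only be set for Spanner mode")
        return {"ragManagedDbConfig": {"serverless": {}}}
    if mode == "spanner":
        if tier is None:
            # Switch to Spanner mode without naming a tier.
            return {"ragManagedDbConfig": {"spanner": {}}}
        if tier not in SPANNER_TIERS:
            raise ValueError(f"tier must be one of {SPANNER_TIERS}")
        return {"ragManagedDbConfig": {"spanner": {tier: {}}}}
    raise ValueError("mode must be 'serverless' or 'spanner'")


payload = json.dumps(build_rag_engine_config_body("spanner", "scaled"))
print(payload)
```

Serializing with json.dumps produces double-quoted JSON that can be passed to curl with the -d flag.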
What's next
- To learn more about supported generation models, see Generative AI models that support RAG.
- To learn more about supported embedding models, see Embedding models.
- To learn more about open models, see Open models.
- To learn more about RAG Engine, see RAG Engine overview.