Vertex AI documentation is no longer being updated

Vertex AI's services are now part of Gemini Enterprise Agent Platform. See the most up-to-date information in the Agent Platform documentation.

Managing Spanner mode

In Spanner deployment mode, Vertex AI RAG Engine uses RagManagedDb, an enterprise-ready, fully managed Google Cloud Spanner instance, for resource storage. You can optionally also use RagManagedDb as the vector database for your RAG corpora.

Through Spanner, Vertex AI RAG Engine offers a dedicated database that is consistent, highly available, and highly scalable to support your application. To learn more about Google Cloud Spanner, see Spanner.

Data storage and vector search

Vertex AI RAG Engine stores your RAG corpus and RAG file resource metadata in RagManagedDb, regardless of your choice of vector database. Vector databases are used only to store and retrieve embeddings. In addition to resource storage, RagManagedDb can also store and manage the vector representations of your documents, in which case it acts as the vector database. The vector database then retrieves relevant documents based on their semantic similarity to a given query.
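As an illustration of retrieval by semantic similarity, the following minimal sketch ranks documents by the cosine similarity between a query embedding and each stored embedding. The similarity metric, the function names, and the toy three-dimensional vectors are assumptions made for this example. The sketch scans every embedding and doesn't reflect how RagManagedDb or any other vector database indexes or searches data.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k=2):
    """Return the k document IDs whose embeddings are most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine_similarity(query, embeddings[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings; real embeddings have far more dimensions.
EMBEDDINGS = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    print(top_k([1.0, 0.0, 0.0], EMBEDDINGS, k=2))
```

In this toy data set, the query points in the same direction as doc-a and nearly the same direction as doc-b, so those two documents rank ahead of doc-c.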

Available tiers

Vertex AI RAG Engine lets you scale your RagManagedDb instance to match your usage and performance requirements by choosing between two tiers. A third tier, Unprovisioned, lets you delete your Vertex AI RAG Engine data.

The tier is a project-level setting in the RagEngineConfig resource, and it affects RAG corpora that use RagManagedDb. The following tiers are available in RagEngineConfig:

Scaled tier: This tier offers production-scale performance along with autoscaling functionality. It's suitable for customers with large amounts of data or performance-sensitive workloads. Internally, this tier sets the Spanner instance to autoscaling configuration with a minimum of 1 node (1,000 processing units) and a maximum of 10 nodes (10,000 processing units).
Basic tier (default): This tier offers a cost-effective, low-compute configuration, which might be suitable for cases such as the following:
- Experimenting with RagManagedDb
- Small data sizes
- Latency-insensitive workloads
- Using Vertex AI RAG Engine with only other vector databases

To offer the Basic tier, RagManagedDb sets the underlying Spanner instance to a fixed configuration of 100 processing units, which is equivalent to 0.1 nodes.

Unprovisioned tier: This tier deletes the RagManagedDb and its underlying Spanner instance. The Unprovisioned tier disables the Vertex AI RAG Engine service and deletes the data held within the service, regardless of the vector database used for your RAG corpora. Switching to this tier stops billing for the service. For more information on billing, see Vertex AI RAG Engine billing.

After the data is deleted, it can't be recovered. To start using Vertex AI RAG Engine again, you must update the tier by calling the UpdateRagEngineConfig API or switch the mode to Serverless.
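The capacity figures for the three tiers can be summarized in a small lookup table. The following sketch only restates the numbers given above, using the conversion of 1 node to 1,000 processing units. The names TierCapacity, TIERS, and node_range are hypothetical and aren't part of any Vertex AI API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PROCESSING_UNITS_PER_NODE = 1000  # 1 node = 1,000 processing units


@dataclass(frozen=True)
class TierCapacity:
    """Spanner capacity behind a tier; None means no instance exists."""
    min_processing_units: Optional[int]
    max_processing_units: Optional[int]
    autoscaling: bool


# Hypothetical lookup table built from the tier descriptions in this page.
TIERS = {
    "scaled": TierCapacity(1000, 10000, autoscaling=True),
    "basic": TierCapacity(100, 100, autoscaling=False),
    "unprovisioned": TierCapacity(None, None, autoscaling=False),
}


def to_nodes(processing_units: int) -> float:
    """Convert processing units to the equivalent number of nodes."""
    return processing_units / PROCESSING_UNITS_PER_NODE


def node_range(tier: str) -> Optional[Tuple[float, float]]:
    """Return (min_nodes, max_nodes) for a tier, or None if it has no instance."""
    capacity = TIERS[tier]
    if capacity.min_processing_units is None or capacity.max_processing_units is None:
        return None
    return (
        to_nodes(capacity.min_processing_units),
        to_nodes(capacity.max_processing_units),
    )


if __name__ == "__main__":
    for name in TIERS:
        print(name, node_range(name))
```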

Managing tiers

To read and update your tier, use the GetRagEngineConfig and UpdateRagEngineConfig APIs. See the Switching between modes page for code samples that use these APIs.
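The read-then-update flow can be sketched without calling the real service. In the following example, RagEngineConfigClient, InMemoryClient, get_tier, update_tier, and ensure_tier are hypothetical stand-ins for a client that wraps GetRagEngineConfig and UpdateRagEngineConfig. They aren't part of the Vertex AI SDK, and the in-memory client makes no network calls. The sketch reads the current tier, skips the update when nothing would change, and requires explicit confirmation before moving to the Unprovisioned tier, because that change deletes data that can't be recovered.

```python
from typing import Protocol

VALID_TIERS = {"scaled", "basic", "unprovisioned"}


class RagEngineConfigClient(Protocol):
    """Hypothetical interface over GetRagEngineConfig and UpdateRagEngineConfig."""

    def get_tier(self) -> str: ...

    def update_tier(self, tier: str) -> None: ...


class InMemoryClient:
    """Fake client for local experimentation; it never contacts Vertex AI."""

    def __init__(self, tier: str = "basic") -> None:
        self.tier = tier
        self.update_calls = 0

    def get_tier(self) -> str:
        return self.tier

    def update_tier(self, tier: str) -> None:
        self.tier = tier
        self.update_calls += 1


def ensure_tier(
    client: RagEngineConfigClient,
    desired: str,
    allow_data_deletion: bool = False,
) -> bool:
    """Switch to the desired tier if needed; return True if an update was issued."""
    if desired not in VALID_TIERS:
        raise ValueError(f"unknown tier: {desired!r}")
    if client.get_tier() == desired:
        return False  # Already on the desired tier, so no update is needed.
    if desired == "unprovisioned" and not allow_data_deletion:
        # The Unprovisioned tier deletes data irreversibly, so require opt-in.
        raise RuntimeError("pass allow_data_deletion=True to unprovision")
    client.update_tier(desired)
    return True


if __name__ == "__main__":
    client = InMemoryClient()
    print(ensure_tier(client, "scaled"), client.get_tier())
```

For the actual request and SDK calls, rely on the code samples on the Switching between modes page rather than on this sketch.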