Generative AI with RAG

Last reviewed 2025-09-22 UTC

Use the following architecture guides to design and deploy generative AI applications with retrieval-augmented generation (RAG) in Google Cloud.

Architecture guide | Description
RAG infrastructure for generative AI using Gemini Enterprise and Agent Platform | An agent-driven architecture that uses Gemini Enterprise as a unified platform to orchestrate an end-to-end RAG dataflow for enterprise applications that require real-time data availability and enriched contextual search.
RAG infrastructure for generative AI using Agent Platform and Vector Search | A fully managed, serverless architecture that provides optimized, high-performance vector search for large-scale applications.
RAG infrastructure for generative AI using Agent Platform and AlloyDB for PostgreSQL | An architecture that stores vector embeddings alongside your operational data in a fully managed database like AlloyDB for PostgreSQL.
RAG infrastructure for generative AI using GKE and Cloud SQL | A flexible, container-based architecture that provides maximum control to build custom applications with open source tools such as Ray, Hugging Face, and LangChain.
GraphRAG infrastructure for generative AI using Agent Platform and Spanner Graph | An advanced RAG architecture that combines vector search with knowledge graph queries to retrieve interconnected, contextual data, which results in more detailed and relevant generative AI responses.
Harness CI/CD pipeline for RAG applications | An architecture for a continuous integration (CI) and continuous deployment (CD) pipeline for a RAG application in Google Cloud.
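
The guides above differ in where embeddings are stored and how they are queried, but they share a common retrieval step: compare a query embedding against stored embeddings, then pass the best-matching text to the model as context. The following is a minimal sketch of that step in plain Python and is not taken from any of the guides. The in-memory `STORE`, the three-dimensional toy embeddings, and the function names `cosine_similarity`, `retrieve`, and `build_prompt` are all hypothetical; a real deployment would presumably use an embedding model and one of the vector stores listed above.

```python
import math

# Hypothetical in-memory store with toy 3-dimensional embeddings.
# In practice the vectors would come from an embedding model.
STORE = [
    {"id": "a", "text": "Notes on vector databases.", "embedding": [0.9, 0.1, 0.0]},
    {"id": "b", "text": "Notes on container orchestration.", "embedding": [0.0, 0.2, 0.9]},
    {"id": "c", "text": "Notes on nearest-neighbor search.", "embedding": [0.8, 0.3, 0.1]},
]


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, store, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, docs):
    """Assemble a prompt that grounds the question in the retrieved text."""
    context = "\n".join(f"- {doc['text']}" for doc in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Toy query embedding, assumed to lie close to documents "a" and "c".
    top_docs = retrieve([1.0, 0.2, 0.0], STORE, k=2)
    print(build_prompt("How are embeddings searched?", top_docs))
```

The brute-force scan over every stored vector is only practical for small collections; the managed services in the guides exist largely to replace that scan with indexed, approximate nearest-neighbor search.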