This document in the Architecture Center lists architecture guides that help you build and deploy generative AI workloads in Google Cloud.
To learn how to set up, deploy, evaluate, and operate generative AI applications that are built on foundation models, see Deploy and operate generative AI applications.
## High-level architectures
The following guides provide high-level architectures for specific business and technical use cases of generative AI (a minimal code sketch of the shared generation pattern follows the table):
| Guide | Description |
|---|---|
| Generate personalized marketing campaigns | Generate media assets for personalized marketing campaigns. |
| Generate personalized product recommendations | Generate personalized product recommendations based on user preferences for retail applications. |
| Generate podcasts from audio files | Generate podcasts based on audio files, such as live commentary from a sports event. |
| Generate solutions for customer support requests | Generate responses to customer questions, such as technology support requests. |
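The use cases in the preceding table share the same basic pattern: send a prompt to a foundation model in Vertex AI and use the generated text or media in the application. The following minimal sketch is not taken from any of the linked guides; it assumes that the Vertex AI SDK for Python is installed and authenticated, and that the placeholder project ID and the `gemini-1.5-flash` model name are replaced with your own values.

```python
# Minimal sketch: generate a draft customer-support response with a
# Vertex AI foundation model.
# Assumptions: the Vertex AI SDK for Python (google-cloud-aiplatform) is
# installed and authenticated; "YOUR_PROJECT_ID" and the model name are
# placeholders to replace with your own values.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")

prompt = (
    "You are a support assistant for a consumer electronics retailer.\n"
    "Draft a short, friendly reply to this customer question:\n"
    "'My smart speaker won't connect to Wi-Fi after a firmware update.'"
)

response = model.generate_content(prompt)
print(response.text)
```

The same call pattern applies whether the output is a support reply, marketing copy, a product recommendation, or a podcast script; the guides differ mainly in how they source the input data and where they deliver the generated content.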
## Reference architectures
The following guides provide detailed architecture examples and design recommendations for deploying generative AI workloads and infrastructure for specific use cases (a minimal sketch of the common RAG retrieval pattern follows the table):
| Guide | Description |
|---|---|
| Automate utilization review of insurance claims | Improve the prior authorization (PA) and utilization review (UR) process for health insurance claims. |
| RAG infrastructure using Gemini Enterprise and Vertex AI | Orchestrate an agentic RAG workflow with real-time data availability and enriched contextual search. |
| RAG infrastructure using Vertex AI and Vector Search | Provide optimized, high-performance vector search for large-scale applications. |
| RAG infrastructure using Vertex AI and AlloyDB for PostgreSQL | Store vector embeddings alongside operational data in a fully managed AlloyDB for PostgreSQL database. |
| RAG infrastructure using Vertex AI and Cloud SQL | Store vector embeddings alongside operational data in a fully managed Cloud SQL database. |
| RAG infrastructure using GKE and Cloud SQL | Build custom RAG applications by using open source tools such as Ray, Hugging Face, and LangChain. |
| GraphRAG infrastructure using Vertex AI and Spanner Graph | Combine vector search with knowledge graph queries for retrieval of interconnected contextual data. |
| Private connectivity for RAG-capable generative AI applications | Secure the network infrastructure for RAG-capable generative AI applications by using Shared VPC. |
| Harness CI/CD pipeline for RAG applications | Set up a continuous integration (CI) and continuous deployment (CD) pipeline for RAG applications. |
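Most of the RAG architectures in the preceding table follow the same retrieval pattern: embed the user query, retrieve semantically similar content from a vector store (such as Vector Search, AlloyDB for PostgreSQL, or Cloud SQL), and pass the retrieved content to the model as grounding context. The following sketch illustrates that pattern only; it is not the implementation used in the guides. It assumes the Vertex AI SDK for Python, and it uses an in-memory list of placeholder document chunks in place of a managed vector store.

```python
# Minimal RAG sketch: embed the query, retrieve the closest chunk, and
# ground the prompt with it.
# Assumptions: Vertex AI SDK installed and authenticated; an in-memory list
# stands in for a managed vector store such as Vector Search, AlloyDB for
# PostgreSQL, or Cloud SQL; project ID and model names are placeholders.
import numpy as np
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
generative_model = GenerativeModel("gemini-1.5-flash")

# Placeholder document chunks; in the reference architectures these live in
# a vector store and are retrieved with an approximate nearest-neighbor query.
chunks = [
    "Refunds are processed within 5 business days of receiving the returned item.",
    "Premium support is available 24/7 for enterprise customers.",
    "Firmware updates are released on the first Tuesday of each month.",
]

def embed(texts):
    """Return one embedding vector per input text."""
    return [np.array(e.values) for e in embedding_model.get_embeddings(texts)]

chunk_vectors = embed(chunks)

query = "How long does a refund take?"
query_vector = embed([query])[0]

# Cosine similarity against every chunk; keep the best match as grounding context.
scores = [
    float(np.dot(query_vector, v) / (np.linalg.norm(query_vector) * np.linalg.norm(v)))
    for v in chunk_vectors
]
context = chunks[int(np.argmax(scores))]

prompt = (
    "Answer the question using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {query}"
)
print(generative_model.generate_content(prompt).text)
```

The guides in the table replace the in-memory lookup with a managed vector store, add ingestion and serving infrastructure, and cover concerns such as private connectivity and CI/CD for the resulting application.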