This document in the Architecture Center lists architecture guides that help you build and deploy generative AI workloads in Google Cloud.
To learn how to set up, deploy, evaluate, and operate generative AI applications that are built on foundation models, see Deploy and operate generative AI applications.
## High-level architectures
The following guides provide high-level architectures for specific business and technical use cases of generative AI (a minimal code sketch of the shared generation pattern follows the table):
| Guide | Description |
|---|---|
| Generate personalized marketing campaigns | Generate media assets for personalized marketing campaigns. |
| Generate personalized product recommendations | Generate personalized product recommendations based on user preferences for retail applications. |
| Generate podcasts from audio files | Generate podcasts based on audio files, such as live commentary from a sports event. |
| Generate solutions for customer support requests | Generate responses to customer questions, such as technology support requests. |
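The use cases in the preceding table share the same basic pattern: send a prompt to a foundation model in Vertex AI and use the generated text or media in the application. The following minimal sketch is not taken from any of the linked guides; it assumes that the Vertex AI SDK for Python is installed and authenticated, and that the placeholder project ID and the `gemini-1.5-flash` model name are replaced with your own values.

```python
# Minimal sketch: generate a draft customer-support response with a
# Vertex AI foundation model.
# Assumptions: the Vertex AI SDK for Python (google-cloud-aiplatform) is
# installed and authenticated; "YOUR_PROJECT_ID" and the model name are
# placeholders to replace with your own values.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")

prompt = (
    "You are a support assistant for a consumer electronics retailer.\n"
    "Draft a short, friendly reply to this customer question:\n"
    "'My smart speaker won't connect to Wi-Fi after a firmware update.'"
)

response = model.generate_content(prompt)
print(response.text)
```

The same call pattern applies whether the output is a support reply, marketing copy, a product recommendation, or a podcast script; the guides differ mainly in how they source the input data and where they deliver the generated content.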
## Reference architectures
The following guides provide detailed architecture examples and design recommendations for deploying generative AI workloads and infrastructure for specific use cases (a minimal sketch of the common RAG retrieval pattern follows the table):
| Guide | Description |
|---|---|
| Automate utilization review of insurance claims | Improve the prior authorization (PA) and utilization review (UR) process for health insurance claims. |
| RAG infrastructure using Gemini Enterprise and Vertex AI | Orchestrate an agentic RAG workflow with real-time data availability and enriched contextual search. |
| RAG infrastructure using Vertex AI and Vector Search | Provide optimized, high-performance vector search for large-scale applications. |
| RAG infrastructure using Vertex AI and AlloyDB for PostgreSQL | Store vector embeddings alongside operational data in a fully managed AlloyDB for PostgreSQL database. |
| RAG infrastructure using Vertex AI and Cloud SQL | Store vector embeddings alongside operational data in a fully managed Cloud SQL database. |
| RAG infrastructure using GKE and Cloud SQL | Build custom RAG applications by using open source tools such as Ray, Hugging Face, and LangChain. |
| GraphRAG infrastructure using Vertex AI and Spanner Graph | Combine vector search with knowledge graph queries for retrieval of interconnected contextual data. |
| Private connectivity for RAG-capable generative AI applications | Secure the network infrastructure for RAG-capable generative AI applications by using Shared VPC. |
| Harness CI/CD pipeline for RAG applications | Set up a continuous integration (CI) and continuous deployment (CD) pipeline for RAG applications. |
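Most of the RAG architectures in the preceding table follow the same retrieval pattern: embed the user query, retrieve semantically similar content from a vector store (such as Vector Search, AlloyDB for PostgreSQL, or Cloud SQL), and pass the retrieved content to the model as grounding context. The following sketch illustrates that pattern only; it is not the implementation used in the guides. It assumes the Vertex AI SDK for Python, and it uses an in-memory list of placeholder document chunks in place of a managed vector store.

```python
# Minimal RAG sketch: embed the query, retrieve the closest chunk, and
# ground the prompt with it.
# Assumptions: Vertex AI SDK installed and authenticated; an in-memory list
# stands in for a managed vector store such as Vector Search, AlloyDB for
# PostgreSQL, or Cloud SQL; project ID and model names are placeholders.
import numpy as np
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
generative_model = GenerativeModel("gemini-1.5-flash")

# Placeholder document chunks; in the reference architectures these live in
# a vector store and are retrieved with an approximate nearest-neighbor query.
chunks = [
    "Refunds are processed within 5 business days of receiving the returned item.",
    "Premium support is available 24/7 for enterprise customers.",
    "Firmware updates are released on the first Tuesday of each month.",
]

def embed(texts):
    """Return one embedding vector per input text."""
    return [np.array(e.values) for e in embedding_model.get_embeddings(texts)]

chunk_vectors = embed(chunks)

query = "How long does a refund take?"
query_vector = embed([query])[0]

# Cosine similarity against every chunk; keep the best match as grounding context.
scores = [
    float(np.dot(query_vector, v) / (np.linalg.norm(query_vector) * np.linalg.norm(v)))
    for v in chunk_vectors
]
context = chunks[int(np.argmax(scores))]

prompt = (
    "Answer the question using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {query}"
)
print(generative_model.generate_content(prompt).text)
```

The guides in the table replace the in-memory lookup with a managed vector store, add ingestion and serving infrastructure, and cover concerns such as private connectivity and CI/CD for the resulting application.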