Generative AI use case: Generate personalized product recommendations

Last reviewed 2025-12-15 UTC

This document describes a high-level architecture for using AI to generate personalized product recommendations for a retail application in Google Cloud.

The intended audience for this document includes architects, developers, and administrators who build and manage generative AI applications in the cloud for the retail industry. The document assumes that you have a foundational understanding of generative AI.

Architecture

The following diagram shows an architecture that uses an AI model to generate personalized product recommendations based on insights from clickstream metrics.

Architecture for using AI to generate personalized product recommendations.

The architecture shows the following flows:

Ingest and process user data:
1. Clickstream data, such as page views, clicks, and purchases, is uploaded to a Dataflow pipeline.
2. Dataflow processes the data and derives insights such as user profiles and preferences. Dataflow then stores the data, insights, and vector embeddings in BigQuery.
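As a rough illustration of the kind of insight derivation that step 2 describes, the following sketch aggregates clickstream events into per-user category preferences. It is plain Python, not Dataflow pipeline code, and everything in it is an assumption made for illustration: the event field names (user_id, event_type, category), the EVENT_WEIGHTS values, and the derive_profiles function are hypothetical and don't come from the architecture itself.

```python
from collections import defaultdict

# Hypothetical weights: a purchase is assumed to signal stronger interest
# than a click, and a click stronger interest than a page view.
EVENT_WEIGHTS = {"page_view": 1.0, "click": 2.0, "purchase": 5.0}


def derive_profiles(events):
    """Aggregate clickstream events into per-user category preferences.

    Each event is a dict with the assumed keys "user_id", "event_type",
    and "category". Returns {user_id: {category: normalized_score}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        # Unknown event types contribute nothing to the score.
        weight = EVENT_WEIGHTS.get(event["event_type"], 0.0)
        totals[event["user_id"]][event["category"]] += weight

    profiles = {}
    for user_id, scores in totals.items():
        total = sum(scores.values())
        # Normalize so that each user's scores sum to 1.
        # Users with no weighted events are skipped.
        if total > 0:
            profiles[user_id] = {c: s / total for c, s in scores.items()}
    return profiles


sample_events = [
    {"user_id": "u1", "event_type": "page_view", "category": "shoes"},
    {"user_id": "u1", "event_type": "click", "category": "shoes"},
    {"user_id": "u1", "event_type": "purchase", "category": "jackets"},
    {"user_id": "u2", "event_type": "page_view", "category": "hats"},
]
profiles = derive_profiles(sample_events)
```

In a real pipeline, equivalent logic would run as a transform inside Dataflow, and the resulting profiles would be written to BigQuery along with the raw data and embeddings.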
Generate and serve product recommendations:
1. A customer visits the company's storefront, which is a Cloud Run service in this architecture.
2. The storefront service sends the visitor's data to a recommender service that runs on Cloud Run.
3. The recommender service performs a vector similarity search in BigQuery and retrieves data about the visitor's profile and preferences.
4. The recommender service sends the visitor's profile and preferences data to the Gemini API in Vertex AI, with a prompt to generate product recommendations. Gemini generates product recommendations that are tailored for the visitor.
5. The recommender service sends the product recommendations to the storefront service, which then displays the recommendations.
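The retrieval and prompting steps in the preceding flow can be sketched in plain Python. This is a minimal, in-memory stand-in: in the architecture, the similarity search runs in BigQuery and the prompt is sent to Gemini through Vertex AI, and neither call is shown here. The catalog entries, the embedding values, and the function names (cosine_similarity, top_k_similar, build_prompt) are hypothetical and exist only to illustrate the idea.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, catalog, k=2):
    """Return the names of the k catalog items closest to the query vector."""
    ranked = sorted(
        catalog,
        key=lambda item: cosine_similarity(query, item["embedding"]),
        reverse=True,
    )
    return [item["name"] for item in ranked[:k]]


def build_prompt(preferences, candidates):
    """Assemble a text prompt from visitor preferences and candidate products."""
    return (
        "You are a retail assistant. The visitor prefers: "
        + ", ".join(preferences)
        + ". Recommend products from this list and explain each choice briefly: "
        + ", ".join(candidates)
        + "."
    )


# Hypothetical three-dimensional embeddings, chosen for readability.
catalog = [
    {"name": "trail shoes", "embedding": [0.9, 0.1, 0.0]},
    {"name": "rain jacket", "embedding": [0.1, 0.9, 0.0]},
    {"name": "sun hat", "embedding": [0.0, 0.1, 0.9]},
]
visitor_embedding = [0.8, 0.2, 0.0]

candidates = top_k_similar(visitor_embedding, catalog, k=2)
prompt = build_prompt(["shoes", "jackets"], candidates)
# The recommender service would send `prompt` to Gemini at this point.
```

Narrowing the candidates before prompting keeps the prompt short, which is one plausible reason to pair a vector search with a generative model rather than passing the whole catalog to the model.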

To optimize cost and performance, add a cache between the storefront service and the recommender service. The recommender service checks the cache for visitor data. If the cache doesn't contain relevant data, the service performs a vector similarity search in BigQuery. To set up the cache, you can use Memorystore or configure a load balancer with Cloud CDN.
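The cache lookup described above follows a cache-aside pattern, which the following sketch illustrates with an in-process dictionary and a time-to-live. It is a stand-in for a managed cache such as Memorystore, not client code for it. The TTLCache class, the get_visitor_data function, the 300-second TTL, and the fake_bigquery_lookup helper are all hypothetical.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # Injectable so that expiry can be tested.
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry is stale: evict it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def get_visitor_data(visitor_id, cache, lookup):
    """Return cached visitor data, calling lookup only on a cache miss."""
    cached = cache.get(visitor_id)
    if cached is not None:
        return cached
    data = lookup(visitor_id)
    cache.set(visitor_id, data)
    return data


lookup_calls = []


def fake_bigquery_lookup(visitor_id):
    """Hypothetical stand-in for the vector similarity search in BigQuery."""
    lookup_calls.append(visitor_id)
    return {"visitor_id": visitor_id, "preferences": ["shoes"]}


cache = TTLCache(ttl_seconds=300)
first = get_visitor_data("u1", cache, fake_bigquery_lookup)
second = get_visitor_data("u1", cache, fake_bigquery_lookup)
# The second call is served from the cache, so the lookup runs only once.
```

The TTL value is a trade-off: a longer TTL reduces BigQuery queries, while a shorter TTL lets recommendations reflect recent visitor activity sooner.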

Products used

This example architecture uses the following Google Cloud products:

Cloud Run: A serverless compute platform that lets you run containers directly on top of Google's scalable infrastructure.
Vertex AI: An ML platform that lets you train and deploy ML models and AI applications, and customize LLMs for use in AI-powered applications.
BigQuery: An enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.
Dataflow: A service that provides unified stream and batch data processing at scale.

Deployment

To experiment with generative AI applications in Google Cloud for retail workloads, use the following code samples:

What's next

Explore more generative AI architecture guides.
For an overview of architectural principles and recommendations that are specific to AI and ML workloads in Google Cloud, see the AI and ML perspective in the Well-Architected Framework.
For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.

Contributors

Author: Kumar Dhanagopal | Cross-Product Solution Developer

Other contributors:

Amina Mansour | Head of Cloud Platform Evaluations Team
Megan O'Keefe | Developer Advocate
Samantha He | Technical Writer
Shir Meir Lador | Developer Relations Engineering Manager
