This document describes a high-level architecture for an application that runs a data science workflow to automate complex data analytics and machine learning tasks.
This architecture uses datasets that are hosted in BigQuery or AlloyDB for PostgreSQL. The architecture is a multi-agent system that lets users run actions by using natural language commands, which eliminates the need to write complex SQL or Python code.
The intended audience for this document includes architects, developers, and administrators who build and manage agentic AI applications. This architecture lets business and data teams analyze metrics across a wide range of industries, such as retail, finance, and manufacturing. The document assumes a foundational understanding of agentic AI systems. For information about how agents differ from non-agentic systems, see What is the difference between AI agents, AI assistants, and bots?
The deployment section of this document provides links to code samples to help you experiment with deploying an agentic AI application that runs a data science workflow.
Architecture
The following diagram shows the architecture for a data science workflow agent.
This architecture includes the following components:
| Component | Description |
|---|---|
| Frontend | Users interact with the multi-agent system through a frontend, such as a chat interface, that runs as a serverless Cloud Run service. |
| Agents | This architecture uses multiple agents that work together as a multi-agent system to run the data science workflow. |
| Agents runtime | The AI agents in this architecture are deployed as serverless Cloud Run services. |
| ADK | ADK provides tools and a framework to develop, test, and deploy agents. ADK abstracts the complexity of agent creation and lets AI developers focus on the agent's logic and capabilities. |
| AI model and model runtimes | For inference serving, the agents in this example architecture use the latest Gemini model on Vertex AI. |
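The delegation pattern behind a multi-agent system can be sketched in a few lines of plain Python. The following is a minimal illustration, not ADK code and not the sample implementation: the names `Request`, `RootAgent`, `bigquery_agent`, and `alloydb_agent` are hypothetical, and the sub-agents return placeholder strings instead of calling a model or a database. It assumes that a coordinating agent picks a sub-agent based on where the dataset is hosted.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Request:
    """A user's natural-language question plus where the dataset lives."""
    question: str
    source: str  # hypothetical values: "bigquery" or "alloydb"


# Each hypothetical sub-agent is modeled as a function from a question to a reply.
SubAgent = Callable[[str], str]


def bigquery_agent(question: str) -> str:
    # Placeholder: a real agent would ask the model for SQL and run it in BigQuery.
    return f"[bigquery_agent] would answer: {question}"


def alloydb_agent(question: str) -> str:
    # Placeholder: a real agent would query AlloyDB for PostgreSQL instead.
    return f"[alloydb_agent] would answer: {question}"


class RootAgent:
    """Hypothetical coordinator that delegates each request to one sub-agent."""

    def __init__(self, sub_agents: Dict[str, SubAgent]) -> None:
        self._sub_agents = dict(sub_agents)

    def run(self, request: Request) -> str:
        agent = self._sub_agents.get(request.source)
        if agent is None:
            known = ", ".join(sorted(self._sub_agents))
            raise ValueError(
                f"unknown source {request.source!r}; expected one of: {known}"
            )
        return agent(request.question)


root = RootAgent({"bigquery": bigquery_agent, "alloydb": alloydb_agent})
print(root.run(Request("What were total sales last month?", "bigquery")))
```

In an actual deployment, the routing decision and the sub-agent behavior would come from the model and the agent framework rather than from a dictionary lookup. The sketch only shows the shape of the delegation.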
Products used
This example architecture uses the following Google Cloud and open-source products and tools:
- Cloud Run: A serverless compute platform that lets you run containers directly on top of Google's scalable infrastructure.
- Agent Development Kit (ADK): A set of tools and libraries to develop, test, and deploy AI agents.
- Vertex AI: An ML platform that lets you train and deploy ML models and AI applications, and customize LLMs for use in AI-powered applications.
- Gemini: A family of multimodal AI models developed by Google.
- BigQuery: An enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.
- AlloyDB for PostgreSQL: A fully managed, PostgreSQL-compatible database service that's designed for your most demanding workloads, including hybrid transactional and analytical processing.
- MCP Toolbox for Databases: An open-source Model Context Protocol (MCP) server that lets AI agents securely connect to databases by managing database complexities like connection pooling, authentication, and observability.
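To illustrate why a tool layer between agents and databases is useful, the following sketch shows an agent invoking a named, parameterized query instead of handling connections or composing SQL strings itself. This is not the MCP Toolbox for Databases API: `SqlTool`, `ToolServer`, and the `flights` table are hypothetical, and Python's built-in `sqlite3` module stands in for BigQuery or AlloyDB for PostgreSQL so that the example runs anywhere.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class SqlTool:
    """A named, pre-approved query; the agent supplies only the parameters."""
    name: str
    description: str
    statement: str  # uses named placeholders such as :carrier


class ToolServer:
    """Hypothetical stand-in for a tool server that owns the database connection."""

    def __init__(self, conn: sqlite3.Connection, tools: List[SqlTool]) -> None:
        self._conn = conn
        self._tools: Dict[str, SqlTool] = {t.name: t for t in tools}

    def invoke(self, tool_name: str, params: Dict[str, Any]) -> List[Tuple[Any, ...]]:
        tool = self._tools[tool_name]  # KeyError if the tool is not registered
        # Parameters are bound by the driver, never formatted into the SQL string.
        return self._conn.execute(tool.statement, params).fetchall()


# Illustrative in-memory data; not one of the sample datasets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, delay_minutes INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 10), ("AA", 30), ("UA", 5)],
)

server = ToolServer(conn, [
    SqlTool(
        name="avg_delay_by_carrier",
        description="Average delay in minutes for one carrier.",
        statement="SELECT AVG(delay_minutes) FROM flights WHERE carrier = :carrier",
    )
])
print(server.invoke("avg_delay_by_carrier", {"carrier": "AA"}))
```

Because the agent only names a tool and passes parameters, connection handling and query text stay in one place, which is the kind of separation that a database tool server is meant to provide.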
Deployment
To deploy a sample implementation of this architecture, use Data Science with Multiple Agents. To demonstrate the system's flexibility, the repository provides two sample datasets: a flight dataset for operational analysis and an ecommerce sales dataset for business analytics.
What's next
- (Video) Watch the Agent Factory Podcast about AI agents for data engineering and data science.
- (Notebook) Use the data science agent in Colab Enterprise.
- Learn how to host AI agents on Cloud Run.
- For an overview of architectural principles and recommendations that are specific to AI and ML workloads in Google Cloud, see the AI and ML perspective in the Well-Architected Framework.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.
Contributors
Author: Samantha He | Technical Writer
Other contributors:
- Amina Mansour | Head of Cloud Platform Evaluations Team
- Kumar Dhanagopal | Cross-Product Solution Developer
- Megan O'Keefe | Developer Advocate
- Rachael Deacon-Smith | Developer Advocate
- Shir Meir Lador | Developer Relations Engineering Manager