Generative AI architecture and use cases

This document describes a typical generative AI architecture in Google Cloud. It also describes when you would use specific Google Cloud services.

Architecture

The following diagram shows the Google Cloud services that are present in a typical generative AI architecture that uses Vertex AI.

Sample architecture for generative AI workloads that use Vertex AI.

This diagram includes the following:

  • Artifact Registry streamlines your machine learning (ML) development and deployment process, improves collaboration, and ensures the security and reliability of your ML models.

  • BigQuery simplifies data access, enables scalable analysis, and provides ML capabilities that you can use in your ML workflows.

  • Cloud Audit Logs tracks the actions that your users take in your environment, which enhances your troubleshooting, auditing, and incident response capabilities.

  • Cloud Billing dashboards and alerts let you review usage and billing of the Vertex AI workloads.

  • Cloud Build is a serverless CI/CD platform on Google Cloud that builds, tests, and deploys your applications and ML artifacts.

  • Cloud Identity unifies identity, access, application, and endpoint management for Google Cloud.

  • Cloud Run functions automate tasks, serve predictions, trigger training jobs, integrate with other services, and support event-driven ML pipelines.

  • Cloud Storage stores training data, model artifacts, and production data.

  • Dataflow builds complex pipelines that ingest data from various sources and aggregate the data as appropriate.

  • Cloud DNS registers, manages, and serves your domain.

  • Identity and Access Management (IAM) controls who can perform specific actions on your generative workload resources, such as creating, editing, or deleting them.

  • Organization Policy Service centrally manages and enforces policies across your Google Cloud environment. Organization Policy helps to ensure consistent configuration and security compliance across the projects and resources within your organization.

  • Pub/Sub enables efficient communication and automation within your machine learning workflows.

  • Resource Manager helps group and manage logical components of your Vertex AI workloads.

  • Secret Manager helps protect the sensitive data and credentials that are used in Vertex AI projects.

  • Sensitive Data Protection automates discovery of sensitive data in your data sets. Sensitive Data Protection can scan prompts and redact sensitive data before the data reaches the model. Sensitive Data Protection can also scan the model's output to avoid leaking sensitive training data in responses.

  • Security Command Center helps protect your cloud organization, your AI workloads, and the AI data that you store on Google Cloud. Security Command Center provides the following:

    • Centralized security management
    • Threat detection and incident response
    • Automated security assessments
    • Compliance and regulatory reporting
    • Security recommendations and best practices

  • Vertex AI lets you build and use generative AI, including AI solutions, search, and conversation, on a single platform.

  • Virtual Private Cloud (VPC) isolates your AI resources from the internet in a secure environment. This network configuration helps protect sensitive data and models from unauthorized access and potential cyberattacks.

  • Cloud VPN or Cloud Interconnect establishes a secure network connection between your on-premises infrastructure and your Vertex AI environment. Cloud VPN or Cloud Interconnect helps enable seamless data transfer and communication between your private network and Google Cloud resources. Consider this integration for scenarios like accessing on-premises data for model training or deploying models to on-premises resources for inference.

Use cases for Artifact Registry

Consider the following use cases for Artifact Registry with Vertex AI:

  • Manage your ML artifacts: Artifact Registry lets you store and manage all your ML artifacts in a single place, including model training code, datasets, trained models, and prediction serving containers. You can use this centralized repository to track, share, and reuse your ML artifacts across different teams and projects.
  • Version control and reproducibility: Artifact Registry provides version control for your ML artifacts, helping you track changes and roll back to previous versions, if needed. This feature is crucial for ensuring the reproducibility of your ML experiments and deployments.
  • Secure and reliable storage: Artifact Registry offers secure and reliable storage for your ML artifacts. These artifacts are encrypted at rest and in transit. Configure access control to restrict who can access the artifacts to help protect your valuable data and intellectual property.
  • Integration with Vertex AI Pipelines: integrate Artifact Registry with Vertex AI Pipelines to build and automate your ML workflows. Use Artifact Registry to store your pipeline artifacts (for example, your pipeline definitions, code, and data) and to automatically trigger pipeline runs when new artifacts are uploaded.
  • Streamline CI/CD for ML: integrate Artifact Registry with your CI/CD tooling to streamline the development and deployment of your ML models. For example, use Artifact Registry to automatically build and deploy your model serving container whenever you push a new version of your model to Artifact Registry.
  • Multi-region support: Artifact Registry lets you store your artifacts in multiple regions, which can help improve the performance and availability of your ML models, especially if you have users located in different parts of the world.

Use cases for BigQuery

Consider the following use cases for BigQuery with Vertex AI:

  • Seamless integration: BigQuery and Vertex AI are tightly integrated, letting you access and analyze your data directly within the Vertex AI platform. This integration eliminates the need for data movement, streamlines your ML workflow, and reduces friction.
  • Scalable data analysis: BigQuery offers a petabyte-scale data warehouse, letting you analyze massive datasets without worrying about infrastructure limitations. This scalability is critical for training and deploying ML models that require vast amounts of data.
  • SQL-based ML: BigQuery ML lets you use familiar SQL commands to train and deploy models directly within BigQuery. This feature lets data analysts and SQL practitioners use ML capabilities without requiring advanced coding skills.
  • Online and batch predictions: BigQuery ML supports online and batch predictions. You can run real-time predictions on individual rows or generate predictions for large datasets in batch mode. This flexibility supports diverse use cases with varying latency requirements.
  • Reduced data movement: with BigQuery ML, you don't need to move your data to separate storage or compute resources for model training and deployment. This reduced movement simplifies your workflow, reduces latency, and minimizes cost associated with data transfer.
  • Model monitoring: Vertex AI provides comprehensive model monitoring capabilities, letting you track the performance, fairness, and explainability of your BigQuery ML models. Model monitoring helps you ensure that your models are performing as expected and address potential issues.
  • Pretrained models: Vertex AI offers access to pretrained models, including those for natural language processing and computer vision. You can use these models within BigQuery to enhance your analysis and extract deeper insights from your data.
  • Cost-effective solution: BigQuery ML offers a cost-effective, flexible way to train and deploy ML models. You only pay for the resources you use, making it an affordable option for organizations of all sizes.
  • Advanced analytics capabilities: BigQuery provides tools for advanced analytics, including geospatial analysis and forecasting. These tools let you combine ML with other analytical techniques for deeper data exploration and richer insights.
  • Enhanced collaboration: by using BigQuery with Vertex AI, data scientists, ML engineers, and analysts can collaborate seamlessly on ML projects. This collaboration helps create a more integrated and efficient approach to tackling complex data problems.
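
The SQL-based ML workflow described above can be sketched with standard BigQuery ML statements. The following fragment is a minimal example; the dataset, table, and column names (`mydataset`, `customer_history`, `churned`, and so on) are hypothetical placeholders, not names from this architecture:

```sql
-- Train a logistic regression model directly in BigQuery.
CREATE OR REPLACE MODEL `mydataset.churn_model`
OPTIONS (
  model_type = 'logistic_reg',
  input_label_cols = ['churned']
) AS
SELECT
  tenure_months,
  monthly_charges,
  churned
FROM
  `mydataset.customer_history`;

-- Generate batch predictions with the trained model.
SELECT *
FROM
  ML.PREDICT(MODEL `mydataset.churn_model`,
             TABLE `mydataset.current_customers`);
```

Because both statements run inside BigQuery, no data leaves the warehouse for training or prediction, which is the reduced-movement benefit noted above.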

Use cases for Cloud Build

Consider the following use cases for Cloud Build with Vertex AI:

  • Automate ML pipeline builds: Cloud Build lets you automate the building and testing of your ML pipelines defined in Vertex AI Pipelines. This automation helps you build and deploy your models faster and with greater consistency.
  • Build custom container images for deployment: Cloud Build can build custom container images for your model-serving environments. Cloud Build lets you package your model code, dependencies, and runtime environment into a single image that you can deploy to Vertex AI Inference for serving predictions.
  • Integrate with CI/CD workflows: Cloud Build lets you automate the build and deployment of your ML models in your CI/CD workflows. This automation ensures that your models are up-to-date and deployed to production.
  • Trigger builds based on code changes: Cloud Build can automatically trigger builds when changes are made to your model code or pipeline definition. This automation helps to ensure that your models are built with the latest code and that any changes are automatically deployed to production.
  • Get scalable and secure infrastructure: Cloud Build uses Google Cloud's scalable and secure infrastructure to build and deploy your models. This scalability means you don't need to worry about managing your own infrastructure and can focus on developing your models.
  • Support for various programming languages: Cloud Build supports various programming languages, including Python, Java, Go, and Node.js. This support lets you build your models using the language of your choice.
  • Use prebuilt build steps: to help simplify the build process, Cloud Build offers prebuilt build steps for common ML tasks, such as installing dependencies, running tests, and pushing images to container registries.
  • Create custom build steps: you can define your own custom build steps in Cloud Build to execute any arbitrary code during the build process.
  • Build artifacts for other Vertex AI services: Cloud Build can build artifacts for other Vertex AI services such as Vertex AI Feature Store and Vertex AI Data Labeling. This flexibility helps you build a complete ML workflow on Google Cloud.
  • Realize a cost-effective solution: Cloud Build offers a pay-as-you-go pricing model, so you only pay for the resources you use.
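
To illustrate the container-image use case above, the following is a minimal `cloudbuild.yaml` sketch that builds a model-serving image and pushes it to Artifact Registry. The repository name (`my-repo`) and image name (`model-server`) are hypothetical examples; `$PROJECT_ID` and `$SHORT_SHA` are standard Cloud Build substitutions:

```yaml
# Build a model-serving container image and push it to Artifact Registry.
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'us-docker.pkg.dev/$PROJECT_ID/my-repo/model-server:$SHORT_SHA'
      - '.'
images:
  - 'us-docker.pkg.dev/$PROJECT_ID/my-repo/model-server:$SHORT_SHA'
```

Attaching this configuration to a Cloud Build trigger on your model repository gives you the build-on-code-change automation described above.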

Use cases for Cloud Storage

Consider the following use cases for Cloud Storage with Vertex AI:

  • Store training data: Vertex AI lets you store your training datasets in Cloud Storage buckets. Using Cloud Storage offers several advantages:
    • Cloud Storage can handle datasets of any size, allowing you to train models on massive amounts of data without storage limitations.
    • You can set granular access controls and encryption on your Cloud Storage buckets to ensure that your sensitive training data is protected.
    • Cloud Storage lets you track changes and revert to previous versions of your data, providing valuable audit trails and facilitating reproducible training experiments.
    • Vertex AI seamlessly integrates with Cloud Storage, letting you access your training data within the platform.
  • Store model artifacts: you can store trained model artifacts, such as model files, hyperparameter configurations, and training logs, in Cloud Storage buckets. Using Cloud Storage lets you do the following:
    • Keep all your model artifacts in Cloud Storage as a centralized repository to conveniently access and manage them.
    • Track and manage different versions of your models, facilitating comparisons and rollbacks if needed.
    • Grant teammates and collaborators access to specific Cloud Storage buckets to efficiently share models.
  • Store production data: for models used in production, Cloud Storage can store the data being fed to the model for prediction. For example, you can use Cloud Storage to do the following:
    • Store user data and interactions for real-time personalized recommendations.
    • Keep images for on-demand processing and classification using your models.
    • Maintain transaction data for real-time fraud identification using your models.
  • Integrate with other services: Cloud Storage integrates seamlessly with other Google Cloud services used in Vertex AI workflows, such as the following:
    • Dataflow for streamlined data preprocessing and transformation pipelines.
    • BigQuery for access to large datasets stored in BigQuery for model training and inference.
    • Cloud Run functions for actions based on model predictions or data changes in Cloud Storage buckets.
  • Manage costs: Cloud Storage offers a pay-as-you-go pricing model, meaning you only pay for the storage you use. This provides cost efficiency, especially for large datasets.
  • Enable high availability and durability: Cloud Storage ensures your data is highly available and protected against failures or outages, guaranteeing reliability and robust access to your ML assets.
  • Enable multi-region support: store your data in multiple Cloud Storage regions that are geographically closer to your users or applications, enhancing performance and reducing latency for data access and model predictions.
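
The versioned artifact layout described above is easier to manage when object paths follow a consistent convention. The following sketch shows one such convention; the `models/{name}/{version}/` layout and the default filename are hypothetical choices, not a Vertex AI requirement:

```python
def model_artifact_uri(bucket: str, model_name: str, version: str,
                       filename: str = "model.pkl") -> str:
    """Build a Cloud Storage URI under a versioned artifact layout.

    A predictable gs:// path makes it straightforward to compare model
    versions, roll back, and grant per-prefix access to collaborators.
    """
    return f"gs://{bucket}/models/{model_name}/{version}/{filename}"


# Example: the URI for version v3 of a hypothetical churn model.
uri = model_artifact_uri("my-ml-bucket", "churn", "v3")
```

You could then pass such a URI to the Vertex AI SDK or to the Cloud Storage client library when uploading or registering the model.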

Use cases for Cloud Run functions

Consider the following use cases for Cloud Run functions with Vertex AI:

  • Ability to preprocess and post-process data: Cloud Run functions can preprocess data before sending it to your Vertex AI model for training or prediction. For example, a function can clean and normalize data, or extract features from it. Similarly, Cloud Run functions can post-process the output of your Vertex AI model. For example, a function can format the output data or send it to another service for further analysis.
  • Automatic triggers for Vertex AI training jobs: to automate the training of Vertex AI models, you can trigger Cloud Run functions using events from various Google Cloud services, such as Cloud Storage, Pub/Sub, and Cloud Scheduler. For example, you can create a function that is triggered when a new file is uploaded to Cloud Storage. This function can start a Vertex AI training job to train your model on the new data.
  • Ability to serve predictions: Cloud Run functions can serve predictions from your Vertex AI models, letting you create an API endpoint for your model without having to manage any infrastructure. For example, you can write a function that takes an image as input, and outputs a prediction from your Vertex AI image classification model. You can then deploy this function as an HTTP API endpoint.
  • Event-driven ML workflows: you can use Cloud Run functions to build event-driven ML workflows. For example, a function can trigger a Vertex AI prediction job when a new record is added to a Pub/Sub topic. This function lets you process data in real time and take action based on your model predictions.
  • Integration with other services: you can integrate Cloud Run functions with other Google Cloud services, such as Cloud Storage, BigQuery, and Cloud Firestore. Integration lets you build complex ML pipelines that connect different services together.
  • Cost scaling: with Cloud Run functions, you pay only for the resources that your function uses while it's running. Functions also scale automatically to meet demand, so that you have sufficient capacity during peak traffic.
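
The preprocessing use case above can be sketched as a small Cloud Run function. This is a minimal, illustrative example: the feature names and min-max scaling ranges are hypothetical, and the handler returns the instance payload that would be sent to a Vertex AI endpoint rather than calling one:

```python
import json

# Hypothetical feature names and (min, max) ranges for min-max scaling.
FEATURE_RANGES = {
    "tenure_months": (0.0, 72.0),
    "monthly_charges": (0.0, 200.0),
}


def preprocess(record: dict) -> dict:
    """Min-max scale each known numeric feature into [0, 1]."""
    scaled = {}
    for name, (lo, hi) in FEATURE_RANGES.items():
        value = float(record.get(name, lo))
        scaled[name] = (value - lo) / (hi - lo)
    return scaled


def handler(request):
    """Sketch of a Cloud Run function entry point: parse the request body,
    preprocess it, and return the instances payload for a model endpoint."""
    body = json.loads(request.data) if hasattr(request, "data") else request
    return json.dumps({"instances": [preprocess(body)]})
```

In a deployed function, the handler would forward the `instances` payload to a Vertex AI prediction endpoint instead of returning it.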

Use cases for Pub/Sub

Consider the following use cases for Pub/Sub with Vertex AI:

  • Asynchronous event-driven architecture: Pub/Sub enables event-driven communication so that you can trigger Vertex AI pipelines based on events that are published to Pub/Sub topics. These events can include new data and model updates.
  • Scalability and reliability: Pub/Sub is highly scalable, letting you handle numerous events without impacting performance. Scalability is critical for processing large datasets or running multiple concurrent ML jobs. Pub/Sub also provides reliable message delivery and ordering within a topic, ensuring processing consistency even under heavy workloads.
  • Flexibility: you can integrate Vertex AI with other services like Cloud Run functions or Dataflow using Pub/Sub, creating flexible and dynamic ML pipelines.
  • Real-time monitoring and alerts: Pub/Sub lets you subscribe to specific topics to receive real-time notifications about events in your Vertex AI pipelines. Real-time monitoring helps you to monitor model training progress, data preprocessing results, and prediction output. You can configure alerts based on specific events, like failed jobs or anomalies detected during prediction. Alerts enable proactive intervention and timely troubleshooting.

For example, you can use Pub/Sub for the following activities:

  • Trigger model training when new data arrives in a Cloud Storage bucket.
  • Send real-time predictions from a deployed model to downstream systems for further processing.
  • Monitor and react to changes in model performance metrics.
  • Trigger alerts for critical events like failed predictions or data quality issues.
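
The event-driven activities above all reduce to publishing and parsing Pub/Sub messages. The following sketch shows one way to encode a message body and attributes; Pub/Sub delivers message data as bytes with string-valued attributes, but the `event_type` attribute name and the JSON payload shape are hypothetical conventions, not Pub/Sub requirements:

```python
import json


def build_event(event_type: str, payload: dict):
    """Encode a message body (bytes) and attributes (string key-value pairs).

    Subscribers can filter on attributes without decoding the body, which
    lets one topic carry several kinds of ML pipeline events.
    """
    data = json.dumps(payload).encode("utf-8")
    attributes = {"event_type": event_type}
    return data, attributes


def parse_event(data: bytes) -> dict:
    """Decode the message body on the subscriber side."""
    return json.loads(data.decode("utf-8"))
```

With the google-cloud-pubsub client library, you would publish such a message with `publisher.publish(topic_path, data, **attributes)` and decode it in the subscriber callback with `parse_event`.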

Use cases for Resource Manager

Consider the following use cases for Resource Manager with Vertex AI:

  • To help ensure resource and data isolation and fine-grained access controls, create separate projects for different teams or departments.
  • Apply protective security policies to AI workloads.
  • Define quotas for GPU usage in training jobs to prevent cost overruns.
  • Automate the creation of required Cloud Storage buckets and Compute Engine instances for new projects.
  • Track and analyze resource usage patterns for specific projects to optimize resource allocation.
  • Generate audit reports to demonstrate compliance with data governance and security policies.

Use cases for Secret Manager

Consider the following use cases for Secret Manager with Vertex AI:

  • Store API keys for accessing external data sources used in model training.
  • Encrypt database credentials within prediction pipelines for secure access.
  • Provision temporary access tokens for secure communication between services.
  • Secure private keys and certificates that you use for encrypting communication channels.
  • Manage passwords and credentials for third-party services that you use in your ML workflows.
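
When you read any of the secrets above at runtime, the Secret Manager API expects a fully qualified version resource name. The following helper builds that name; the resource-name format is the one the API uses, while the project and secret IDs in the example are hypothetical:

```python
def secret_version_name(project_id: str, secret_id: str,
                        version: str = "latest") -> str:
    """Build a Secret Manager version resource name.

    The "latest" alias resolves to the most recently added enabled
    version of the secret.
    """
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


# Example: the resource name for a hypothetical database password secret.
name = secret_version_name("my-ml-project", "db-password")
```

In application code, you would pass this name to `access_secret_version` on the google-cloud-secret-manager client and decode the returned payload, rather than embedding the credential in your pipeline code.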

Use cases for VPC

Consider the following use cases for VPC with Vertex AI:

  • Define granular firewall rules and access controls within your VPC network to restrict traffic and only allow authorized connections to specific resources.

  • Organize your Vertex AI resources into separate VPC networks based on function or security requirements. This type of organization helps isolate resources and prevents unauthorized access between different projects or teams. You can create dedicated VPC networks for sensitive workloads, such as training models with confidential data, ensuring that only authorized users and services have network access.

What's next