This document describes how to use AI AutoMetrics to monitor your AI workloads on Vertex AI.
AI AutoMetrics lets you monitor the performance and health of your models with minimal configuration. This feature is designed to give you immediate insight into your custom containers and models running on Vertex AI Inference.
Before you begin
- Make sure you have a Vertex AI endpoint with a deployed model whose container uses a supported framework.
- Make sure Cloud Monitoring is enabled in your project. For more information, see Enable the Monitoring API.
Use AI AutoMetrics
To view AI AutoMetrics in Metrics Explorer, do the following:
1. Go to the Metrics Explorer page in the Google Cloud console.
2. Under Select a metric, select Prometheus Target.
3. Under Active metric categories, select Vertex.
4. Under Active metrics, select the desired metric.
5. Click Apply.
You can also query these metrics by using Grafana or the Prometheus API and UI.
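For example, the following Python sketch queries a metric through the Managed Service for Prometheus PromQL HTTP API. The project ID and metric name are placeholders; the metric name is only an assumed example that follows the vllm: prefix convention described later in this document, so substitute a metric that your container actually exposes.

```python
# Minimal sketch: query a vLLM metric through the Managed Service for
# Prometheus PromQL HTTP API. PROJECT_ID and QUERY are placeholders;
# substitute values from your own project and deployment.
import google.auth
import google.auth.transport.requests
import requests

PROJECT_ID = "your-project-id"         # placeholder project ID
QUERY = "vllm:num_requests_running"    # placeholder metric name

# Obtain an OAuth 2.0 access token with Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/monitoring.read"]
)
credentials.refresh(google.auth.transport.requests.Request())

# Managed Service for Prometheus exposes a Prometheus-compatible query API.
url = (
    f"https://monitoring.googleapis.com/v1/projects/{PROJECT_ID}"
    "/location/global/prometheus/api/v1/query"
)
response = requests.get(
    url,
    params={"query": QUERY},
    headers={"Authorization": f"Bearer {credentials.token}"},
    timeout=30,
)
response.raise_for_status()

# The response uses the standard Prometheus query API format.
for result in response.json()["data"]["result"]:
    print(result["metric"], result["value"])
```

The same PromQL query string works unchanged in Grafana or the Prometheus UI.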
Supported frameworks
AI AutoMetrics supports the following frameworks:
| Framework | Qualified endpoint | Qualified metrics |
|---|---|---|
| vLLM | Prometheus-compatible /metrics endpoint | Metrics with the vllm: prefix |
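To check whether a container qualifies before you deploy it, you can inspect its metrics endpoint directly. The following Python sketch assumes a vLLM server is running locally on port 8000 (an assumption about a typical test setup) and lists the vllm:-prefixed metric families it exposes; adjust the URL to match your container.

```python
# Minimal sketch: verify that a locally running vLLM container exposes a
# Prometheus-compatible /metrics endpoint with vllm:-prefixed metrics.
# The URL assumes a local test server on port 8000; adjust as needed.
import requests
from prometheus_client.parser import text_string_to_metric_families

METRICS_URL = "http://localhost:8000/metrics"  # assumed local test endpoint

response = requests.get(METRICS_URL, timeout=10)
response.raise_for_status()

# Parse the Prometheus text exposition format and keep vllm:-prefixed families.
qualified = [
    family.name
    for family in text_string_to_metric_families(response.text)
    if family.name.startswith("vllm:")
]

print(f"Found {len(qualified)} vllm: metric families:")
for name in qualified:
    print(f"  {name}")
```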
How it works
Vertex AI automatically scrapes the /metrics endpoint of your
container at a predefined interval. All qualified metrics are then exported to
Google Cloud Managed Service for Prometheus,
where you can analyze and visualize them.
Metric naming and labels
The metrics collected by AI AutoMetrics are ingested into Cloud Monitoring
under the vertex_* naming convention.
For easier filtering and grouping, AI AutoMetrics automatically attaches the following additional Vertex AI labels to each metric:
- deployed_model_id: the ID of the deployed model that serves inference requests.
- model_display_name: the display name of the deployed model.
- replica_id: the unique ID of the deployed model replica (pod name).
- endpoint_id: the ID of the model endpoint.
- endpoint_display_name: the display name of the model endpoint.
- product: the name of the feature under Vertex AI. This is always Online Inference.
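For example, the following sketch builds PromQL queries that use these labels to focus on a single deployed model and to compare its replicas. The metric name and ID values are placeholders chosen to follow the conventions above, not values from a real deployment.

```python
# Minimal sketch: PromQL queries that use the Vertex AI labels attached by
# AI AutoMetrics. The metric name and ID values are placeholders; substitute
# values from your own endpoint and deployed model.
DEPLOYED_MODEL_ID = "1234567890"   # placeholder deployed model ID
ENDPOINT_ID = "9876543210"         # placeholder endpoint ID

# Filter a vLLM metric to one deployed model on one endpoint.
single_model = (
    'vllm:num_requests_running{'
    'deployed_model_id="' + DEPLOYED_MODEL_ID + '", '
    'endpoint_id="' + ENDPOINT_ID + '"}'
)

# Compare load across replicas of the same deployed model.
per_replica = (
    'sum by (replica_id) (vllm:num_requests_running{'
    'deployed_model_id="' + DEPLOYED_MODEL_ID + '"})'
)

print(single_model)
print(per_replica)
```

You can run queries like these in Grafana, the Prometheus UI, or the PromQL HTTP API shown earlier in this document.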
What's next
- Learn more about the Metrics Explorer.