This document describes how your Google Kubernetes Engine deployment can use Google Cloud Managed Service for Prometheus to collect metrics from llm-d. llm-d consists of many components, including GKE Inference Gateway and vLLM.
For information about collecting metrics from GKE Inference Gateway and vLLM, see the following documents:
- GKE Inference Gateway
- vLLM. When you follow that document, use the PodMonitoring configuration shown in the Prerequisites section of this document.
The instructions in these documents apply only if you are using managed collection with Managed Service for Prometheus. If you are using self-deployed collection, then see the llm-d documentation.
After you configure GKE Inference Gateway and vLLM, you can access a predefined dashboard in Cloud Monitoring to view the metrics.
Prerequisites
To collect metrics from llm-d by using Managed Service for Prometheus and managed collection, your deployment must meet the following requirements:
- Your cluster must be running Google Kubernetes Engine version 1.28.15-gke.2475000 or later.
- You must be running Managed Service for Prometheus with managed collection enabled. For more information, see Get started with managed collection.
You must also change the configuration of the PodMonitoring resource for vLLM. Use the following configuration:
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: llm-d-metrics
spec:
  selector:
    matchLabels:
      llm-d.ai/model: ms-pd-llm-d-modelservice
  endpoints:
  - port: 8200
    interval: 10s
    path: /metrics
  targetLabels:
    fromPod:
    - from: llm-d.ai/role
      to: role
    metadata:
    - pod
    - container
    - node
    - top_level_controller_name
    - top_level_controller_type
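To illustrate how the selector and the fromPod entry in the preceding configuration relate to Pod labels, the following fragment sketches the metadata of a Pod that this PodMonitoring resource would select. The Pod name and the decode value for the role label are assumptions made for illustration and are not taken from this document; check the labels on your own llm-d Pods, for example by running kubectl get pods --show-labels.

```yaml
# Hypothetical Pod metadata, shown only to illustrate label matching.
# The Pod spec is omitted; this is a fragment, not a deployable manifest.
apiVersion: v1
kind: Pod
metadata:
  name: example-llm-d-decode-0   # assumed name
  labels:
    # Matched by spec.selector.matchLabels in the PodMonitoring resource.
    llm-d.ai/model: ms-pd-llm-d-modelservice
    # Copied onto the scraped metrics as the "role" label by
    # spec.targetLabels.fromPod.
    llm-d.ai/role: decode        # assumed value
```

Because PodMonitoring is a namespaced resource, it applies only to Pods in the namespace where you create it, so create it in the same namespace as your llm-d Pods.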
View dashboards
The Cloud Monitoring integration includes the llm-d Prometheus Overview dashboard. Dashboards are automatically installed when you configure the integration. You can also view static previews of dashboards without installing the integration.
To view an installed dashboard, do the following:
- In the Google Cloud console, go to the Dashboards page. If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- Select the Dashboard List tab.
- Choose the Integrations category.
- Click the name of the dashboard, for example, llm-d Prometheus Overview.
To view a static preview of the dashboard, do the following:
- In the Google Cloud console, go to the Integrations page. If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- Click the Kubernetes Engine deployment-platform filter.
- Locate the llm-d integration and click View Details.
- Select the Dashboards tab.