TorchServe

This document describes how to configure your Google Kubernetes Engine deployment so that you can use Google Cloud Managed Service for Prometheus to collect metrics from TorchServe. This document shows you how to do the following:

Set up TorchServe to report metrics.
Access a predefined dashboard in Cloud Monitoring to view the metrics.

These instructions apply only if you are using managed collection with Managed Service for Prometheus. If you are using self-deployed collection, then see the TorchServe documentation for installation information.

These instructions are provided as an example and are expected to work in most Kubernetes environments. If you have trouble installing an application or exporter because of restrictive security or organizational policies, then we recommend that you consult the open-source documentation for support.

For information about TorchServe, see TorchServe. For information about setting up TorchServe on Google Kubernetes Engine, see the GKE guide for TorchServe.

Prerequisites

To collect metrics from TorchServe by using Managed Service for Prometheus and managed collection, your deployment must meet the following requirements:

Your cluster must be running Google Kubernetes Engine version 1.28.15-gke.2475000 or later.
You must be running Managed Service for Prometheus with managed collection enabled. For more information, see Get started with managed collection.

TorchServe exposes Prometheus-format metrics automatically when the metrics_mode flag is specified in the config.properties file or as an environment variable.

If you are setting up TorchServe yourself, then we recommend making the following additions to your config.properties file.

If you are following the Google Kubernetes Engine document Serve scalable LLMs on GKE with TorchServe, then these additions are part of the default setup.

```
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
# Added: serve metrics in Prometheus format on port 8082.
metrics_address=http://0.0.0.0:8082
metrics_mode=prometheus
number_of_netty_threads=32
job_queue_size=1000
install_py_dep_per_model=true
model_store=/home/model-server/model-store
load_models=all
```
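As a quick sanity check before you build the image, the following minimal Python sketch parses config.properties-style text and reports whether the two metrics settings from this document are present. The function names are hypothetical, and the parser handles only simple key=value lines, not the full Java properties syntax. It also doesn't account for metrics_mode being supplied as an environment variable instead.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines; skip blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def missing_metrics_settings(settings: dict[str, str]) -> list[str]:
    """List the metrics settings from this document that are absent or different."""
    problems = []
    if settings.get("metrics_mode") != "prometheus":
        problems.append("metrics_mode must be set to prometheus")
    # Assumption: this sketch expects metrics_address to be set explicitly,
    # as it is in the configuration shown in this document.
    if "metrics_address" not in settings:
        problems.append("metrics_address is not set")
    return problems


example = """
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
metrics_mode=prometheus
"""
print(missing_metrics_settings(parse_properties(example)))  # prints []
```

An empty list means both settings were found. To check a real file, you could pass the contents of your own config.properties to parse_properties.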

Additionally, when you deploy this image to GKE, modify the deployment and service YAML to expose the added metrics port:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: t5-inference
  labels:
    model: t5
    version: v1.0
    machine: gpu
spec:
  replicas: 1
  selector:
    matchLabels:
      model: t5
      version: v1.0
      machine: gpu
  template:
    metadata:
      labels:
        model: t5
        version: v1.0
        machine: gpu
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4
      containers:
        - name: inference
          ...
          args: ["torchserve", "--start", "--foreground"]
          resources:
            ...
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 8081
              name: management
            # Added: expose the metrics port on the container.
            - containerPort: 8082
              name: metrics
---
apiVersion: v1
kind: Service
metadata:
  name: t5-inference
  labels:
    model: t5
    version: v1.0
    machine: gpu
spec:
  ...
  ports:
    - port: 8080
      name: http
      targetPort: http
    - port: 8081
      name: management
      targetPort: management
    # Added: expose the metrics port on the service.
    - port: 8082
      name: metrics
      targetPort: metrics
```

To verify that TorchServe is emitting metrics on the expected endpoints, do the following:

Set up port forwarding by using the following command:
```
kubectl -n NAMESPACE_NAME port-forward SERVICE_NAME 8082
```
Access the endpoint localhost:8082/metrics by using the browser or the curl utility in another terminal session.
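If you prefer a scripted check over a browser or curl, the following Python sketch fetches the forwarded endpoint and lists the metric names that it exposes. It assumes that port forwarding is active on localhost:8082. The function names are hypothetical, the parsing covers only the common line shapes of the Prometheus text format, and the sample text is made up for the offline demonstration.

```python
import urllib.request


def metric_names(exposition_text: str) -> set[str]:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # A sample line looks like: name{label="value"} 1.0  or  name 1.0
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metric_names(url: str = "http://localhost:8082/metrics") -> set[str]:
    """Fetch the forwarded endpoint and return the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return metric_names(response.read().decode("utf-8"))


# Offline demonstration; these metric names are illustrative, not TorchServe's.
sample = """\
# HELP example_requests_total Illustrative counter.
# TYPE example_requests_total counter
example_requests_total{model_name="t5"} 3.0
example_queue_depth 0.0
"""
print(sorted(metric_names(sample)))
# prints ['example_queue_depth', 'example_requests_total']
```

With port forwarding running, calling fetch_metric_names() returns the names served by your deployment. An empty set or a connection error suggests that the metrics port isn't exposed as expected.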

Define a PodMonitoring resource

For target discovery, the Managed Service for Prometheus Operator requires a PodMonitoring resource that corresponds to TorchServe in the same namespace.

You can use the following PodMonitoring configuration:

```
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: torchserve
  labels:
    app.kubernetes.io/name: torchserve
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  endpoints:
  - port: 8082
    scheme: http
    interval: 30s
    path: /metrics
  selector:
    matchLabels:
      model: t5
      version: v1.0
      machine: gpu
```

Ensure that the values of the port and matchLabels fields match those of the TorchServe pods that you want to monitor.
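The label check described above can be sketched in Python. A matchLabels selector selects a pod only when every key-value pair in the selector also appears in the pod's labels; extra pod labels are ignored. The function name and the pod-template-hash value are hypothetical, and the other label values come from the examples in this document.

```python
def selector_matches(match_labels: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """Return True when every selector label is present on the pod with the same value."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Selector from the PodMonitoring example in this document.
match_labels = {"model": "t5", "version": "v1.0", "machine": "gpu"}

# Pod labels from the Deployment template, plus one extra label
# (the hash value here is a made-up placeholder).
pod_labels = {
    "model": "t5",
    "version": "v1.0",
    "machine": "gpu",
    "pod-template-hash": "abc123",
}

print(selector_matches(match_labels, pod_labels))  # prints True

# A pod with a different version label is not selected.
print(selector_matches(match_labels, {**pod_labels, "version": "v2.0"}))  # prints False
```

To compare against a live cluster, you could read the real pod labels with kubectl get pods --show-labels and substitute them for pod_labels.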

To apply configuration changes from a local file, run the following command:

```
kubectl apply -n NAMESPACE_NAME -f FILE_NAME
```

You can also use Terraform to manage your configurations.

Verify the configuration

You can use Metrics Explorer to verify that you correctly configured TorchServe. It might take one or two minutes for Cloud Monitoring to ingest your metrics.

To verify that the metrics are ingested, do the following:

In the Google Cloud console, go to the Metrics explorer page:
Go to Metrics explorer

If you use the search bar to find this page, then select the result whose subheading is Monitoring.
In the toolbar of the query-builder pane, select the button whose name is PromQL.

Enter and run the following query:

```
up{job="torchserve", cluster="CLUSTER_NAME", namespace="NAMESPACE_NAME"}
```
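If you run this check for several clusters or namespaces, it can help to generate the query text instead of editing the placeholders by hand. The following Python sketch builds the same query string; the function name and the example cluster and namespace values are hypothetical, and the sketch doesn't escape quotes or backslashes in the values. In Prometheus, the up metric has the value 1 when the most recent scrape of a target succeeded and 0 when it failed.

```python
def up_query(cluster: str, namespace: str, job: str = "torchserve") -> str:
    """Build the PromQL query from this document for one cluster and namespace."""
    # Doubled braces in an f-string produce literal { and } characters.
    return f'up{{job="{job}", cluster="{cluster}", namespace="{namespace}"}}'


print(up_query("my-cluster", "default"))
# prints up{job="torchserve", cluster="my-cluster", namespace="default"}
```

The job value defaults to torchserve, which is the name of the PodMonitoring resource used in this document.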

View dashboards

The Cloud Monitoring integration includes the TorchServe Prometheus Overview dashboard. Dashboards are automatically installed when you configure the integration. You can also view static previews of dashboards without installing the integration.

To view an installed dashboard, do the following:

In the Google Cloud console, go to the Dashboards page:
Go to Dashboards

If you use the search bar to find this page, then select the result whose subheading is Monitoring.
Select the Dashboard List tab.
Choose the Integrations category.
Click the name of the dashboard, for example, TorchServe Prometheus Overview.

To view a static preview of the dashboard, do the following:

In the Google Cloud console, go to the Integrations page:
Go to Integrations

If you use the search bar to find this page, then select the result whose subheading is Monitoring.
Click the Kubernetes Engine deployment-platform filter.
Locate the TorchServe integration and click View Details.
Select the Dashboards tab.

Troubleshooting

For information about troubleshooting metric ingestion problems, see Problems with collection from exporters in Troubleshooting ingestion-side problems.
