Logs and metrics

This document describes the logs and metrics that Gemini on Google Distributed Cloud connected API collects and exports.

Configuring logging and monitoring

Before you can start collecting logs and metrics, you must:

  1. Enable the required logging and monitoring APIs by using the following commands:

    gcloud services enable opsconfigmonitoring.googleapis.com --project PROJECT_ID
    gcloud services enable logging.googleapis.com --project PROJECT_ID
    gcloud services enable monitoring.googleapis.com --project PROJECT_ID
    

    Replace PROJECT_ID with the ID of the target Google Cloud project.

  2. Grant the roles required to write logs and metrics:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/opsconfigmonitoring.resourceMetadata.writer \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/metadata-agent]"
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/logging.logWriter \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/stackdriver-log-forwarder]"
    
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/monitoring.metricWriter \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[kube-system/gke-metrics-agent]"
    

    Replace PROJECT_ID with the ID of the target Google Cloud project.
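
The three bindings in step 2 follow one pattern: each service account in the kube-system namespace receives one role. As a minimal sketch, assuming you prefer a loop to three separate commands, the script below prints the equivalent commands and runs them only when the APPLY variable is set to 1. The names wi_member, grant_roles, and APPLY are illustrative and are not part of the product.

```shell
#!/usr/bin/env bash
# Sketch only: wi_member, grant_roles, and APPLY are illustrative names.

# Build the member string used in step 2 for one kube-system service account.
wi_member() {
  local project_id="$1" ksa="$2"
  printf 'serviceAccount:%s.svc.id.goog[kube-system/%s]\n' "$project_id" "$ksa"
}

# Print (or, with APPLY=1, run) one binding per role/service-account pair.
grant_roles() {
  local project_id="$1" pair role ksa
  for pair in \
    "roles/opsconfigmonitoring.resourceMetadata.writer:metadata-agent" \
    "roles/logging.logWriter:stackdriver-log-forwarder" \
    "roles/monitoring.metricWriter:gke-metrics-agent"; do
    role="${pair%%:*}"   # text before the colon
    ksa="${pair##*:}"    # text after the colon
    if [ "${APPLY:-0}" = "1" ]; then
      gcloud projects add-iam-policy-binding "$project_id" \
        --role "$role" --member "$(wi_member "$project_id" "$ksa")"
    else
      printf 'gcloud projects add-iam-policy-binding %s --role %s --member "%s"\n' \
        "$project_id" "$role" "$(wi_member "$project_id" "$ksa")"
    fi
  done
}
```

Running grant_roles PROJECT_ID without APPLY gives a dry run that you can review before applying the bindings.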

Logs

This section lists the Cloud Logging resource types that Gemini on GDC connected API supports. To view Gemini on GDC connected API logs, use the Logs Explorer in the Google Cloud console. Logging for Gemini on GDC connected API is always enabled.

The logged resource type for Gemini on GDC connected API is aiplatform.googleapis.com/Endpoint.

You can also capture and retrieve Gemini on GDC connected API logs by using the Cloud Logging API. For information about how to configure this logging mechanism, see the Cloud Logging client libraries documentation.
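
For a quick check from the command line, the resource type named in this section can be used as a Cloud Logging filter. The sketch below assumes that the gcloud CLI is installed and authenticated; the function names and the default limit of 10 entries are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: function names and defaults are illustrative.

# Logging query that selects entries for the resource type named in this section.
endpoint_log_filter() {
  printf 'resource.type="aiplatform.googleapis.com/Endpoint"\n'
}

# Read the most recent matching entries from the given project as JSON.
read_endpoint_logs() {
  local project_id="$1" limit="${2:-10}"
  gcloud logging read "$(endpoint_log_filter)" \
    --project "$project_id" --limit "$limit" --format json
}
```

For example, read_endpoint_logs PROJECT_ID 5 would request up to five entries. The same filter string can be pasted into the Logs Explorer query field.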

Metrics

This section lists the Cloud Monitoring metrics that Gemini on GDC connected API supports. To view Gemini on GDC connected API metrics, use the Metrics Explorer in the Google Cloud console.

Distributed Cloud connected cluster metrics

Gemini on GDC connected API endpoints are deployed on Distributed Cloud connected clusters. For information about logs and metrics for Distributed Cloud connected, see Logs and metrics.

Inference Gateway metrics

| Prometheus metric name | Metric type | Data type | Labels | Chemist type | Chemist metric_kind | Chemist value_type | Chemist labels |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ig_ops_successful_incoming_requests | Counter | | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/successful_requests | CUMULATIVE | INT64 | model |
| ig_ops_unique_users | Counter | | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/unique_users | CUMULATIVE | INT64 | model |
| ig_tokens_per_minute | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/tokens_per_min | CUMULATIVE | DISTRIBUTION | model |
| ig_total_response_time | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/response_time | CUMULATIVE | DISTRIBUTION | model |
| ig_ops_ffmpeg_image_latency | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/ffmpeg_image_latencies | CUMULATIVE | DISTRIBUTION | model |
| ig_ops_ffmpeg_video_latency | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/ffmpeg_video_latencies | CUMULATIVE | DISTRIBUTION | model |
| ig_ops_ffmpeg_audio_latency | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/ffmpeg_audio_latencies | CUMULATIVE | DISTRIBUTION | model |
| ig_time_to_first_token | Histogram | double | model, context_window | aiplatform.googleapis.com/prediction/internal/gdc/ig/ttft | CUMULATIVE | DISTRIBUTION | model, context_window |
| ig_time_per_output_token | Histogram | double | model, context_window | aiplatform.googleapis.com/prediction/internal/gdc/ig/tpot | CUMULATIVE | DISTRIBUTION | model, context_window |
| ig_cache_hit | Counter | | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/cache_hit_count | CUMULATIVE | DISTRIBUTION | model, _gdch_project |
| ig_cache_miss | Counter | | model | aiplatform.googleapis.com/prediction/internal/gdc/ig/cache_miss_count | CUMULATIVE | DISTRIBUTION | model, _gdch_project |
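
Besides the Metrics Explorer, time series can be requested from the Cloud Monitoring API (timeSeries.list, v3). The sketch below assumes that the Chemist type listed above is the metric type that Cloud Monitoring exposes to your project, which this document does not state explicitly, and that the gcloud CLI is authenticated. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative. Assumes the Chemist type is
# the metric type that Cloud Monitoring exposes.

IG_PREFIX="aiplatform.googleapis.com/prediction/internal/gdc/ig"

# Build a Monitoring filter for one Inference Gateway metric, e.g. response_time.
ig_metric_filter() {
  printf 'metric.type="%s/%s"\n' "$IG_PREFIX" "$1"
}

# List time series between two RFC 3339 timestamps.
list_ig_series() {
  local project_id="$1" metric="$2" start="$3" end="$4"
  curl -sS -G "https://monitoring.googleapis.com/v3/projects/${project_id}/timeSeries" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    --data-urlencode "filter=$(ig_metric_filter "$metric")" \
    --data-urlencode "interval.startTime=${start}" \
    --data-urlencode "interval.endTime=${end}"
}
```

A call such as list_ig_series PROJECT_ID response_time 2025-01-01T00:00:00Z 2025-01-01T01:00:00Z would request one hour of data; the timestamps here are placeholders.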

GenAI Router metrics

| Prometheus metric name | Metric type | Data type | Labels | Chemist type | Chemist metric_kind | Chemist value_type | Chemist labels |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llm_total_request_latency_milliseconds | Histogram | double | context_window, model | aiplatform.googleapis.com/prediction/internal/gdc/gair/total_request_latencies | CUMULATIVE | DISTRIBUTION | context_window, model |
| llm_unary_request_latency_milliseconds | Histogram | double | context_window, model | aiplatform.googleapis.com/prediction/internal/gdc/gair/unary_request_latencies | CUMULATIVE | DISTRIBUTION | context_window, model |
| llm_streaming_ttft_milliseconds | Histogram | double | context_window, model | aiplatform.googleapis.com/prediction/internal/gdc/gair/ttft_ms | CUMULATIVE | DISTRIBUTION | context_window, model |
| llm_streaming_tpot_milliseconds | Histogram | double | context_window, model | aiplatform.googleapis.com/prediction/internal/gdc/gair/tpot_ms | CUMULATIVE | DISTRIBUTION | context_window, model |
| llm_input_token_count | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/gair/input_token_count | CUMULATIVE | DISTRIBUTION | model |
| llm_output_token_count | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/gair/output_token_count | CUMULATIVE | DISTRIBUTION | model |
| llm_success_response_count | Counter | double | model | aiplatform.googleapis.com/prediction/internal/gdc/gair/success_response_count | CUMULATIVE | INT64 | model |
| llm_failure_response_count | Counter | double | model | aiplatform.googleapis.com/prediction/internal/gdc/gair/failure_response_count | CUMULATIVE | INT64 | model |
| llm_text_tokenization_latency_milliseconds | Histogram | double | model | aiplatform.googleapis.com/prediction/internal/gdc/gair/text_tokenization_latencies | CUMULATIVE | DISTRIBUTION | model |
| llm_image_tokenization_latency_milliseconds | Histogram | double | | aiplatform.googleapis.com/prediction/internal/gdc/gair/image_tokenization_latencies | CUMULATIVE | DISTRIBUTION | |
| llm_audio_tokenization_latency_milliseconds | Histogram | double | | aiplatform.googleapis.com/prediction/internal/gdc/gair/audio_tokenization_latencies | CUMULATIVE | DISTRIBUTION | |
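
The two response counters above can be combined into a success ratio, success / (success + failure). The sketch below shows only the arithmetic, using made-up counts; it assumes that both inputs are counter increases over the same time window, and the function name success_ratio is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: success_ratio is an illustrative helper. Inputs are the
# increases of llm_success_response_count and llm_failure_response_count
# over the same window (hypothetical values below).

# Print success / (success + failure) with four decimals, or "n/a" if no traffic.
success_ratio() {
  awk -v s="$1" -v f="$2" 'BEGIN {
    total = s + f
    if (total == 0) { print "n/a"; exit }
    printf "%.4f\n", s / total
  }'
}

success_ratio 950 50   # prints 0.9500
```

Because both metrics are CUMULATIVE, the raw counter values only grow; a ratio is meaningful for the difference between two readings, not for a single reading.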

GPU metrics

| Prometheus metric name | Metric type | Data type | Labels | Chemist type | Chemist metric_kind | Chemist value_type | Chemist labels |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DCGM_FI_DEV_MEM_COPY_UTIL | Gauge | int64 | gpu, UUID, pci_bus_id, device, modelName, Hostname, DCGM_FI_DRIVER_VERSION | aiplatform.googleapis.com/prediction/internal/gdc/gpu/memory_util | GAUGE | INT64 | uuid, gpu_model |
| DCGM_FI_DEV_MEMORY_TEMP | Gauge | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/memory_temp | GAUGE | INT64 | Same as above |
| DCGM_FI_DEV_POWER_USAGE | Gauge | double | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/power_usage | GAUGE | DOUBLE | Same as above |
| DCGM_FI_DEV_GPU_TEMP | Gauge | double | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/gpu_temp | GAUGE | INT64 | Same as above |
| DCGM_FI_DEV_GPU_UTIL | Gauge | double | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/gpu_util | GAUGE | INT64 | Same as above |
| DCGM_FI_DEV_ENC_UTIL | Gauge | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/encode_util | GAUGE | INT64 | Same as above |
| DCGM_FI_DEV_XID_ERRORS | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/xid_errors | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_POWER_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_power | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_THERMAL_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_thermal | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_SYNC_BOOST_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_sync_boost | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_BOARD_LIMIT_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_board_limit | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_LOW_UTIL_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_low_util | CUMULATIVE | INT64 | Same as above |
| DCGM_FI_DEV_RELIABILITY_VIOLATION | Counter | int64 | Same as above | aiplatform.googleapis.com/prediction/internal/gdc/gpu/violation_reliability | CUMULATIVE | INT64 | Same as above |