Deploy and use the collector

This document describes how to deploy the OpenTelemetry Collector, configure the Collector to use the otlphttp exporter and the Telemetry (OTLP) API, and run a telemetry generator to write metrics to Cloud Monitoring. You can then view these metrics in Cloud Monitoring.

If you use Google Kubernetes Engine, then you can follow Managed OpenTelemetry for GKE instead of manually deploying and configuring an OpenTelemetry Collector that uses the Telemetry API.

If you use an SDK to send metrics from your application directly to the Telemetry API, see Use an SDK to send metrics from an application for additional information and examples.

You can also use the OpenTelemetry Collector and the Telemetry API together with OpenTelemetry zero-code instrumentation. For more information, see Use OpenTelemetry zero-code instrumentation for Java.

Before you begin

This section describes how to set up your environment to deploy and use the collector.

Select or create a Google Cloud project

Select a Google Cloud project for this guide. If you don't already have a Google Cloud project, create one:

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.


Install command-line tools

This document uses the following command-line tools:

  • gcloud
  • kubectl

The gcloud and kubectl tools are part of the Google Cloud CLI. For information about installing them, see Managing Google Cloud CLI components. To see which gcloud CLI components you have installed, run the following command:

gcloud components list

To configure the gcloud CLI for use, run the following commands:

gcloud auth login
gcloud config set project PROJECT_ID

Enable APIs

Enable the Cloud Monitoring API and the Telemetry API in your Google Cloud project. Pay particular attention to the Telemetry API, telemetry.googleapis.com; this document might be the first time you've encountered this API.

Enable the APIs by running the following commands:

gcloud services enable monitoring.googleapis.com
gcloud services enable telemetry.googleapis.com
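
The two services can also be enabled in a single invocation, because gcloud services enable accepts multiple service names. A minimal sketch (echo is included here as a dry run; remove it to actually enable the services):

```shell
# Combine the two enable calls into one; echo makes this a dry run.
SERVICES="monitoring.googleapis.com telemetry.googleapis.com"
echo gcloud services enable $SERVICES
```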

Create a cluster

Create a GKE cluster.

  1. Create a Google Kubernetes Engine cluster named otlp-test by running the following command:

    gcloud container clusters create-auto --location CLUSTER_LOCATION otlp-test --project PROJECT_ID
    
  2. After the cluster is created, connect to it by running the following command:

    gcloud container clusters get-credentials otlp-test --region CLUSTER_LOCATION --project PROJECT_ID
    

Authorize the Kubernetes service account

The following commands grant the required Identity and Access Management (IAM) roles to the Kubernetes service account. These commands assume that you are using Workload Identity Federation for GKE:

export PROJECT_NUMBER=$(gcloud projects describe PROJECT_ID --format="value(projectNumber)")

gcloud projects add-iam-policy-binding projects/PROJECT_ID \
  --role=roles/logging.logWriter \
  --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/opentelemetry/sa/opentelemetry-collector \
  --condition=None

gcloud projects add-iam-policy-binding projects/PROJECT_ID \
  --role=roles/monitoring.metricWriter \
  --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/opentelemetry/sa/opentelemetry-collector \
  --condition=None

gcloud projects add-iam-policy-binding projects/PROJECT_ID \
  --role=roles/telemetry.tracesWriter \
  --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/opentelemetry/sa/opentelemetry-collector \
  --condition=None

If your service account has a different format, then you can use the commands in the Google Cloud Managed Service for Prometheus documentation to authorize the service account, with the following changes:

  • Replace the service account name gmp-test-sa with your service account.
  • Grant the roles shown in the preceding set of commands, not only the roles/monitoring.metricWriter role.
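
For example, for a service account in the serviceAccount: email format, the adapted commands might look like the following sketch. The project and service-account names here are hypothetical placeholders, and echo makes this a dry run; verify the exact member syntax against the Managed Service for Prometheus documentation.

```shell
# Hypothetical placeholders: replace with your project ID and service account.
PROJECT_ID="my-project-id"
SA_EMAIL="my-collector-sa@${PROJECT_ID}.iam.gserviceaccount.com"

# Grant all three roles from the preceding set of commands, not only
# roles/monitoring.metricWriter. echo makes this a dry run; remove it to apply.
for ROLE in roles/logging.logWriter roles/monitoring.metricWriter roles/telemetry.tracesWriter; do
  echo gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member "serviceAccount:${SA_EMAIL}" \
    --role "$ROLE"
done
```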

Deploy OpenTelemetry Collector

Create the collector configuration by making a copy of the following YAML file and placing it in a file named collector.yaml. You can also find this configuration on GitHub in the otlp-k8s-ingest repository.

In your copy, be sure to replace occurrences of ${GOOGLE_CLOUD_PROJECT} with your project ID, PROJECT_ID.

OTLP for Prometheus metrics works only when using OpenTelemetry Collector version 0.140.0 or later.

# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

exporters:
  # The googlecloud exporter is used for logs
  googlecloud:
    log:
      default_log_name: opentelemetry-collector
    user_agent: Google-Cloud-OTLP manifests:0.4.0 OpenTelemetry Collector Built By Google/0.128.0 (linux/amd64)
  googlemanagedprometheus:
    user_agent: Google-Cloud-OTLP manifests:0.4.0 OpenTelemetry Collector Built By Google/0.128.0 (linux/amd64)
  # The otlphttp exporter is used to send traces to Google Cloud Trace and
  # metrics to Google Managed Prometheus using OTLP http/proto.
  # The otlp exporter could also be used to send them using OTLP grpc
  otlphttp:
    encoding: proto
    endpoint: https://telemetry.googleapis.com
    # Use the googleclientauth extension to authenticate with Google credentials
    auth:
      authenticator: googleclientauth


extensions:
  # Standard for the collector. Used for probes.
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
  # This is an auth extension that adds Google Application Default Credentials to http and gRPC requests.
  googleclientauth:


processors:
  # This filter is a standard part of handling the collector's self-observability metrics. Not related to OTLP ingestion.
  filter/self-metrics:
    metrics:
      include:
        match_type: strict
        metric_names:
        - otelcol_process_uptime
        - otelcol_process_memory_rss
        - otelcol_grpc_io_client_completed_rpcs
        - otelcol_googlecloudmonitoring_point_count

  # The recommended batch size for the OTLP endpoint is 200 metric data points.
  batch:
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  # The k8sattributes processor adds k8s resource attributes to metrics based on the source IP that sent the metrics to the collector.
  # k8s attributes are important for avoiding errors from timeseries "collisions".
  # These attributes help distinguish workloads from each other, and provide useful metadata (e.g. namespace) when querying.
  k8sattributes:
    extract:
      metadata:
      - k8s.namespace.name
      - k8s.deployment.name
      - k8s.statefulset.name
      - k8s.daemonset.name
      - k8s.cronjob.name
      - k8s.job.name
      - k8s.replicaset.name
      - k8s.node.name
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.pod.start_time
    passthrough: false
    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

  # Standard processor for gracefully degrading when overloaded to prevent OOM.
  memory_limiter:
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

  # Standard processor for enriching self-observability metrics. Unrelated to OTLP ingestion.
  metricstransform/self-metrics:
    transforms:
    - action: update
      include: otelcol_process_uptime
      operations:
      - action: add_label
        new_label: version
        new_value: Google-Cloud-OTLP manifests:0.4.0 OpenTelemetry Collector Built By Google/0.128.0 (linux/amd64)

  # The resourcedetection processor, similar to the k8sattributes processor, enriches metrics with important metadata.
  # The gcp detector provides the cluster name and cluster location.
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

  # This transform processor avoids ingestion errors if metrics contain attributes with names that are reserved for the prometheus_target resource.
  transform/collision:
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  # The relative ordering of statements between ReplicaSet & Deployment and Job & CronJob are important.
  # The ordering of these controllers is decided based on the k8s controller documentation available at
  # https://kubernetes.io/docs/concepts/workloads/controllers.
  # The relative ordering of the other controllers in this list is inconsequential since they directly
  # create pods.
  transform/aco-gke:
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["top_level_controller_type"], "ReplicaSet") where resource.attributes["k8s.replicaset.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.replicaset.name"]) where resource.attributes["k8s.replicaset.name"] != nil
      - set(attributes["top_level_controller_type"], "Deployment") where resource.attributes["k8s.deployment.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.deployment.name"]) where resource.attributes["k8s.deployment.name"] != nil
      - set(attributes["top_level_controller_type"], "DaemonSet") where resource.attributes["k8s.daemonset.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.daemonset.name"]) where resource.attributes["k8s.daemonset.name"] != nil
      - set(attributes["top_level_controller_type"], "StatefulSet") where resource.attributes["k8s.statefulset.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.statefulset.name"]) where resource.attributes["k8s.statefulset.name"] != nil
      - set(attributes["top_level_controller_type"], "Job") where resource.attributes["k8s.job.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.job.name"]) where resource.attributes["k8s.job.name"] != nil
      - set(attributes["top_level_controller_type"], "CronJob") where resource.attributes["k8s.cronjob.name"] != nil
      - set(attributes["top_level_controller_name"], resource.attributes["k8s.cronjob.name"]) where resource.attributes["k8s.cronjob.name"] != nil
  # For each Prometheus unknown-typed metric, which is a gauge, create a counter that is an exact copy of this metric.
  # The GCP OTLP endpoint will add the appropriate suffixes for the counter and gauge.
  transform/unknown-counter:
    metric_statements:
    - context: metric
      statements:
      # Copy the unknown metric, but add a suffix so we can distinguish the copy from the original.
      - copy_metric(Concat([metric.name, "unknowncounter"], ":")) where metric.metadata["prometheus.type"] == "unknown" and not HasSuffix(metric.name, ":unknowncounter")
      # Change the copy to a monotonic, cumulative sum.
      - convert_gauge_to_sum("cumulative", true) where HasSuffix(metric.name, ":unknowncounter")
      # Delete the extra suffix once we are done.
      - set(metric.name, Substring(metric.name, 0, Len(metric.name)-Len(":unknowncounter"))) where HasSuffix(metric.name, ":unknowncounter")

  # When sending telemetry to the GCP OTLP endpoint, the gcp.project_id resource attribute is required to be set to your project ID.
  resource/gcp_project_id:
    attributes:
    - key: gcp.project_id
      # MAKE SURE YOU REPLACE THIS WITH YOUR PROJECT ID
      value: ${GOOGLE_CLOUD_PROJECT}
      action: insert
  # The metricstarttime processor is important to include if you are using the prometheus receiver to ensure the start time is set properly.
  # It is a no-op otherwise.
  metricstarttime:
    strategy: subtract_initial_point

receivers:
  # This collector is configured to accept OTLP metrics, logs, and traces, and is designed to receive OTLP from workloads running in the cluster.
  otlp:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:4317
      http:
        cors:
          allowed_origins:
          - http://*
          - https://*
        endpoint: ${env:MY_POD_IP}:4318

  # Push the collector's own self-observability metrics to the otlp receiver.
  otlp/self-metrics:
    protocols:
      grpc:
        endpoint: ${env:MY_POD_IP}:14317

service:
  extensions:
  - health_check
  - googleclientauth
  pipelines:
    # Receive OTLP logs, and export logs using the googlecloud exporter.
    logs:
      exporters:
      - googlecloud
      processors:
      - k8sattributes
      - resourcedetection
      - memory_limiter
      - batch
      receivers:
      - otlp
    # Receive OTLP metrics, and export metrics to GMP using the otlphttp exporter.
    metrics/otlp:
      exporters:
      - otlphttp
      processors:
      - k8sattributes
      - memory_limiter
      - resource/gcp_project_id
      - resourcedetection
      - transform/collision
      - transform/aco-gke
      - transform/unknown-counter
      - metricstarttime
      - batch
      receivers:
      - otlp
    # Scrape self-observability Prometheus metrics, and export metrics to GMP using the otlphttp exporter.
    metrics/self-metrics:
      exporters:
      - otlphttp
      processors:
      - filter/self-metrics
      - metricstransform/self-metrics
      - k8sattributes
      - memory_limiter
      - resource/gcp_project_id
      - resourcedetection
      - batch
      receivers:
      - otlp/self-metrics
    # Receive OTLP traces, and export traces using the otlphttp exporter.
    traces:
      exporters:
      - otlphttp
      processors:
      - k8sattributes
      - memory_limiter
      - resource/gcp_project_id
      - resourcedetection
      - batch
      receivers:
      - otlp
  telemetry:
    logs:
      encoding: json
    metrics:
      readers:
      - periodic:
          exporter:
            otlp:
              protocol: grpc
              endpoint: ${env:MY_POD_IP}:14317
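
One way to perform the ${GOOGLE_CLOUD_PROJECT} substitution described earlier is with sed. The following sketch assumes GNU sed and uses a hypothetical project ID; it is demonstrated here on a one-line sample file, but you would point the sed command at your saved collector.yaml instead.

```shell
# Hypothetical project ID; replace with your own.
PROJECT_ID="my-project-id"

# Demonstrated on a sample file; run the sed command against collector.yaml.
printf 'value: ${GOOGLE_CLOUD_PROJECT}\n' > /tmp/collector-sample.yaml
sed -i "s/\${GOOGLE_CLOUD_PROJECT}/${PROJECT_ID}/g" /tmp/collector-sample.yaml
cat /tmp/collector-sample.yaml   # prints: value: my-project-id
```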

Configure the deployed OpenTelemetry Collector

Configure the collector deployment by creating Kubernetes resources.

  1. Create the opentelemetry namespace and create the collector configuration in that namespace by running the following commands:

    kubectl create namespace opentelemetry
    
    kubectl create configmap collector-config -n opentelemetry --from-file=collector.yaml
    
  2. Configure the collector with Kubernetes resources by running the following commands:

    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/otlpmetric/k8s/base/2_rbac.yaml
    
    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/otlpmetric/k8s/base/3_service.yaml
    
    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/otlpmetric/k8s/base/4_deployment.yaml
    
    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/otlpmetric/k8s/base/5_hpa.yaml
    
  3. Wait for the collector pod to reach the "Running" state with 1/1 containers ready. This takes about three minutes on Autopilot if this is the first workload you've deployed. To check the pod, use the following command:

    kubectl get po -n opentelemetry -w
    

    To stop watching the pod status, enter Ctrl-C to stop the command.

  4. You can also check the collector logs to make sure there are no obvious errors:

    kubectl logs -n opentelemetry deployment/opentelemetry-collector
    
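
The four kubectl apply commands in step 2 can also be scripted as a loop over the manifest names (same URLs as above; echo makes this a dry run, remove it to apply):

```shell
# Apply the four collector manifests in order; echo makes this a dry run.
BASE=https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/otlpmetric/k8s/base
for MANIFEST in 2_rbac.yaml 3_service.yaml 4_deployment.yaml 5_hpa.yaml; do
  echo kubectl apply -f "$BASE/$MANIFEST"
done
```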

Deploy the telemetry generator

You can test the configuration by using the open source telemetrygen tool. This application generates telemetry and sends it to the collector.

  1. To deploy the telemetrygen application in the opentelemetry-demo namespace, run the following command:

    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/otlp-k8s-ingest/refs/heads/main/sample/app.yaml
    
  2. After you create the deployment, it might take a moment for the pod to be created and start running. To check the pod status, run the following command:

    kubectl get po -n opentelemetry-demo -w
    

    To stop watching the pod status, enter Ctrl-C to stop the command.

Query the metric by using Metrics Explorer

The telemetrygen tool writes to a metric called gen. You can query this metric from both the query-builder interface and the PromQL query editor in Metrics Explorer.

In the Google Cloud console, go to the Metrics explorer page:

Go to Metrics explorer

If you use the search bar to find this page, then select the result whose subheading is Monitoring.

  • If you use the Metrics Explorer query-builder interface, then the full metric name is prometheus.googleapis.com/gen/gauge.
  • If you use the PromQL query editor, then you can query the metric by using the name gen.

The following image shows a chart of the gen metric in Metrics Explorer:

Chart showing the gen metric, which is ingested by the otlphttp exporter.

Delete the cluster

After you verify the deployment by querying the metric, you can delete the cluster. To delete the cluster, run the following command:

gcloud container clusters delete --location CLUSTER_LOCATION otlp-test --project PROJECT_ID
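
If you also want to remove the IAM bindings granted earlier, gcloud projects remove-iam-policy-binding reverses them. The following sketch uses hypothetical project values and assumes the same Workload Identity Federation member format as above (echo makes this a dry run; remove it to apply):

```shell
# Hypothetical values; use your real project ID and number.
PROJECT_ID="my-project-id"
PROJECT_NUMBER="123456789012"  # gcloud projects describe PROJECT_ID --format="value(projectNumber)"
MEMBER="principal://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${PROJECT_ID}.svc.id.goog/subject/ns/opentelemetry/sa/opentelemetry-collector"

# Remove each role binding granted during setup; echo makes this a dry run.
for ROLE in roles/logging.logWriter roles/monitoring.metricWriter roles/telemetry.tracesWriter; do
  echo gcloud projects remove-iam-policy-binding "projects/${PROJECT_ID}" \
    --role "$ROLE" --member "$MEMBER" --condition=None
done
```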

What's next