"Managed Service for Apache Spark" is the new name for the product formerly known as "Dataproc on Compute Engine" (cluster deployment) and "Google Cloud Serverless for Apache Spark" (serverless deployment).

Create node pools

When you create or update a Managed Service for Apache Spark on GKE virtual cluster, you specify one or more node pools that the virtual cluster uses to run jobs (the cluster is said to "use" or be "associated with" the specified node pools). If a specified node pool does not exist on your GKE cluster, Managed Service for Apache Spark on GKE creates it on the GKE cluster with the settings you specify. If the node pool exists and was created by Managed Service for Apache Spark, it is validated to confirm that its settings match the specified settings.
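To see in advance which of these two paths applies, you can check whether a node pool already exists on the GKE cluster. The following is a minimal sketch, assuming a regional GKE cluster and the standard `gcloud container node-pools describe` command. The function name `node_pool_exists` and its argument order are hypothetical, and the check only reports existence, not whether the pool's settings match.

```shell
# Hypothetical helper: succeeds if the named node pool exists on the GKE cluster.
# Arguments: <node-pool> <gke-cluster> <region>
# Assumes a regional GKE cluster; a zonal cluster would need --zone instead.
node_pool_exists() {
  local pool="$1" cluster="$2" region="$3"
  gcloud container node-pools describe "${pool}" \
    --cluster="${cluster}" \
    --region="${region}" \
    --format="value(name)" >/dev/null 2>&1
}

# Example use (variable names taken from the creation example in this page):
#   if node_pool_exists "${DP_POOLNAME}" "${GKE_CLUSTER}" "${REGION}"; then
#     echo "Node pool exists; its settings will be validated."
#   else
#     echo "Node pool is missing; it will be created."
#   fi
```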

Managed Service for Apache Spark on GKE node pool settings

You can specify the following settings on node pools used by your Managed Service for Apache Spark on GKE virtual clusters (these settings are a subset of the GKE node pool settings):

accelerators
acceleratorCount
acceleratorType
gpuPartitionSize*
localSsdCount
machineType
minCpuPlatform
minNodeCount
maxNodeCount
preemptible
spot*

Notes:

* gpuPartitionSize can be set in the Managed Service for Apache Spark API GkeNodePoolAcceleratorConfig.
* spot can be set in the Managed Service for Apache Spark API GkeNodeConfig.
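As an illustration of where these two API-only settings sit, the sketch below prints a JSON fragment for a single node pool target. The field names come from the settings list above. The nesting (`nodePoolConfig`, `config`, `autoscaling`), the `roles` value, and the node pool path format reflect my understanding of the API and should be checked against the API reference. The function name `print_node_pool_target` and all sample values are hypothetical.

```shell
# Hypothetical helper: prints an illustrative node pool target as JSON.
# Argument: <node-pool-resource-path>
# The object nesting is an assumption; verify it against the API reference.
print_node_pool_target() {
  local pool_path="$1"
  cat <<EOF
{
  "nodePool": "${pool_path}",
  "roles": ["SPARK_EXECUTOR"],
  "nodePoolConfig": {
    "config": {
      "machineType": "a2-highgpu-1g",
      "spot": true,
      "accelerators": [
        {
          "acceleratorCount": "1",
          "acceleratorType": "nvidia-tesla-a100",
          "gpuPartitionSize": "1g.5gb"
        }
      ]
    },
    "autoscaling": {
      "minNodeCount": 1,
      "maxNodeCount": 10
    }
  }
}
EOF
}

# Example use (placeholder path segments):
#   print_node_pool_target "projects/PROJECT/locations/REGION/clusters/GKE_CLUSTER/nodePools/POOL"
```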

Node pool deletion

When a Managed Service for Apache Spark on GKE cluster is deleted, the node pools used by the cluster are not deleted. See Delete a node pool to delete node pools that are no longer used by Managed Service for Apache Spark on GKE clusters.
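Because the node pools outlive the virtual cluster, cleanup is a separate GKE operation. The following is a minimal sketch, assuming a regional GKE cluster and the standard `gcloud container node-pools delete` command. The function name `delete_unused_node_pool` is hypothetical, and the helper does not check whether another virtual cluster still uses the pool.

```shell
# Hypothetical helper: deletes a node pool from the GKE cluster.
# Arguments: <node-pool> <gke-cluster> <region>
# It does NOT verify that the pool is unused; confirm that before calling it.
delete_unused_node_pool() {
  local pool="$1" cluster="$2" region="$3"
  gcloud container node-pools delete "${pool}" \
    --cluster="${cluster}" \
    --region="${region}"
}

# Example use (variable names taken from the creation example in this page):
#   delete_unused_node_pool "${DP_EXEC_POOLNAME}" "${GKE_CLUSTER}" "${REGION}"
```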

Node pool location

You can specify the zones of the node pools associated with your Managed Service for Apache Spark on GKE virtual cluster when you create or update the virtual cluster. The node pool zones must be located in the region of the associated virtual cluster.
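The zone constraint can be checked locally before a create or update request is sent. The sketch below assumes the usual Google Cloud naming convention in which a zone name is its region name followed by a hyphen and a suffix (for example, `us-central1-a` in `us-central1`). The function name `zone_in_region` is hypothetical.

```shell
# Hypothetical helper: succeeds if <zone> belongs to <region>.
# Assumes zone names have the form "<region>-<suffix>".
zone_in_region() {
  local zone="$1" region="$2"
  # Strip the final "-<suffix>" from the zone and compare with the region.
  [ "${zone%-*}" = "${region}" ]
}

# Example use:
#   zone_in_region "us-central1-a" "us-central1" && echo "zone is valid for this region"
```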

Role to node pool mapping

Node pool roles are defined for Spark driver and executor work, and a default role covers all types of work on a node pool. A Managed Service for Apache Spark on GKE cluster must have at least one node pool that is assigned the default role. Assigning other roles is optional.

Recommendation: Create a separate node pool for each role type, with the node type and size based on the role's requirements.

gcloud CLI virtual cluster creation example:

gcloud dataproc clusters gke create "${DP_CLUSTER}" \
  --region=${REGION} \
  --gke-cluster=${GKE_CLUSTER} \
  --spark-engine-version=latest \
  --staging-bucket=${BUCKET} \
  --pools="name=${DP_POOLNAME},roles=default" \
  --setup-workload-identity \
  --pools="name=${DP_CTRL_POOLNAME},roles=default,machineType=e2-standard-4" \
  --pools="name=${DP_DRIVER_POOLNAME},min=1,max=3,roles=spark-driver,machineType=n2-standard-4" \
  --pools="name=${DP_EXEC_POOLNAME},min=1,max=10,roles=spark-executor,machineType=n2-standard-8"
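The requirement that at least one node pool carries the default role can be checked on the `--pools` values before the command runs. The following is a minimal sketch, assuming each pool spec is a comma-separated list of `key=value` pairs with a single role per pool, as in the example above. The function name `has_default_pool` is hypothetical.

```shell
# Hypothetical helper: succeeds if at least one pool spec assigns the default role.
# Arguments: one or more --pools values, e.g. "name=pool-a,roles=default".
# Assumes a single role per pool; multi-role specs are not handled.
has_default_pool() {
  local spec
  for spec in "$@"; do
    # Surround the spec with commas so "roles=default" matches as a whole pair.
    case ",${spec}," in
      *,roles=default,*) return 0 ;;
    esac
  done
  return 1
}

# Example use:
#   has_default_pool \
#     "name=ctrl,roles=default,machineType=e2-standard-4" \
#     "name=driver,min=1,max=3,roles=spark-driver,machineType=n2-standard-4" \
#     || echo "No node pool has the default role" >&2
```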
