Use the gke-persistent-volume module to create Kubernetes PersistentVolume
(PV) and PersistentVolumeClaim (PVC) resources. This module connects Cluster
Toolkit storage modules to the gke-job-template module.
This module supports the following storage backends:
- Filestore
- Cloud Storage
- Google Cloud Managed Lustre
For the complete list of inputs and outputs, see the gke-persistent-volume
module in the Cluster Toolkit GitHub repository.
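For orientation, the module renders one PV and one matching PVC per storage
volume. The following is a minimal sketch of what that pair can look like for
a Filestore (NFS) backend; the resource names, capacity, and server address
are illustrative assumptions, not the module's exact output.

# Hypothetical sketch of a statically provisioned PV/PVC pair for a
# Filestore (NFS) backend. All names and values are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: datafs-pv              # assumed name
spec:
  capacity:
    storage: 1Ti               # assumed size
  accessModes: [ReadWriteMany]
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.2           # assumed Filestore IP address
    path: /data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datafs-pvc             # assumed name
spec:
  accessModes: [ReadWriteMany]
  storageClassName: ""         # bind to the pre-provisioned PV, not a dynamic class
  volumeName: datafs-pv
  resources:
    requests:
      storage: 1Ti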
Before you begin
Before you begin, verify that you meet the following requirements:
- You have installed and configured Cluster Toolkit. For installation instructions, see Set up Cluster Toolkit.
- You have an existing cluster blueprint. You can use and modify an existing blueprint or create one from scratch. For a working example of a blueprint configured for the gke-persistent-volume module, see the examples/storage-gke.yaml file. For more information about creating and customizing blueprints, see Cluster blueprint.
- To view a complete list of blueprints that support the gke-persistent-volume module, go to the Cluster blueprint catalog page, click the Select scheduler menu, and then select GKE.
- The gke-persistent-volume module does not create a long-running workload or a full cluster. It defines Kubernetes PersistentVolume (PV) and PersistentVolumeClaim (PVC) resources that connect pre-provisioned storage volumes to your GKE cluster.
- The gke-persistent-volume module calls the Kubernetes API to create resources, so you must authorize the deployment machine to connect to the Kubernetes API. To do so, add the master_authorized_networks setting to your gke-cluster module configuration and include the IP address of the deployment machine in this block. If you don't know that address, see the lookup sketch after this list.
- Each gke-persistent-volume module defines a single file system. If you need multiple shared file systems, include multiple instances of this module.
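If the deployment machine's public IP address isn't known, one quick way to
look it up is to query a public IP echo service; ifconfig.me below is one
common choice, not a Cluster Toolkit requirement:

# Print the deployment machine's public IPv4 address, then use it,
# with a /32 suffix, as the cidr_block in master_authorized_networks.
curl -s ifconfig.me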
Required roles
To get the permissions that you need to create GKE persistent volumes and storage backends, ask your administrator to grant you the following IAM roles on your project:
- Kubernetes Engine Admin (roles/container.admin)
- Compute Network Admin (roles/compute.networkAdmin)
- If using Filestore: Cloud Filestore Editor (roles/file.editor)
- If using Cloud Storage: Storage Admin (roles/storage.admin)
- If using Google Cloud Managed Lustre: Managed Lustre Admin (roles/lustre.admin)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
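An administrator can grant these roles with the gcloud CLI. The following is
a sketch assuming a user principal; PROJECT_ID and USER_EMAIL are
placeholders:

# Grant the two always-required roles; repeat the pattern for the
# backend-specific role that matches your storage choice.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/container.admin"
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/compute.networkAdmin"

# Example of a backend-specific grant, here for Filestore:
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/file.editor"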
Use with Filestore
The following example creates a Filestore instance and uses the
gke-persistent-volume module to expose the instance as shared storage (mounted
at /data) to a GKE job.
- id: gke_cluster
  source: modules/scheduler/gke-cluster
  use: [network1]
  settings:
    master_authorized_networks:
    - display_name: deployment-machine
      cidr_block: IP_ADDRESS/32

- id: datafs
  source: modules/file-system/filestore
  use: [network1]
  settings:
    local_mount: /data

- id: datafs-pv
  source: modules/file-system/gke-persistent-volume
  use: [datafs, gke_cluster]

- id: job-template
  source: modules/compute/gke-job-template
  use: [datafs-pv, compute_pool, gke_cluster]
Replace IP_ADDRESS with the IP address of your
deployment machine.
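After you add these modules to a blueprint, deploy it with the Cluster
Toolkit gcluster binary and confirm that the resources exist. The commands
below follow the standard Toolkit workflow; BLUEPRINT_FILE, DEPLOYMENT_NAME,
CLUSTER_NAME, and REGION are placeholders:

# Create a deployment folder from the blueprint, then deploy it.
./gcluster create BLUEPRINT_FILE.yaml
./gcluster deploy DEPLOYMENT_NAME

# Fetch cluster credentials, then confirm that the PV exists and
# that the PVC reports a STATUS of Bound.
gcloud container clusters get-credentials CLUSTER_NAME --region REGION
kubectl get pv,pvc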
Use with Cloud Storage
The following example creates a Cloud Storage bucket and uses the
gke-persistent-volume module to expose the bucket to the job template.
- id: gke_cluster
  source: modules/scheduler/gke-cluster
  use: [network1]
  settings:
    master_authorized_networks:
    - display_name: deployment-machine
      cidr_block: IP_ADDRESS/32

- id: data-bucket
  source: modules/file-system/cloud-storage-bucket
  settings:
    local_mount: /data

- id: datagcs-pv
  source: modules/file-system/gke-persistent-volume
  use: [data-bucket, gke_cluster]

- id: job-template
  source: modules/compute/gke-job-template
  use: [datagcs-pv, compute_pool, gke_cluster]

Replace IP_ADDRESS with the IP address of your deployment machine.
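The gke-job-template module wires the claim into the job's pods for you. For
orientation, any pod can consume the resulting PVC with a standard Kubernetes
volume reference; the claim name and mount path below are assumptions
matching the example above:

# Hypothetical pod fragment consuming the PVC that the
# gke-persistent-volume module creates. Names are assumed.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls /data"]
    volumeMounts:
    - name: data
      mountPath: /data            # matches the module's local_mount
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: datagcs-pvc      # assumed claim name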
Use with Google Cloud Managed Lustre
The following example creates a Google Cloud Managed Lustre instance and uses the
gke-persistent-volume module to expose the instance to the job template.
- id: gke_cluster
  source: modules/scheduler/gke-cluster
  use: [network1]
  settings:
    master_authorized_networks:
    - display_name: deployment-machine
      cidr_block: IP_ADDRESS/32

- id: data-managedlustre
  source: modules/file-system/managed-lustre
  settings:
    local_mount: /data

- id: datalustre-pv
  source: modules/file-system/gke-persistent-volume
  use: [data-managedlustre, gke_cluster]

- id: job-template
  source: modules/compute/gke-job-template
  use: [datalustre-pv, compute_pool, gke_cluster]

Replace IP_ADDRESS with the IP address of your deployment machine.
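Before submitting jobs against the Lustre-backed volume, it can be worth
checking that the claim bound successfully; the claim name below is an
assumption:

# STATUS should be Bound; if it stays Pending, the Events section
# of the describe output usually explains why.
kubectl get pvc datalustre-pvc
kubectl describe pvc datalustre-pvc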
What's next
- To learn how to use these volumes in a job, see the gke-job-template module.
- For a complete list of input fields and output values, see the gke-persistent-volume module on GitHub.