Use the schedmd-slurm-gcp-v6-nodeset-dynamic module to create an instance
template and a nodeset data structure for dynamic compute nodes. This module
works in conjunction with the schedmd-slurm-gcp-v6-partition module to let
your Slurm cluster provision resources on demand.
For the complete list of inputs and outputs, see the slurm-nodeset-dynamic
module in the Cluster Toolkit GitHub repository.
Before you begin
Before you begin, verify that you meet the following requirements:
- You have installed and configured Cluster Toolkit. For installation instructions, see Set up Cluster Toolkit.
- You have an existing cluster blueprint. You can use and modify an existing
  blueprint or create one from scratch. For a working example of a blueprint
  configured for the slurm-nodeset-dynamic module, see the
  examples/hpc-slurm.yaml file. For more information about creating and
  customizing blueprints, see Cluster blueprint.
- To view a complete list of blueprints that support the
  slurm-nodeset-dynamic module, go to the Cluster blueprint catalog page,
  click the Select scheduler menu, and then select Slurm.
Required roles
To get the permissions that you need to deploy the dynamic nodeset, ask your administrator to grant you the following IAM roles on your project:
- Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1)
- Service Account User (roles/iam.serviceAccountUser)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Create a dynamic nodeset
To create a dynamic nodeset, add the schedmd-slurm-gcp-v6-nodeset-dynamic
module to your blueprint. You must specify the machine type and connect the
nodeset to a partition and a controller.
The following example creates a dynamic nodeset and uses it to configure a partition. It also defines a managed instance group (MIG) that uses the instance template created by the dynamic nodeset module:
```yaml
  - id: dynamic_ns
    source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset-dynamic
    use: [network, controller]
    settings:
      machine_type: n2-standard-2

  - id: dynamic_partition
    source: community/modules/compute/schedmd-slurm-gcp-v6-partition
    use: [dynamic_ns]
    settings:
      partition_name: mp
      is_default: true

  - id: controller
    source: community/modules/scheduler/schedmd-slurm-gcp-v6-controller
    use: [network, dynamic_partition]

  - id: mig
    source: community/modules/compute/mig
    settings:
      versions:
        - name: v1
          instance_template: $(dynamic_ns.instance_template_self_link)
      base_instance_name: $(dynamic_ns.node_name_prefix)
```
Configure custom images
For information about how to create valid custom images for the node group virtual machine (VM) instances or for custom instance templates, see the documentation on Slurm on Google Cloud custom images on GitHub.
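As an illustrative sketch, a nodeset that uses a custom image typically points its settings at the image family and project. The family and project values below are placeholders, and the exact field names (`instance_image`, `instance_image_custom`) are assumptions based on the v6 nodeset modules; confirm them against the module's README before use:

```yaml
  - id: dynamic_ns
    source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset-dynamic
    use: [network, controller]
    settings:
      machine_type: n2-standard-2
      # Placeholder image family and project; replace with your own values.
      instance_image:
        family: my-slurm-image-family
        project: my-project-id
      # Acknowledges that the image is custom rather than a published Slurm image.
      instance_image_custom: true
```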
Configure GPU support
To learn more about GPU support in slurm-nodeset-dynamic and other
Cluster Toolkit modules, see the documentation on GPU
support
on GitHub.
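As a rough sketch, GPUs can be attached to nodes in a dynamic nodeset through the nodeset's settings. The accelerator type, count, and the `guest_accelerator` field name are assumptions drawn from the v6 nodeset modules and are shown for illustration only; verify them against the GPU support documentation:

```yaml
  - id: gpu_dynamic_ns
    source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset-dynamic
    use: [network, controller]
    settings:
      machine_type: n1-standard-8
      # Attach one NVIDIA T4 per node; type and count are illustrative.
      guest_accelerator:
        - type: nvidia-tesla-t4
          count: 1
```

Note that some machine series (such as A2 and G2) bundle GPUs with the machine type, in which case no separate accelerator setting is needed.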
What's next
- To add this nodeset to a partition, see Create a partition for a Slurm controller.
- For a complete list of all available input fields and output values, see the
  schedmd-slurm-gcp-v6-nodeset-dynamic module on GitHub.