Create a partition for a Slurm controller

Use this module to create a compute partition that you pass as input when you define the slurm-controller module. A compute partition defines a logical group of compute nodes for your Slurm cluster. This module works alongside the nodeset module to group compute resources into a single partition.

This module lets you configure partition-level settings such as job exclusivity, power management, timeouts, and default status.
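
For example, a partition module entry in a blueprint might look like the following sketch. The module IDs and setting values are illustrative, and each setting is described in detail in the sections that follow.

- id: compute_partition
  source: community/modules/compute/schedmd-slurm-gcp-v6-partition
  use: [nodeset_1]   # nodeset_1 is an illustrative ID for a slurm-nodeset module
  settings:
    partition_name: compute
    is_default: true   # make this the default partition for jobs that don't request one
    exclusive: true    # run one job per node at a time (the module default)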

For the complete list of inputs and outputs, see the schedmd-slurm-gcp-v6-partition module in the Cluster Toolkit GitHub repository.

Before you begin

Before you begin, verify that you meet the following requirements:

  • You have installed and configured Cluster Toolkit. For installation instructions, see Set up Cluster Toolkit.
  • You have an existing cluster blueprint. You can use and modify an existing blueprint or create one from scratch. For a working example of a blueprint configured for the slurm-partition module, see the examples/hpc-slurm.yaml file. For more information about creating and customizing blueprints, see Cluster blueprint.
  • To view a complete list of blueprints that support the slurm-partition module, go to the Cluster blueprint catalog page, click the Select scheduler menu and then select Slurm.
  • In your blueprint, you must have defined at least one slurm-nodeset module to include in the partition.
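
If you haven't defined a nodeset yet, a minimal definition might look like the following sketch. The module ID, machine type, and node count are illustrative; a complete example appears later on this page.

- id: nodeset_1
  source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset
  use: [network]   # network is an illustrative ID for the blueprint's VPC module
  settings:
    node_count_dynamic_max: 200
    machine_type: c2-standard-30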

Required roles

To get the permissions that you need to deploy the partition, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a compute partition

To create a compute partition, add the slurm-partition module to your blueprint and specify the nodeset modules that the partition contains.

The following example creates a partition module with the following configuration:

  • Adds two nodesets by using the use field:

    • The first nodeset uses the c2-standard-30 machine type.
    • The second nodeset uses the c2-standard-60 machine type.
    • Both nodesets specify a maximum of 200 dynamic nodes.
  • Sets the partition name to compute.

  • Connects each nodeset to the network module by using the use field.

  • Mounts homefs by using the use field, which connects the partition to the shared file system module that hosts user home directories.

- id: nodeset_1
  source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset
  use: [network]
  settings:
    name: c30
    node_count_dynamic_max: 200
    machine_type: c2-standard-30

- id: nodeset_2
  source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset
  use: [network]
  settings:
    name: c60
    node_count_dynamic_max: 200
    machine_type: c2-standard-60

- id: compute_partition
  source: community/modules/compute/schedmd-slurm-gcp-v6-partition
  use:
  - homefs
  - nodeset_1
  - nodeset_2
  settings:
    partition_name: compute

Set the default partition

You can set a specific partition as the default for jobs that don't explicitly request one. To do this, set the is_default setting to true.

  settings:
    partition_name: compute
    is_default: true

Configure job exclusivity

By default, the slurm-partition module configures nodes to execute a single job at a time by using the exclusive: true setting. When a job completes, the node is automatically deleted to ensure a clean environment for the next job.

To let multiple jobs share a node, set the exclusive setting to false:

  settings:
    partition_name: shared-compute
    exclusive: false

Configure node power management

Slurm on Google Cloud automatically suspends idle nodes to save costs. You can configure this behavior by using the following input settings:

  • suspend_time: controls how many seconds a node remains idle before it turns off. The default is 300 seconds (5 minutes). Set this to -1 to prevent nodes from suspending.
  • suspend_timeout: controls the maximum time allowed (in seconds) between when a request to suspend a node is issued and when the node fully shuts down.
  • resume_timeout: controls the maximum time allowed (in seconds) between when a request to start a node is issued and when the node is ready for use.

The following example configures these settings:

  settings:
    partition_name: compute
    suspend_time: 600 # Wait 10 minutes before scaling down
    suspend_timeout: 120
    resume_timeout: 300

What's next