Mount Cloud Storage buckets for specific jobs

This document explains how to mount Cloud Storage buckets on your cluster for the duration of a job.

You can mount a bucket automatically for the duration of a Slurm job. Mounting a bucket this way lets you read and write files in Cloud Storage directly, without permanently adding storage resources to your cluster. To add storage resources permanently instead, see Modify a cluster.

Before you begin

If you haven't already, then connect to a login node in your cluster. For instructions, see Connect to a cluster's login node.

Required roles

To get the permissions that you need to mount and use Cloud Storage buckets, ask your administrator to grant you the Storage Object User (roles/storage.objectUser) IAM role on the Compute Engine default service account. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to mount and use Cloud Storage buckets. The exact permissions are listed in the following Required permissions section:

Required permissions

The following permissions are required to mount and use Cloud Storage buckets:

- To view the details of a file in a Cloud Storage bucket: storage.objects.get
- To view a list of files in a Cloud Storage bucket: storage.objects.list
- To write files in a Cloud Storage bucket: storage.objects.create
- To delete files in a Cloud Storage bucket: storage.objects.delete

You might also be able to get these permissions with custom roles or other predefined roles.

Mount Cloud Storage buckets for the duration of a job

Cluster Director uses Slurm Plug-in Architecture for Node and job (K)control (SPANK) to mount Cloud Storage buckets on your cluster for the duration of a job. This plugin uses Cloud Storage FUSE to manage the lifecycle of the mounted buckets, automatically mounting the buckets when the job starts and unmounting them when the job completes.

When you specify a directory on which to mount buckets, consider the following:

- If the directory doesn't exist, then Slurm automatically creates it.
- If the directory exists, then it must be empty, you must be the owner of the directory, and the directory must not be the /home directory of your cluster.
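
The directory rules above can be checked before you submit a job. The following is a minimal sketch, not part of Slurm or Cluster Director: `check_mount_dir` is a hypothetical helper name, and the check only reflects the state of the node where you run it, so it is most useful when the path is on a file system that the compute nodes share.

```shell
#!/bin/bash
# Hypothetical pre-check helper; not part of Slurm or Cluster Director.
# Returns 0 if the given path looks usable as a mount directory under the
# rules above, and 1 (with a message on stderr) otherwise.
check_mount_dir() {
  local dir="$1"

  # Treat "/home/" the same as "/home".
  if [ "$dir" != "/" ]; then
    dir="${dir%/}"
  fi

  if [ "$dir" = "/home" ]; then
    echo "error: $dir is the cluster's /home directory" >&2
    return 1
  fi

  # A missing directory is fine: it is created when the bucket is mounted.
  if [ ! -e "$dir" ]; then
    return 0
  fi

  if [ ! -d "$dir" ]; then
    echo "error: $dir exists but is not a directory" >&2
    return 1
  fi

  # -O is true when the current (effective) user owns the path.
  if [ ! -O "$dir" ]; then
    echo "error: $dir is not owned by $(id -un)" >&2
    return 1
  fi

  # ls -A prints nothing for an empty directory.
  if [ -n "$(ls -A "$dir")" ]; then
    echo "error: $dir is not empty" >&2
    return 1
  fi

  return 0
}
```

For example, `check_mount_dir /gcs/data1 && sbatch --gcsfuse-mount="my-bucket:/gcs/data1" job.sh` submits the job only if the check passes.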

To mount one or more buckets for the duration of a job, include the --gcsfuse-mount flag in the salloc, sbatch, or srun commands. Based on the requirements of your workload, use the following syntax:

| Use case | Mounting syntax | Example command |
| --- | --- | --- |
| Mount a single bucket | `--gcsfuse-mount="BUCKET_NAME:LOCAL_PATH"` | `sbatch --gcsfuse-mount="my-bucket:/data" job.sh` |
| Mount a read-only bucket | `--gcsfuse-mount="BUCKET_NAME:LOCAL_PATH:--read-only"` | `srun --gcsfuse-mount="my-bucket:/data:--read-only" hostname` |
| Mount multiple buckets | `--gcsfuse-mount="BUCKET_NAME_1:LOCAL_PATH_1;BUCKET_NAME_2:LOCAL_PATH_2"` | `salloc --gcsfuse-mount="bucket-1:/data1;bucket-2:/data2"` |
| Mount all buckets in your project | `--gcsfuse-mount=":LOCAL_PATH"` | `sbatch --gcsfuse-mount=":/all-data" job.sh` |
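
When the list of buckets is long or comes from variables, the semicolon-separated value can be assembled in the shell instead of typed by hand. The following is a small sketch under that assumption; `join_mounts` is a hypothetical helper name, and the bucket names and paths are the example values from the table.

```shell
#!/bin/bash
# Hypothetical helper: joins BUCKET_NAME:LOCAL_PATH entries with semicolons,
# which is the separator that the --gcsfuse-mount flag uses for multiple buckets.
join_mounts() {
  local IFS=';'          # "$*" joins arguments with the first character of IFS
  printf '%s\n' "$*"
}

mounts=$(join_mounts "bucket-1:/data1" "bucket-2:/data2")

# Print the resulting flag; pass it to salloc, sbatch, or srun on a login node.
echo "--gcsfuse-mount=\"${mounts}\""
```

Keep the value quoted when you pass it to a Slurm command, because an unquoted semicolon ends the command in the shell.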

To customize how Slurm mounts your buckets, you can also specify additional Cloud Storage FUSE options or mount buckets from within a Slurm script.

Specify additional Cloud Storage FUSE CLI options

You can pass optional Cloud Storage FUSE CLI options as a third colon-separated field: add them as a space-separated string after the second colon (:) in the flag value, following the bucket name and the local path.

For example, to submit a batch job that mounts my-bucket with the --implicit-dirs flag and uses the --only-dir flag to restrict the mount to a directory named logs, run the following command:

sbatch --gcsfuse-mount="my-bucket:/mnt/gcs:--implicit-dirs --only-dir logs" job.sh

For a complete list of supported options, see Cloud Storage FUSE CLI options.
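
The BUCKET_NAME:LOCAL_PATH:OPTIONS layout can also be built programmatically. The following sketch assumes only the field layout shown in this section; `mount_spec` is a hypothetical helper name, and it doesn't validate that the options are supported by Cloud Storage FUSE.

```shell
#!/bin/bash
# Hypothetical helper: builds one mount entry for the --gcsfuse-mount flag.
# Usage: mount_spec BUCKET_NAME LOCAL_PATH [OPTION...]
mount_spec() {
  if [ "$#" -lt 2 ]; then
    echo "usage: mount_spec BUCKET_NAME LOCAL_PATH [OPTION...]" >&2
    return 1
  fi

  local bucket="$1" path="$2"
  shift 2

  if [ "$#" -eq 0 ]; then
    # No options: only the bucket and path fields.
    printf '%s:%s\n' "$bucket" "$path"
  else
    # Options become the third field, separated from each other by spaces.
    printf '%s:%s:%s\n' "$bucket" "$path" "$*"
  fi
}

spec=$(mount_spec my-bucket /mnt/gcs --implicit-dirs --only-dir logs)
echo "sbatch --gcsfuse-mount=\"${spec}\" job.sh"
```

The sketch prints the command rather than running it, so you can review the value before you submit the job.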

Mount buckets from within a Slurm script

To mount one or more Cloud Storage buckets within a single job, specify your bucket configurations as a semicolon-separated list inside the #SBATCH --gcsfuse-mount directive.

For example, to mount two buckets on compute nodes by using a Slurm script, complete the following steps:

1. Create a script as follows:
```
#!/bin/bash
#SBATCH --gcsfuse-mount="BUCKET_NAME_1:LOCAL_PATH_1;BUCKET_NAME_2:LOCAL_PATH_2"

# Run your application by using the temporarily mounted directories
python3 PYTHON_SCRIPT --data_dir LOCAL_PATH_1/INPUT_DIRECTORY --output_dir LOCAL_PATH_2/OUTPUT_DIRECTORY
```
Replace the following:
- BUCKET_NAME_1 and BUCKET_NAME_2: the names of the two buckets.
- LOCAL_PATH_1 and LOCAL_PATH_2: the directories on which you want to mount your buckets, for example, /gcs/data1 and /gcs/data2.
- PYTHON_SCRIPT: the name of the Python script that you want to run.
- INPUT_DIRECTORY: the subdirectory in your first bucket that contains your input files, for example, input.
- OUTPUT_DIRECTORY: the subdirectory in your second bucket where you want to write files, for example, output.

2. Submit your job to the Slurm queue:
```
sbatch JOB_SCRIPT
```
Replace JOB_SCRIPT with the path to the job script that you created in the previous step.
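
To see the placeholders filled in, the following sketch generates a job script from shell variables and checks its syntax before submission. Every value in it is a hypothetical example, including the bucket names, the `train.py` script name, and the `job.sh` default file name. The `sbatch` call is commented out so that the sketch runs on a machine without Slurm.

```shell
#!/bin/bash
# All values below are hypothetical examples; replace them with your own.
JOB_SCRIPT="${JOB_SCRIPT:-job.sh}"
BUCKET_1="my-input-bucket"
BUCKET_2="my-output-bucket"
LOCAL_PATH_1="/gcs/data1"
LOCAL_PATH_2="/gcs/data2"

# Write the job script, expanding the variables above into the directive.
cat > "$JOB_SCRIPT" <<EOF
#!/bin/bash
#SBATCH --gcsfuse-mount="${BUCKET_1}:${LOCAL_PATH_1};${BUCKET_2}:${LOCAL_PATH_2}"

# Run the application against the temporarily mounted directories.
python3 train.py --data_dir ${LOCAL_PATH_1}/input --output_dir ${LOCAL_PATH_2}/output
EOF

# Check the generated script for shell syntax errors before submitting it.
bash -n "$JOB_SCRIPT" && echo "Generated $JOB_SCRIPT"

# Submit it from a login node (commented out so the sketch runs anywhere):
# sbatch "$JOB_SCRIPT"
```

Generating the script this way keeps the bucket names and mount paths in one place, so the directive and the application arguments can't drift apart.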

What's next

Monitor your cluster:
- Monitor cluster performance with prebuilt dashboards
- Monitor clusters with custom dashboards or alerts
Manage cluster health
Troubleshoot cluster errors