A cluster blueprint is a YAML file that defines a reusable configuration and
describes the specific cluster that you want to deploy using Cluster Toolkit.
To configure your cluster, you can either start with one of the cluster blueprint examples which you can modify, or create your own blueprint. To create your own blueprint, review the Design a cluster blueprint section for an overview of the configurations that you need to specify in your blueprint.
Before you deploy a cluster, review the quota requirements.
Design a cluster blueprint
A cluster blueprint consists of the following three main components, plus two optional ones:
- Blueprint name. The name of the blueprint. When naming your cluster blueprint, use the following conventions:
  - If you are updating or modifying an existing configuration, don't change the blueprint name.
  - If you are creating a new configuration, specify a new unique blueprint name.

  The blueprint name is added as a label to your cloud resources and is used for tracking usage and monitoring costs.

  The blueprint name is set using the blueprint_name field.
- Deployment variables. A set of parameters that are used by all modules in the blueprint. Use these variables to set values that are specific to a deployment.

  Deployment variables are set using the vars field in the blueprint, but you can override or set deployment variables at deployment time by specifying the --vars flag with the gcluster command.

  The most common deployment variables are as follows:
  - deployment_name: the name of the deployment. The deployment_name is a required variable for a deployment.

    This variable must be set to a unique value any time you deploy a new copy of an existing blueprint. The deployment name is added as a label to cloud resources and is used for tracking usage and monitoring costs.

    Because a single cluster blueprint can be used for multiple deployments, you can use the blueprint_name to identify the type of environment, for example slurm-high-performance-cluster, and the deployment_name to identify the targeted use of that cluster, for example research-dept-prod.
  - project_id: the ID for the project where you want to deploy the cluster. The project_id is a required variable for a deployment.
  - zone: the zone where you want to deploy the cluster.
  - region: the region where you want to deploy the cluster.

  Other variables that you might want to specify here include a custom image family, a Shared VPC network, or a subnetwork that you want all modules to use.
- Deployment groups. Each deployment group defines a distinct set of modules that are deployed together. A deployment group can only contain modules of a single type. For example, a deployment group can't mix Packer and Terraform modules.

  Deployment groups are set using the deployment_groups field. Each deployment group requires the following parameters:
  - group: the name of the group.
  - modules: the descriptors for each module. These include the following:
    - id: a unique identifier for the module.
    - source: the directory path or URL where the module is located. For more information, see Module fields.
    - kind: the type of module. Valid values are packer or terraform. This is an optional parameter that defaults to terraform if omitted.
    - use: a list of module IDs whose outputs can be linked to the module's settings. This is an optional parameter.
    - outputs: if you are using Terraform modules, use this parameter to specify a list of Terraform output values that you want to make available at the deployment group level. This is an optional parameter.

      During deployment, these output values are printed to the screen after you run the terraform apply command.

      After deployment, you can access these outputs by running the terraform output command.
    - settings: any module variable that you want to add. This is an optional parameter.

  For a list of supported modules, see Supported modules.
 
- Terraform Remote State configuration (optional). Most blueprints use Terraform modules to provision cloud infrastructure. We recommend using Terraform remote state backed by a Cloud Storage bucket that is configured with object versioning. All configuration settings of the Cloud Storage backend are supported. The prefix setting determines the path within a bucket where state is stored. If prefix is left unset, Cluster Toolkit automatically generates a unique value based on the blueprint_name, deployment_name, and deployment group name.

  The following configuration enables remote state for all deployment groups in a blueprint:

      terraform_backend_defaults:
        type: gcs
        configuration:
          bucket: BUCKET_NAME

  For more information about advanced Terraform remote state configuration, see the Cluster Toolkit GitHub repository.
- Blueprint Versioning (optional). Defines the GitHub repository URL and the specific version to use for any modules that use an embedded module path. Embedded module paths include the following:
  - source: modules/
  - source: community/modules/

  Providing the GitHub repository URL and the module version ensures that the module is pulled from a specific GitHub repository and uses a specific version. This provides more control and flexibility over the modules used in your deployments. To version a blueprint, you must set both of the following parameters:
  - toolkit_modules_url: specifies the base URL of the GitHub repository containing the modules.
  - toolkit_modules_version: specifies the version of the modules to use.

  When these parameters are specified, the blueprint processing logic modifies the source field of any module that references an embedded module path. In the deployment folder, any reference to an embedded module path is replaced with a GitHub URL that includes the specified repository, version, and module path.

  For example, if you set the following:

      toolkit_modules_url: github.com/GoogleCloudPlatform/cluster-toolkit
      toolkit_modules_version: v1.38.0

  A module that is specified as source: modules/compute/vm-instance is updated to the following in the deployment folder:

      source: github.com/GoogleCloudPlatform/cluster-toolkit//modules/compute/vm-instance?ref=v1.38.0&depth=1
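
To show how the components described above fit together, the following is a minimal sketch of a blueprint rather than one of the provided examples. The blueprint name, deployment name, region, zone, and group name are illustrative, PROJECT_ID and BUCKET_NAME are placeholders, and the modules/network/vpc source path and the instance_count and machine_type settings are assumptions that you should check against the module documentation.

```yaml
# Minimal blueprint sketch. Capitalized values are placeholders, and the
# names below are illustrative, not taken from a provided example.
blueprint_name: example-vm-cluster

# Optional: pin embedded module paths to a specific repository and version.
toolkit_modules_url: github.com/GoogleCloudPlatform/cluster-toolkit
toolkit_modules_version: v1.38.0

# Deployment variables that are available to all modules.
vars:
  project_id: PROJECT_ID          # required
  deployment_name: example-vm-dev # required, unique for each deployment
  region: us-central1
  zone: us-central1-a

# Optional: store Terraform state for all deployment groups in Cloud Storage.
terraform_backend_defaults:
  type: gcs
  configuration:
    bucket: BUCKET_NAME

deployment_groups:
- group: primary
  modules:
  - id: network
    source: modules/network/vpc   # assumed embedded module path
  - id: workstation
    source: modules/compute/vm-instance
    kind: terraform               # optional, terraform is the default
    use: [network]                # link the outputs of the network module
    settings:                     # assumed setting names for this module
      instance_count: 1
      machine_type: n2-standard-2
```

Because deployment_name is a deployment variable, the same file can be reused for a second deployment by overriding that value at deployment time with the --vars flag instead of editing the blueprint.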
Cluster blueprint examples
To get started, you can use one of the following cluster blueprint examples.
- Example 1: Deploys a basic HPC cluster with Slurm
- Example 2: Deploys an HPC cluster with Slurm and a tiered file system
For a full list of example cluster blueprints, see the Cluster Toolkit GitHub repository.
Example 1
Deploys a basic autoscaling cluster with Slurm that uses default settings. The blueprint also creates a new VPC network and a Filestore instance mounted to /home.
Example 2
Deploys a cluster with Slurm that has a tiered file system for higher performance. It connects to the project's default Virtual Private Cloud (VPC) network and creates seven partitions and a login node.
Request additional quotas
You might need to request additional quota to be able to deploy and use your cluster.
For example, by default the schedmd-slurm-gcp-v5-node-group module uses
c2-standard-60 VMs for compute nodes, each of which needs 60 vCPUs. The default
quota for C2 CPUs might be as low as 8, which might prevent even a single node
from being started.
The required quotas are based on your custom configuration. Minimum quotas are documented on GitHub for the provided example blueprints.
To view and adjust quotas, see View and manage quotas.
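
If you can't get more quota right away, another option is to size the node group to fit the quota that you have. The following fragment is a sketch of a single entry in a deployment group's modules list. The module ID is hypothetical, and the source path and the machine_type and node_count_dynamic_max setting names are assumptions to verify against the module documentation.

```yaml
# Fragment of a deployment group's modules list, not a full blueprint.
- id: compute_node_group          # hypothetical module ID
  source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
  settings:
    # Assumed setting names. Check the module documentation before use.
    machine_type: c2-standard-4   # 4 vCPUs per node instead of 60
    node_count_dynamic_max: 2     # cap autoscaling at 2 nodes
```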
What's next
- Set up Cluster Toolkit.
- Review Cluster deployment overview.