Create a Batch job template

The batch-job-template module lets you create and configure job templates for Batch in your Cluster Toolkit deployments.

By using this module, you can automate the configuration of Compute Engine resources for your Batch workloads. This module generates a local job template file and a Compute Engine instance template. The instance template defines the specific compute settings for your Batch job, such as the network, machine type, image, and startup script. The job template automatically references this instance template. By separating the job definition from the compute environment, you can deploy complex workloads and customize the generated template before you submit it to the Batch API.

When you use this module with the batch-login-node module, the cluster places the generated job template directly on the login node.
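
As a rough sketch of that pairing, the following fragment adds a login node that consumes the job template. It assumes that a batch-job-template module with the ID batch-job is defined earlier in the same blueprint. The ID batch-login is a placeholder. Check the batch-login-node module page for the settings that your Cluster Toolkit version supports.

```yaml
# Sketch only: assumes a module with the ID "batch-job" appears earlier in the blueprint.
- id: batch-login   # placeholder ID
  source: modules/scheduler/batch-login-node
  use: [batch-job]  # gives the login node access to the generated job template
  outputs: [instructions]
```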

For the complete list of inputs and outputs for this module, see the batch-job-template module page in the Cluster Toolkit GitHub repository.

Before you begin

Verify that you meet the following requirements:

  • You have installed and configured Cluster Toolkit. For installation instructions, see Set up Cluster Toolkit.
  • You have an existing cluster blueprint. You can use and modify an existing blueprint or create one from scratch. For a working example of a blueprint configured for the batch-job-template module, see the examples/batch.yaml file. For more information about creating and customizing blueprints, see Cluster blueprint.
  • To view a complete list of blueprints that support the batch-job-template module, go to the Cluster blueprint catalog page, click the Select scheduler menu, and then select Batch.

Required roles

To get the permissions that you need to create the Batch job template and instance template, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a Batch job template

The following example demonstrates how to configure a basic Batch job. This configuration runs an echo command on the n2-standard-4 machine type. The instructions output reports the location of the generated job template file and explains how to submit the job.

- id: batch-job
  source: modules/scheduler/batch-job-template
  use: [network1]
  settings:
    runnable: "echo 'hello world'"
    machine_type: n2-standard-4
  outputs: [instructions]

For more complex implementations, such as integrating the batch-job-template module with the filestore module and the startup-script module, see the examples/batch.yaml example on GitHub.

Configure network settings

This module supports newly created Virtual Private Cloud (VPC) networks, pre-existing networks, and Shared VPC networks. To define the network for your Batch jobs, include a network module in your blueprint: the vpc module for a new VPC network, or the pre-existing-vpc module for an existing one. Then, instruct the batch-job-template module to use that network by listing the network module's ID under the use keyword. The following example uses a Shared VPC network through the pre-existing-vpc module:

  - id: shared_network
    source: modules/network/pre-existing-vpc
    settings:
      project_id: host-project-id
      network_name: my-shared-vpc
      subnetwork_name: my-shared-subnet

  - id: batch-job
    source: modules/scheduler/batch-job-template
    use: [shared_network]
    settings:
      runnable: "echo 'hello from a shared vpc'"
      machine_type: n2-standard-4
    outputs: [instructions]

Configure instance templates

Batch jobs rely on instance templates to define settings like the network, machine type, image, and startup script. By default, the batch-job-template module creates an internal instance template and supplies it to the Batch job.

Alternatively, you can supply your own instance template to the module by using the instance_template setting. You can generate this custom instance template outside of Cluster Toolkit, or you can define it within your blueprint by using a separate module. Supplying a custom instance template is useful when you need to configure a property that the batch-job-template module does not natively support.
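
For a template generated outside Cluster Toolkit, a minimal sketch might pass the template's self-link directly to the instance_template setting. The project ID and template name shown are placeholders, and the self-link format is an assumption about what the setting accepts. Confirm it against the module page before relying on it.

```yaml
# Sketch only: PROJECT_ID and TEMPLATE_NAME are placeholders, not real resources.
- id: batch-job
  source: modules/scheduler/batch-job-template
  settings:
    runnable: "echo 'hello world'"
    # Assumed self-link format for an instance template created outside the blueprint
    instance_template: projects/PROJECT_ID/global/instanceTemplates/TEMPLATE_NAME
  outputs: [instructions]
```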

Generate a custom instance template

The following example demonstrates how to generate a custom instance template by using a Cloud Foundation Toolkit module. The blueprint then supplies this custom template to the Batch job through the instance_template setting.

deployment_groups:
- group: primary
  modules:
  - id: network1
    source: modules/network/pre-existing-vpc

  - id: appfs
    source: modules/file-system/filestore
    use: [network1]

  - id: batch-startup-script
    source: modules/scripts/startup-script
    settings:
      runners: ...

  - id: batch-compute-template
    source: github.com/terraform-google-modules/terraform-google-vm//modules/instance_template?ref=v7.8.0
    use: [batch-startup-script]
    settings:
      # Boilerplate to work with Cloud Foundation Toolkit
      network: $(network1.network_self_link)
      service_account: {email: null, scopes: ["https://www.googleapis.com/auth/cloud-platform"]}
      access_config: [{nat_ip: null, network_tier: null}]
      # Batch customization
      machine_type: n2-standard-4
      metadata:
        network_storage: ((jsonencode([module.appfs.network_storage])))
      source_image_family: hpc-rocky-linux-8
      source_image_project: cloud-hpc-image-public

  - id: batch-job
    source: modules/scheduler/batch-job-template
    settings:
      instance_template: $(batch-compute-template.self_link)
    outputs: [instructions]

What's next

  • For the complete list of inputs and outputs for this module, see the batch-job-template module page in the Cluster Toolkit GitHub repository.
  • Learn how to create a login node to test and submit your jobs by using the batch-login-node module.