Supported machine series for Cluster Director

This document describes the CPU-based machines and GPU-accelerated machines that are available in Cluster Director. You use these machines to deploy and manage the underlying infrastructure to run high performance computing (HPC), artificial intelligence (AI), or machine learning (ML) workloads.

CPU-based machines

Cluster Director supports general-purpose machine series, which are primarily CPU-based and suitable for HPC workloads and cluster management roles, such as login nodes.

N2 machine series

Cluster Director supports the N2 machine series. This series offers a balance of price and performance suitable for a variety of workloads, such as CPU-bound computational tasks or general-purpose HPC workloads.

The N2 machine types are differentiated by the amount of memory configured per vCPU. These memory configurations are as follows:

  • standard: 4 GB of system memory per vCPU

  • highmem: 8 GB of system memory per vCPU

  • highcpu: 1 GB of system memory per vCPU
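
Because N2 memory is a fixed multiple of the vCPU count, you can derive an N2 machine type's memory from its name. The following sketch encodes the ratios above; the function name is illustrative, and the one exception (n2-highmem-128, which has 864 GB rather than 8 GB per vCPU, per the table later in this document) is handled explicitly.

```python
# Memory per vCPU for each N2 family, as documented above.
N2_GB_PER_VCPU = {"standard": 4, "highmem": 8, "highcpu": 1}

def n2_memory_gb(machine_type: str) -> int:
    """Return the system memory (GB) implied by an N2 machine type name."""
    # n2-highmem-128 is the documented exception to the 8 GB/vCPU ratio.
    if machine_type == "n2-highmem-128":
        return 864
    series, family, vcpus = machine_type.split("-")
    if series != "n2" or family not in N2_GB_PER_VCPU:
        raise ValueError(f"not an N2 machine type: {machine_type}")
    return N2_GB_PER_VCPU[family] * int(vcpus)

print(n2_memory_gb("n2-standard-8"))   # 32
print(n2_memory_gb("n2-highcpu-96"))   # 96
```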

Each family offers a range of machine sizes, from 2 to 128 vCPUs (the highcpu family tops out at 96 vCPUs). For login nodes, you can use only N2 standard machine types with 32 or fewer vCPUs. Otherwise, you can use any N2 machine type. The following tables summarize the available N2 machine types in Cluster Director:

N2 standard

| Machine type | vCPUs* | Memory (GB) | Default egress bandwidth (Gbps)† | Tier_1 egress bandwidth (Gbps)# |
|---|---|---|---|---|
| n2-standard-2 | 2 | 8 | Up to 10 | N/A |
| n2-standard-4 | 4 | 16 | Up to 10 | N/A |
| n2-standard-8 | 8 | 32 | Up to 16 | N/A |
| n2-standard-16 | 16 | 64 | Up to 32 | N/A |
| n2-standard-32 | 32 | 128 | Up to 32 | Up to 50 |
| n2-standard-48 | 48 | 192 | Up to 32 | Up to 50 |
| n2-standard-64 | 64 | 256 | Up to 32 | Up to 75 |
| n2-standard-80 | 80 | 320 | Up to 32 | Up to 100 |
| n2-standard-96 | 96 | 384 | Up to 32 | Up to 100 |
| n2-standard-128 | 128 | 512 | Up to 32 | Up to 100 |

* A vCPU is implemented as a single hardware thread, or logical core, on one of the available CPU platforms.
† Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information, see Network bandwidth.
# Supports high-bandwidth networking for larger machine types. For Windows OS images, the maximum network bandwidth is limited to 50 Gbps.

N2 highmem

| Machine type | vCPUs* | Memory (GB) | Default egress bandwidth (Gbps)† | Tier_1 egress bandwidth (Gbps)# |
|---|---|---|---|---|
| n2-highmem-2 | 2 | 16 | Up to 10 | N/A |
| n2-highmem-4 | 4 | 32 | Up to 10 | N/A |
| n2-highmem-8 | 8 | 64 | Up to 16 | N/A |
| n2-highmem-16 | 16 | 128 | Up to 32 | N/A |
| n2-highmem-32 | 32 | 256 | Up to 32 | Up to 50 |
| n2-highmem-48 | 48 | 384 | Up to 32 | Up to 50 |
| n2-highmem-64 | 64 | 512 | Up to 32 | Up to 75 |
| n2-highmem-80 | 80 | 640 | Up to 32 | Up to 100 |
| n2-highmem-96 | 96 | 768 | Up to 32 | Up to 100 |
| n2-highmem-128 | 128 | 864 | Up to 32 | Up to 100 |


N2 highcpu

| Machine type | vCPUs* | Memory (GB) | Default egress bandwidth (Gbps)† | Tier_1 egress bandwidth (Gbps)# |
|---|---|---|---|---|
| n2-highcpu-2 | 2 | 2 | Up to 10 | N/A |
| n2-highcpu-4 | 4 | 4 | Up to 10 | N/A |
| n2-highcpu-8 | 8 | 8 | Up to 16 | N/A |
| n2-highcpu-16 | 16 | 16 | Up to 32 | N/A |
| n2-highcpu-32 | 32 | 32 | Up to 32 | Up to 50 |
| n2-highcpu-48 | 48 | 48 | Up to 32 | Up to 50 |
| n2-highcpu-64 | 64 | 64 | Up to 32 | Up to 75 |
| n2-highcpu-80 | 80 | 80 | Up to 32 | Up to 100 |
| n2-highcpu-96 | 96 | 96 | Up to 32 | Up to 100 |

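
The login-node rule stated earlier (N2 standard only, 32 or fewer vCPUs) is easy to check programmatically. The following is a minimal sketch; the function name is illustrative and not part of any Cluster Director API.

```python
def is_valid_login_node_type(machine_type: str) -> bool:
    """Check the documented login-node constraint: N2 standard, <= 32 vCPUs."""
    parts = machine_type.split("-")
    if len(parts) != 3:
        return False
    series, family, vcpus = parts
    return (
        series == "n2"
        and family == "standard"
        and vcpus.isdigit()
        and int(vcpus) <= 32
    )

print(is_valid_login_node_type("n2-standard-16"))  # True
print(is_valid_login_node_type("n2-standard-48"))  # False
print(is_valid_login_node_type("n2-highmem-8"))    # False
```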

GPU-accelerated machines

Cluster Director supports a specific set of accelerator-optimized machine series designed for large-scale AI, ML, and HPC workloads.

GPU machine types are available only in specific regions and zones. To review availability, see GPU locations.

A4X Max and A4X machine series

The A4X Max and A4X machine series run on an exascale platform based on NVIDIA's rack-scale architecture and are optimized for compute- and memory-intensive, network-bound ML training and HPC workloads. A4X Max and A4X differ primarily in their GPU and networking components. A4X Max is available only as bare metal instances, which provide direct access to the host server's CPU and memory, without the Compute Engine hypervisor layer.

A4X Max machine types (bare metal)

A4X Max accelerator-optimized machine types use NVIDIA GB300 Grace Blackwell Ultra Superchips (nvidia-gb300) and are ideal for foundation model training and serving. A4X Max machine types are available as bare metal instances.

A4X Max is an exascale platform based on NVIDIA GB300 NVL72. Each machine has two sockets with NVIDIA Grace CPUs with Arm Neoverse V2 cores. These CPUs are connected to four NVIDIA B300 Blackwell GPUs with fast chip-to-chip (NVLink-C2C) communication.

Attached NVIDIA GB300 Grace Blackwell Ultra Superchips
| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory (GB HBM3e)³ |
|---|---|---|---|---|---|---|---|
| a4x-maxgpu-4g-metal | 144 | 960 | 12,000 | 6 | 3,600 | 4 | 1,116 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A4X machine types

A4X accelerator-optimized machine types use NVIDIA GB200 Grace Blackwell Superchips (nvidia-gb200) and are ideal for foundation model training and serving.

A4X is an exascale platform based on NVIDIA GB200 NVL72. Each machine has two sockets with NVIDIA Grace CPUs with Arm Neoverse V2 cores. These CPUs are connected to four NVIDIA B200 Blackwell GPUs with fast chip-to-chip (NVLink-C2C) communication.

Attached NVIDIA GB200 Grace Blackwell Superchips
| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory (GB HBM3e)³ |
|---|---|---|---|---|---|---|---|
| a4x-highgpu-4g | 140 | 884 | 12,000 | 6 | 2,000 | 4 | 744 |


A4 machine series

A4 accelerator-optimized machine types have NVIDIA B200 Blackwell GPUs (nvidia-b200) attached and are ideal for foundation model training and serving.

Attached NVIDIA B200 Blackwell GPUs
| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory (GB HBM3e)³ |
|---|---|---|---|---|---|---|---|
| a4-highgpu-8g | 224 | 3,968 | 12,000 | 10 | 3,600 | 8 | 1,440 |


A3 machine series

The A3 machine series is powered by NVIDIA H100 or H200 SXM GPUs and is suitable for a wide range of large model training and inference workloads.

Cluster Director supports the A3 machine types that are described in the following sections.

A3 Ultra machine type

A3 Ultra machine types have NVIDIA H200 SXM GPUs (nvidia-h200-141gb) attached and provide the highest network performance in the A3 series. A3 Ultra machine types are ideal for foundation model training and serving.

Attached NVIDIA H200 GPUs
| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory (GB HBM3e)³ |
|---|---|---|---|---|---|---|---|
| a3-ultragpu-8g | 224 | 2,952 | 12,000 | 10 | 3,600 | 8 | 1,128 |


A3 Mega machine type

A3 Mega machine types have NVIDIA H100 SXM GPUs attached and are ideal for large model training and multi-host inference.

Attached NVIDIA H100 GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory (GB HBM3)³ |
|---|---|---|---|---|---|---|---|
| a3-megagpu-8g | 208 | 1,872 | 6,000 | 9 | 1,800 | 8 | 640 |

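
When sizing a model across the GPU machine types above, a useful derived figure is the HBM available per GPU (total GPU memory divided by GPU count). The following sketch transcribes the values from the tables in this document; the dictionary and function names are illustrative only.

```python
# (GPU count, total GPU memory in GB), transcribed from the tables above.
GPU_SPECS = {
    "a4x-maxgpu-4g-metal": (4, 1116),
    "a4x-highgpu-4g": (4, 744),
    "a4-highgpu-8g": (8, 1440),
    "a3-ultragpu-8g": (8, 1128),
    "a3-megagpu-8g": (8, 640),
}

def hbm_per_gpu_gb(machine_type: str) -> float:
    """Return HBM per GPU in GB for a listed GPU machine type."""
    gpu_count, total_gb = GPU_SPECS[machine_type]
    return total_gb / gpu_count

print(hbm_per_gpu_gb("a3-megagpu-8g"))   # 80.0
print(hbm_per_gpu_gb("a3-ultragpu-8g"))  # 141.0
```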

What's next?