Performance considerations

This page provides guidance on configuring your Google Cloud Managed Lustre environment to obtain the best performance.

Performance specifications

The following performance numbers are approximate maximum values.

IOPS

Maximum IOPS scale linearly with provisioned instance capacity. The following table lists the per-TiB values for each throughput tier.

Throughput tier        Read IOPS (per TiB)    Write IOPS (per TiB)
125 MBps per TiB       725                    700
250 MBps per TiB       1,450                  1,400
500 MBps per TiB       2,900                  2,800
1,000 MBps per TiB     5,800                  5,600
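
For example, combining the per-TiB values above with an instance's provisioned capacity gives its approximate aggregate limits. The following sketch assumes a hypothetical 100 TiB instance at the 500 MBps per TiB tier; the capacity and tier are illustrative, not recommendations:

CAPACITY_TIB=100   # hypothetical instance capacity
echo "Approximate max read IOPS:  $(( CAPACITY_TIB * 2900 ))"   # 290000
echo "Approximate max write IOPS: $(( CAPACITY_TIB * 2800 ))"   # 280000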

Metadata operations

Maximum metadata operation rates increase in steps, with one step for every 72 GBps of provisioned throughput.

Operation       Maximum rate (per 72 GBps)
File stats      410,000 per second
File creates    115,000 per second
File deletes    95,000 per second
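
As an illustration, assuming the per-step values above multiply with the number of complete 72 GBps steps, an instance provisioned for a hypothetical 144 GBps of aggregate throughput would have roughly the following limits:

PROVISIONED_GBPS=144                 # hypothetical aggregate throughput
STEPS=$(( PROVISIONED_GBPS / 72 ))   # number of complete 72 GBps steps
echo "Approximate max file stats per second:   $(( STEPS * 410000 ))"   # 820000
echo "Approximate max file creates per second: $(( STEPS * 115000 ))"   # 230000
echo "Approximate max file deletes per second: $(( STEPS * 95000 ))"    # 190000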

Performance after increasing capacity

Increasing the storage capacity of an existing instance increases its maximum throughput and IOPS, and possibly its metadata performance.

Read throughput performance gradually improves as new data is written and redistributed across the additional storage. Write throughput performance increases immediately.

VPC network maximum transmission unit (MTU)

When creating your VPC network, set the mtu value (the maximum transmission unit, which is the size of the largest IP packet that can be transmitted on the network) to the maximum allowed value of 8896 bytes. Compared to the default value of 1460 bytes, this can improve performance by up to 10%.
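
For example, you can set the MTU when creating the network. The network name and subnet mode below are placeholders; adjust them for your environment:

gcloud compute networks create NETWORK_NAME --subnet-mode=custom --mtu=8896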

You can see the current MTU value of your network with the following command:

gcloud compute networks describe NETWORK_NAME --format="value(mtu)"

The MTU value of a network can be updated after the network has been created, but there are important considerations. See Change the MTU of a network for details.
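
As a sketch, a command along the following lines updates the MTU of an existing network; review the considerations linked above before running it:

gcloud compute networks update NETWORK_NAME --mtu=8896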

Compute Engine machine types

Network throughput can be affected by your choice of machine type. In general, to obtain the best throughput:

  • Increase the number of vCPUs. Per-instance maximum egress bandwidth is generally 2 Gbps per vCPU, up to the machine type maximum.
  • Select a machine series that supports higher ingress and egress limits. For example, C2 instances with Tier_1 networking support up to 100 Gbps egress bandwidth. C3 instances with Tier_1 networking support up to 200 Gbps.
  • Enable per VM Tier_1 networking performance with larger machine types.
  • Use Google Virtual NIC (gVNIC). gVNIC is the only supported option for third generation and later machine series, and it is required when using Tier_1 networking; an example command follows this list.
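
The following sketch creates a client VM with gVNIC and Tier_1 networking enabled. The instance name, zone, machine type, image, and network names are illustrative placeholders, not recommendations:

gcloud compute instances create lustre-client \
    --zone=us-central1-a \
    --machine-type=c3-standard-176 \
    --image-family=ubuntu-2204-lts --image-project=ubuntu-os-cloud \
    --network-interface=nic-type=GVNIC,network=NETWORK_NAME,subnet=SUBNET_NAME \
    --network-performance-configs=total-egress-bandwidth-tier=TIER_1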

For detailed information, refer to Network bandwidth.

Measuring single-client performance

To test read and write performance from a single Compute Engine client, use the fio (Flexible I/O tester) command line tool.

  1. Install fio:

    Rocky 8

    sudo dnf install fio -y
    

    Ubuntu 20.04 and 22.04

    sudo apt update
    sudo apt install -y fio
    
  2. Run the following command:

    fio --ioengine=libaio --filesize=32G --ramp_time=2s \
    --runtime=5m --numjobs=16 --direct=1 --verify=0 --randrepeat=0 \
    --group_reporting --directory=/lustre --buffer_compress_percentage=50 \
    --name=read --blocksize=1m --iodepth=64 --readwrite=read
    

The test takes approximately 5 minutes to complete. When finished, the results are displayed. Depending on your configuration, you can expect throughput up to your VM's maximum network speed, and thousands of IOPS per TiB.
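
The command above measures sequential read throughput. To measure write throughput, you can run a variant of the same command that changes only the job name and the --readwrite parameter; the job name below is illustrative:

fio --ioengine=libaio --filesize=32G --ramp_time=2s \
--runtime=5m --numjobs=16 --direct=1 --verify=0 --randrepeat=0 \
--group_reporting --directory=/lustre --buffer_compress_percentage=50 \
--name=write --blocksize=1m --iodepth=64 --readwrite=write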