Volume performance sizing

This page describes volume performance sizing.

Importance of performance sizing

To correctly size your workloads for performance, you need to understand:

  • How much performance a single volume can deliver.

  • How to adjust a volume's performance.

  • How performance depends mainly on the service level of the underlying storage pool.

Flex Unified and Flex File custom performance

Consider the following for Flex Unified and Flex File custom performance:

  • Performance is shared: the underlying storage pool provides the performance. All volumes in a Flex Unified or Flex File custom pool share the pool's total performance, so smaller volumes can use performance left unused by larger ones. This applies to pools in both default and ONTAP modes.

  • Configurable performance: you can set the pool's performance independently of its capacity.

    • Default: 64 MiBps throughput and 1,024 IOPS per pool.

    • Throughput scalability: increase throughput up to 5 GiBps in 1 MiBps increments. Each extra MiBps adds 16 IOPS.

    • IOPS: provision up to 160,000 IOPS per pool.

  • Large capacity pools: throughput can reach up to 24 GiBps.

  • Limits: the effective performance of the pool is capped by whichever limit is reached first, throughput or IOPS. Which limit applies first depends on your application's block size.
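The interplay of the two limits can be sketched in Python. The helper below is purely illustrative (it is not part of any NetApp Volumes API); the default values of 64 MiBps and 1,024 IOPS come from this page:

```python
def effective_throughput_mibps(block_kib, throughput_limit_mibps, iops_limit):
    """Return the effective pool throughput cap for a given block size.

    Small blocks hit the IOPS limit first; large blocks hit the
    throughput limit first.
    """
    # Throughput achievable at the IOPS limit for this block size.
    iops_bound_mibps = iops_limit * block_kib / 1024
    return min(throughput_limit_mibps, iops_bound_mibps)

# Default pool settings: 64 MiBps throughput, 1,024 IOPS.
print(effective_throughput_mibps(4, 64, 1024))    # 4 KiB blocks: IOPS-bound, 4.0 MiBps
print(effective_throughput_mibps(256, 64, 1024))  # 256 KiB blocks: throughput-bound, 64 MiBps
```

With 4 KiB blocks the pool saturates its 1,024 IOPS at only 4 MiBps; with 256 KiB blocks the 64 MiBps throughput limit is reached first.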

Flex File default performance

Consider the following for Flex File default performance:

  • Performance is shared: the underlying storage pool provides the performance. All volumes in the pool share its performance.

  • Similar to Flex Unified and Flex File custom performance, the block size determines which limit applies first: throughput or IOPS.

    • Throughput: 16 KiBps per GiB of pool capacity up to a maximum of 1.6 GiBps.

    • IOPS: 1,024 IOPS per TiB of pool capacity up to a maximum of 60,000 IOPS.
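The two scaling rules above can be expressed as a short calculation. The function below is an illustrative sketch, not a NetApp Volumes API; the constants (16 KiBps per GiB, 1.6 GiBps cap, 1,024 IOPS per TiB, 60,000 IOPS cap) are the values stated on this page:

```python
def flex_default_limits(pool_capacity_gib):
    """Return (throughput in MiBps, IOPS) for a Flex File default pool."""
    # Throughput: 16 KiBps per GiB, capped at 1.6 GiBps (1.6 * 1024 * 1024 KiBps).
    throughput_kibps = min(pool_capacity_gib * 16, 1.6 * 1024 * 1024)
    # IOPS: 1,024 per TiB of capacity, capped at 60,000.
    iops = min(pool_capacity_gib / 1024 * 1024, 60_000)
    return throughput_kibps / 1024, iops

# Hypothetical 10 TiB pool:
print(flex_default_limits(10 * 1024))  # (160.0 MiBps, 10240.0 IOPS)
```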

Standard, Premium, and Extreme performance

For Standard, Premium, and Extreme service level volumes, the maximum throughput a volume can sustain is determined by its capacity and the service level of the storage pool that hosts it. You can increase or decrease your volume's maximum throughput by changing its capacity, or for Premium and Extreme service levels, by re-assigning it to a storage pool with a different service level.

The following throughput and IOPS limits assume large sequential reads. Small I/Os or writes reach lower limits. For more information, see Performance benchmarks.

  • Performance scales with volume size and service level.

    • Standard: 16 MiBps per TiB of volume capacity, up to a maximum of 1.6 GiBps.

    • Premium: 64 MiBps per TiB of volume capacity, up to a maximum of 5 GiBps per volume, or 30 GiBps for large capacity volumes.

    • Extreme: 128 MiBps per TiB of volume capacity, up to a maximum of 5 GiBps per volume, or 30 GiBps for large capacity volumes.

  • Linear scaling: the throughput increases with volume size until it reaches the service level maximum.

  • Adjusting performance: to improve performance, you can increase the volume capacity or move to a higher service level, such as Premium or Extreme. For more control, use Manual QoS to allocate pool performance to specific volumes.
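The per-service-level scaling rules above can be sketched as follows. This is an illustrative calculation only (the 30 GiBps large capacity exception is omitted for simplicity); the per-TiB rates and caps come from this page:

```python
# (MiBps per TiB, per-volume maximum in MiBps) for regular volumes.
SERVICE_LEVELS = {
    "standard": (16, 1.6 * 1024),  # max 1.6 GiBps
    "premium": (64, 5 * 1024),     # max 5 GiBps
    "extreme": (128, 5 * 1024),    # max 5 GiBps
}

def volume_max_throughput_mibps(level, capacity_tib):
    """Throughput scales linearly with capacity until the service level maximum."""
    per_tib, ceiling = SERVICE_LEVELS[level]
    return min(capacity_tib * per_tib, ceiling)

print(volume_max_throughput_mibps("premium", 2))    # 128 MiBps
print(volume_max_throughput_mibps("extreme", 100))  # capped at 5,120 MiBps
```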

Workload considerations

The preceding sections describe the maximum performance a volume can deliver. Actual application performance depends on how the application performs I/O operations against the volume.

The key factors that determine application performance include:

  • Workload mix: reads, writes, metadata operations; sequential versus random access.

  • Block size: small blocks result in higher IOPS, and large blocks result in higher throughput. Use larger block sizes (64 KiB or more) for better efficiency.

  • Latency: lower network latency improves performance.

  • I/O concurrency: more parallel I/O operations increase performance.

  • Access protocol: the choice of protocol (NFSv3, NFSv4, SMB, or iSCSI) can affect performance.

  • Client VM cache: increasing the VM buffer cache can reduce read operations.

The following are the key formulas:

  • IOPS = concurrency / latency

  • Throughput = IOPS * block size
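These two formulas translate directly into code. The following sketch is illustrative only; the function names are not part of any API:

```python
def iops(concurrency, latency_s):
    """IOPS = concurrency / latency (latency in seconds)."""
    return concurrency / latency_s

def throughput_kibps(iops_value, block_kib):
    """Throughput = IOPS * block size."""
    return iops_value * block_kib

# One outstanding I/O at 1 ms latency:
print(iops(1, 0.001))              # 1,000 IOPS
print(throughput_kibps(1000, 64))  # 64,000 KiBps (62.5 MiBps)
```

Doubling the concurrency or halving the latency doubles the IOPS, and throughput follows proportionally for a fixed block size.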

The following examples show how throughput and IOPS are calculated:

Volume throughput example

For a volume with the Premium service level and a capacity of 1,500 GiB, the maximum large sequential read throughput achievable with a concurrency of 8 is calculated using the following formula. For Premium volumes, throughput scales linearly with the volume capacity until it reaches its limit.

(1,500 GiB x 64 KiBps/GiB) / 1,024 KiB/MiB = 93.75 MiBps
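The same arithmetic, written out as a quick check (64 MiBps per TiB is equivalent to 64 KiBps per GiB):

```python
# Premium service level: 64 KiBps per GiB of volume capacity.
capacity_gib = 1500
throughput_mibps = capacity_gib * 64 / 1024  # KiBps converted to MiBps
print(throughput_mibps)  # 93.75
```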

Throughput and IOPS example

Consider a scenario where a user copies a large file using a single-threaded copy (concurrency = 1) in Windows File Explorer. The file is being moved from a local SSD to a 4 TiB Extreme volume, which has a 512 MiBps throughput limit. Assuming File Explorer uses a 128 KiB block size and the volume has a latency of 0.5 ms, the throughput and IOPS can be calculated using the following formulas:

IOPS = 1/0.0005s = 2,000 IOPS

Throughput = 2,000 IOPS * 128 KiB = 256,000 KiBps = 250 MiBps

In this example, File Explorer can't drive enough throughput to reach the volume limit (512 MiBps). Additionally, if latency rises to one millisecond, throughput drops by 50% (to 125 MiBps), because latency directly limits single-threaded applications. To drive this volume to its maximum performance potential, use multi-threaded applications that provide higher concurrency.
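The single-stream calculation above can be sketched as a small function. This is an illustration of the formulas on this page, not a NetApp Volumes API:

```python
def single_stream_throughput_mibps(latency_s, block_kib, volume_limit_mibps):
    """Throughput of one outstanding I/O stream, capped by the volume limit."""
    iops = 1 / latency_s                 # concurrency = 1
    throughput = iops * block_kib / 1024  # KiBps -> MiBps
    return min(throughput, volume_limit_mibps)

print(single_stream_throughput_mibps(0.0005, 128, 512))  # 250.0 MiBps at 0.5 ms
print(single_stream_throughput_mibps(0.001, 128, 512))   # 125.0 MiBps at 1 ms
```

Halving the latency doubles the single-stream throughput until the volume limit caps it.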

Metadata operations

Metadata operations are small, protocol-specific operations. Metadata operation performance is primarily limited by latency. Examples of metadata operations include the following:

  • List contents of a folder

  • Delete a file

  • Set permissions

Latency

Latency is the total amount of time it takes for an I/O operation to complete. This includes the wait time in the queue and the service time, during which the I/O is acted upon. To improve your latency, we recommend that you test the connection to NetApp Volumes from all the zones in your region and select the zone with the lowest latency.

Considerations

  • When a client's network bandwidth is smaller than required, the client latency reported by perfmon in Windows or nfsiostat in Linux is higher than the latency reported by NetApp Volumes because the I/O operation spends time queueing on the client.

  • Storage latency becomes high when a volume's throughput ceiling is lower than required for a given workload. This also causes the client latency to be higher because of the additional client-side queuing.

  • When the volume's throughput ceiling is reached, you can improve the client and storage latencies by increasing the throughput limit.

What's next

Read about storage pools.