Rapid Bucket

This page describes Rapid Bucket, a capability that lets you store objects in the Rapid storage class by setting a zone as a bucket's location. This approach lets you colocate your data storage with your compute resources, which delivers significantly lower latency and higher throughput than other storage classes in Cloud Storage. Workloads in other zones and regions can also access the bucket, with performance that depends on the network distance. Rapid Bucket is well suited to your most data-intensive applications, such as AI/ML and data analytics.

To create a zonal bucket using Rapid Bucket, see Create zonal buckets. To read and append to objects in zonal buckets, see Use objects in zonal buckets.

Rapid Bucket terminology

The Cloud Storage documentation uses the following terms:

  • Rapid Bucket: The product that enables buckets to be created with a zonal location and the Rapid storage class.

  • Rapid storage: The storage class that offers the highest data access and I/O operation performance in Cloud Storage. When you use Rapid Bucket, you create a bucket that uses Rapid storage. For more information about Rapid storage, see Storage classes.

  • Zonal bucket: A bucket that's located in a zone. Objects in zonal buckets are always stored in Rapid storage and are appendable.

Capabilities of zonal buckets

In addition to providing low latency and high throughput, zonal buckets enable you to do the following:

  • Append to objects in the zonal bucket without performing a full object rewrite

  • Open objects and maintain a stream as you perform operations, letting you accelerate subsequent reads and writes

Use cases

Rapid Bucket is most suitable for AI/ML workloads or other data-intensive workloads. Some examples of such workloads are model checkpointing, evaluation, and serving, as well as logging and messaging queues. It can also be used for streaming data or to provide storage for databases.

To take full advantage of the low latency and high throughput provided by Rapid Bucket, make sure to enable gRPC direct connectivity.

Access to objects in zonal buckets

To get the performance benefits of a zonal bucket, make sure to open objects for streaming and maintain a stream as you perform operations on the objects. When you establish and maintain a stream, you can perform subsequent read or write operations to the object with very low latency. For example, when reading a Parquet file, you can perform both the initial read of the file's metadata (the footer) and the subsequent read of specific rows within a single request. This approach is more efficient than using separate requests for each step.

Once established, object streams are kept open by default when you access zonal bucket objects using Cloud Storage FUSE or the Cloud Storage client libraries.

You can open multiple read streams to an object from any number of hosts. There's no limit on the number of read streams that can be open to an object at the same time.

Appending objects

You can append data to objects in zonal buckets. When you make appends to objects, the following semantics apply:

  • Appendable objects appear in the bucket namespace as soon as you start writing to them and can be read while still being written.

  • There are no restrictions on the number of appends you can make to an object or the number of bytes you can append at a time. You can make appends up until an object reaches its maximum size of 5 TiB.

  • An object's size grows in real time as new appends are permanently written or flushed. When you establish a read stream, expect a brief delay before the object's reported size reflects the most recent appends.

  • Appendable objects can only have one writer at a time. If a new write stream is established for an object that already has an existing write stream, an error is returned from Cloud Storage to the original stream, and the original stream will no longer be permitted to write. The new writer can resume appending from the last persisted offset without other interleaved appends to the object.
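The single-writer rule above can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not the Cloud Storage API: the names `AppendableObject`, `Writer`, and `StaleWriterError` are hypothetical. One simplification is that the superseded writer learns about the takeover on its next append attempt, rather than being notified when the new stream is opened.

```python
class StaleWriterError(Exception):
    """Hypothetical error raised when a superseded writer tries to append."""


MAX_OBJECT_SIZE = 5 * 1024**4  # 5 TiB, the maximum object size stated above


class AppendableObject:
    """Toy in-memory stand-in for an appendable object (not a real API)."""

    def __init__(self):
        self._data = bytearray()
        self._generation = 0  # identifies the currently valid write stream

    def open_writer(self):
        # Opening a new write stream invalidates any earlier one.
        self._generation += 1
        return Writer(self, self._generation)

    @property
    def persisted_size(self):
        return len(self._data)

    def read(self, offset=0, length=None):
        # Readers can read while the object is still being written.
        end = None if length is None else offset + length
        return bytes(self._data[offset:end])


class Writer:
    """Toy write stream bound to one generation of an AppendableObject."""

    def __init__(self, obj, generation):
        self._obj = obj
        self._generation = generation

    def append(self, chunk):
        if self._generation != self._obj._generation:
            raise StaleWriterError("a newer write stream has taken over")
        if self._obj.persisted_size + len(chunk) > MAX_OBJECT_SIZE:
            raise ValueError("append would exceed the 5 TiB maximum size")
        self._obj._data.extend(chunk)
        # Return the persisted offset at which the next append starts.
        return self._obj.persisted_size


if __name__ == "__main__":
    obj = AppendableObject()
    first = obj.open_writer()
    first.append(b"abc")
    second = obj.open_writer()  # takes over from `first`
    second.append(b"def")       # resumes at the last persisted offset
    try:
        first.append(b"zzz")
    except StaleWriterError as err:
        print("original writer rejected:", err)
    print(obj.read())
```

Because only the latest writer can append, the second writer's data always follows the last persisted byte, with nothing interleaved from the first.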

Finalizing objects

After an object is finalized, you can no longer append to it, but you can still overwrite the object with a new version. The metadata of a finalized object is still mutable; for example, new tags can be added and the object can be renamed.

Mounting zonal buckets

You can mount and access zonal buckets by using Cloud Storage FUSE or the Cloud Storage FUSE CSI driver. Make sure to use Cloud Storage FUSE version 3.7.2 or later. To use the Cloud Storage FUSE CSI driver, ensure that your Google Kubernetes Engine version is 1.35.0-gke.3047001 or later.
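As a rough aid, the version requirements above can be checked with a small helper like the following sketch. The parsing functions are hypothetical, and the sketch assumes that comparing versions as tuples of integers (including the number after `-gke.`) matches the actual release ordering. How you obtain the version strings is left to your environment.

```python
import re

# Minimum versions stated above.
MIN_FUSE_VERSION = (3, 7, 2)
MIN_GKE_VERSION = (1, 35, 0, 3047001)


def parse_fuse_version(text):
    """Parse a version such as '3.7.2' (optionally 'v3.7.2') into a tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Cloud Storage FUSE version: {text!r}")
    return tuple(int(part) for part in match.groups())


def parse_gke_version(text):
    """Parse a version such as '1.35.0-gke.3047001' into a tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized GKE version: {text!r}")
    return tuple(int(part) for part in match.groups())


def fuse_supports_zonal_buckets(version_text):
    return parse_fuse_version(version_text) >= MIN_FUSE_VERSION


def gke_supports_csi_zonal_buckets(version_text):
    return parse_gke_version(version_text) >= MIN_GKE_VERSION
```

Comparing integer tuples avoids the pitfall of string comparison, where "3.10.0" would sort before "3.7.2".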

Pricing

Using Rapid Bucket incurs charges for data storage, operations, and networking. For more information, see Pricing.

Limitations

  • Zonal buckets must have hierarchical namespace and uniform bucket-level access enabled.

  • Google Cloud CLI limitations:

    • Visibility of incomplete uploads: Unlike buckets in other storage classes, where objects only appear in the namespace after an upload completes, partially uploaded objects in zonal buckets are immediately visible. If a Google Cloud CLI upload command fails or is interrupted, you might see incomplete objects in your bucket. You can still resume these uploads by re-running the command.

    • Object overwrites: Standard Google Cloud CLI behavior applies to zonal buckets: if a file or object with the same name already exists at the destination, the Google Cloud CLI cp, mv, and rsync commands overwrite it by default. To prevent overwrites, use the --no-clobber flag. The Google Cloud CLI doesn't support appending data to an existing object; the entire source must be re-uploaded.

    • Object finalization: Objects uploaded to a zonal bucket by using the Google Cloud CLI might occasionally experience a brief delay before the object's metadata is fully synchronized. Because Cloud Storage uses an eventually consistent model, attempting to download an object immediately after upload can result in a hash mismatch error if the metadata is not yet updated.

      If a download fails with a hash mismatch error shortly after an upload, retry the command. The system ensures that downloads either succeed in full or fail explicitly; partial or corrupted downloads won't occur silently.
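The retry advice for hash mismatch errors can be automated with a small wrapper. The following is a minimal sketch under stated assumptions: `HashMismatchError` is a hypothetical exception type standing in for however your tooling reports the failure, and `download` is any callable you supply that performs the download and raises that error on a mismatch.

```python
import time


class HashMismatchError(Exception):
    """Hypothetical error type standing in for a failed integrity check."""


def download_with_retry(download, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call download() until it succeeds or the attempts run out.

    Only hash mismatch errors are retried; any other error propagates
    immediately. The wait grows linearly with the attempt number.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except HashMismatchError:
            if attempt == attempts:
                raise  # give up and surface the explicit failure
            sleep(delay_seconds * attempt)
```

Because a failed download is always reported explicitly, retrying only on that specific error is safe: the wrapper never returns partial data.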

Incompatibilities

Zonal buckets are incompatible with the following tools, operations, and products:

  • Tools

    • XML API multipart uploads

    • Writes using the XML API or the JSON API

    • Writes for non-appendable objects by using gRPC

  • Data protection and disaster recovery

    • Object Versioning

    • Soft delete

  • Data management

    • Anywhere Cache

    • Autoclass

    • Bucket Lock

    • Composing objects

    • Object holds

    • The Object Lifecycle Management SetStorageClass action

    • The Object Lifecycle Management Delete action

    • Object Retention Lock

    • Pub/Sub notifications

    • Relocating buckets

    • Resumable uploads

    • Rewriting objects

    • Requester Pays

  • Access control

    • Object-level access control lists (ACLs)

    • CORS configurations

    • Customer-supplied encryption key (CSEK)

    • HMAC keys

Quotas

Each zone per project has a storage bytes quota. Each zone per project also has an egress quota from Cloud Storage to Google services. To see how much storage or data egress quota is available, refer to the Quotas & System Limits page. To learn how to request more quota, see Manage your quotas.

Best practices

To help optimize performance when using zonal buckets with Cloud Storage FUSE, keep a file handle open to each mounted object and reuse it for multiple operations. Reusing the handle improves performance because it lets Cloud Storage FUSE avoid unnecessary network round trips on repeated reads.
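As an illustration of reusing one handle, the following sketch reads a Parquet file's footer and then a byte range through a single file descriptor, mirroring the footer-then-rows pattern described earlier. It assumes the object is reachable as an ordinary file, for example under a Cloud Storage FUSE mount point of your choosing, and it relies on the standard Parquet layout in which the file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`. The function name is hypothetical, and `os.pread` is available only on Unix-like systems.

```python
import os
import struct


def read_footer_and_range(path, offset, length):
    """Read the Parquet footer and one byte range using a single descriptor."""
    fd = os.open(path, os.O_RDONLY)  # opened once, reused for every read below
    try:
        size = os.fstat(fd).st_size
        # The last 8 bytes are: footer length (uint32, little-endian) + b"PAR1".
        tail = os.pread(fd, 8, size - 8)
        if len(tail) != 8 or tail[4:] != b"PAR1":
            raise ValueError("file does not end with the Parquet magic bytes")
        (footer_len,) = struct.unpack("<I", tail[:4])
        footer = os.pread(fd, footer_len, size - 8 - footer_len)
        # Subsequent ranged read on the same descriptor. A single pread call
        # may return fewer bytes than requested; production code would loop.
        data = os.pread(fd, length, offset)
        return footer, data
    finally:
        os.close(fd)
```

In a longer-lived process, you would typically keep the descriptor open across many such ranged reads instead of closing it after each call.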

What's next