Upload multibagian XML API

Halaman ini membahas upload multibagian XML API di Cloud Storage. Metode upload ini mengupload file dalam beberapa bagian, lalu menyusunnya menjadi satu objek menggunakan permintaan akhir. Upload multibagian XML API kompatibel dengan upload multibagian Amazon S3.

Ringkasan

Upload multibagian XML API memungkinkan Anda mengupload data dalam beberapa bagian, lalu menyusunnya menjadi objek akhir. Perilaku ini memiliki beberapa keuntungan, terutama untuk file berukuran besar:

  • Anda dapat mengupload beberapa bagian secara bersamaan, sehingga mengurangi waktu yang diperlukan untuk mengupload data secara keseluruhan.

  • Jika salah satu operasi upload gagal, Anda hanya perlu mengupload ulang sebagian dari keseluruhan objek, tidak perlu memulai ulang dari awal.

  • Karena total ukuran file tidak ditentukan di awal, Anda dapat menggunakan upload multibagian XML API untuk upload streaming atau untuk mengompresi data dengan cepat saat mengupload.

Upload multibagian XML API memiliki tiga langkah yang diperlukan:

  1. Mulai upload menggunakan permintaan POST, yang mencakup penentuan metadata apa pun yang seharusnya dimiliki objek yang sudah selesai. Respons akan menampilkan UploadId yang Anda gunakan di semua permintaan berikutnya yang terkait dengan upload.

  2. Upload data menggunakan satu atau beberapa permintaan PUT.

  3. Selesaikan upload menggunakan permintaan POST. Permintaan ini akan menimpa objek apa pun yang ada di bucket dengan nama yang sama.

Tidak ada batasan durasi upload multibagian dan bagian yang diuploadnya tetap dalam status tidak selesai atau tidak ada aktivitas dalam bucket.

Pertimbangan

Batasan berikut berlaku untuk penggunaan upload multibagian XML API:

  • Ada batas ukuran minimum suatu bagian, ukuran maksimum suatu bagian, dan jumlah bagian yang digunakan untuk menyusun upload yang selesai.
  • Prasyarat tidak didukung dalam permintaan.
  • Hash MD5 tidak ada untuk objek yang diupload menggunakan metode ini.
  • Metode upload ini tidak didukung di konsol Google Cloud atau Google Cloud CLI.

Perhatikan hal berikut saat menggunakan upload multibagian XML API:

Cara library klien menggunakan upload multibagian XML API

Bagian ini memberikan informasi tentang cara melakukan upload multibagian XML API dengan library klien yang mendukungnya.

Library klien

Java

Untuk mengetahui informasi selengkapnya, lihatDokumentasi referensi Cloud Storage Java API.

Untuk melakukan autentikasi ke Cloud Storage, siapkan Kredensial Default Aplikasi. Untuk mengetahui informasi selengkapnya, lihat Menyiapkan autentikasi untuk library klien.

Contoh berikut memulai upload multibagian:


import com.google.cloud.storage.HttpStorageOptions;
import com.google.cloud.storage.MultipartUploadClient;
import com.google.cloud.storage.MultipartUploadSettings;
import com.google.cloud.storage.multipartupload.model.CreateMultipartUploadRequest;
import com.google.cloud.storage.multipartupload.model.CreateMultipartUploadResponse;
import java.util.HashMap;
import java.util.Map;

public class CreateMultipartUpload {
  public static void createMultipartUpload(String projectId, String bucketName, String objectName) {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String sourceBucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String sourceObjectName = "your-object-name";

    HttpStorageOptions storageOptions =
        HttpStorageOptions.newBuilder().setProjectId(projectId).build();
    MultipartUploadSettings mpuSettings = MultipartUploadSettings.of(storageOptions);
    MultipartUploadClient mpuClient = MultipartUploadClient.create(mpuSettings);

    System.out.println("Initiating multipart upload for " + objectName);

    Map<String, String> metadata = new HashMap<>();
    metadata.put("key1", "value1");
    String contentType = "text/plain";
    CreateMultipartUploadRequest createRequest =
        CreateMultipartUploadRequest.builder()
            .bucket(bucketName)
            .key(objectName)
            .metadata(metadata)
            .contentType(contentType)
            .build();

    CreateMultipartUploadResponse createResponse = mpuClient.createMultipartUpload(createRequest);
    String uploadId = createResponse.uploadId();
    System.out.println("Upload ID: " + uploadId);
  }
}

Contoh berikut menunjukkan proses upload bagian objek individual:


import com.google.cloud.storage.HttpStorageOptions;
import com.google.cloud.storage.MultipartUploadClient;
import com.google.cloud.storage.MultipartUploadSettings;
import com.google.cloud.storage.RequestBody;
import com.google.cloud.storage.multipartupload.model.UploadPartRequest;
import com.google.cloud.storage.multipartupload.model.UploadPartResponse;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Random;

public class UploadPart {
  public static void uploadPart(
      String projectId, String bucketName, String objectName, String uploadId, int partNumber)
      throws IOException {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The ID of the multipart upload
    // String uploadId = "your-upload-id";

    // The part number of the part being uploaded
    // int partNumber = 1;

    HttpStorageOptions storageOptions =
        HttpStorageOptions.newBuilder().setProjectId(projectId).build();
    MultipartUploadSettings mpuSettings = MultipartUploadSettings.of(storageOptions);
    MultipartUploadClient mpuClient = MultipartUploadClient.create(mpuSettings);

    // The minimum part size for a multipart upload is 5 MiB, except for the last part.
    byte[] bytes = new byte[5 * 1024 * 1024];
    new Random().nextBytes(bytes);
    RequestBody requestBody = RequestBody.of(ByteBuffer.wrap(bytes));

    System.out.println("Uploading part " + partNumber);
    UploadPartRequest uploadPartRequest =
        UploadPartRequest.builder()
            .bucket(bucketName)
            .key(objectName)
            .partNumber(partNumber)
            .uploadId(uploadId)
            .build();

    UploadPartResponse uploadPartResponse = mpuClient.uploadPart(uploadPartRequest, requestBody);

    System.out.println("Part " + partNumber + " uploaded with ETag: " + uploadPartResponse.eTag());
  }
}

Contoh berikut mencantumkan bagian objek:


import com.google.cloud.storage.HttpStorageOptions;
import com.google.cloud.storage.MultipartUploadClient;
import com.google.cloud.storage.MultipartUploadSettings;
import com.google.cloud.storage.multipartupload.model.ListPartsRequest;
import com.google.cloud.storage.multipartupload.model.ListPartsResponse;
import com.google.cloud.storage.multipartupload.model.Part;

public class ListParts {
  public static void listParts(
      String projectId, String bucketName, String objectName, String uploadId) {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The ID of the multipart upload
    // String uploadId = "your-upload-id";

    HttpStorageOptions storageOptions =
        HttpStorageOptions.newBuilder().setProjectId(projectId).build();
    MultipartUploadSettings mpuSettings = MultipartUploadSettings.of(storageOptions);
    MultipartUploadClient mpuClient = MultipartUploadClient.create(mpuSettings);

    System.out.println("Listing parts for upload ID: " + uploadId);

    ListPartsRequest listPartsRequest =
        ListPartsRequest.builder().bucket(bucketName).key(objectName).uploadId(uploadId).build();

    ListPartsResponse listPartsResponse = mpuClient.listParts(listPartsRequest);

    if (listPartsResponse.parts() == null || listPartsResponse.parts().isEmpty()) {
      System.out.println("No parts have been uploaded yet.");
      return;
    }

    System.out.println("Uploaded Parts:");
    for (Part part : listPartsResponse.parts()) {
      System.out.println("  - Part Number: " + part.partNumber());
      System.out.println("    ETag: " + part.eTag());
      System.out.println("    Size: " + part.size() + " bytes");
      System.out.println("    Last Modified: " + part.lastModified());
    }
  }
}

Contoh berikut menyelesaikan upload multibagian:


import com.google.cloud.storage.HttpStorageOptions;
import com.google.cloud.storage.MultipartUploadClient;
import com.google.cloud.storage.MultipartUploadSettings;
import com.google.cloud.storage.multipartupload.model.CompleteMultipartUploadRequest;
import com.google.cloud.storage.multipartupload.model.CompleteMultipartUploadResponse;
import com.google.cloud.storage.multipartupload.model.CompletedMultipartUpload;
import com.google.cloud.storage.multipartupload.model.CompletedPart;
import java.util.List;

public class CompleteMultipartUpload {
  public static void completeMultipartUpload(
      String projectId,
      String bucketName,
      String objectName,
      String uploadId,
      List<CompletedPart> completedParts) {

    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The ID of the multipart upload
    // String uploadId = "your-upload-id";

    // The list of completed parts from the UploadPart responses.
    // List<CompletedPart> completedParts = ...;

    HttpStorageOptions storageOptions =
        HttpStorageOptions.newBuilder().setProjectId(projectId).build();
    MultipartUploadSettings mpuSettings = MultipartUploadSettings.of(storageOptions);
    MultipartUploadClient mpuClient = MultipartUploadClient.create(mpuSettings);

    System.out.println("Completing multipart upload for " + objectName);

    CompletedMultipartUpload completedMultipartUpload =
        CompletedMultipartUpload.builder().parts(completedParts).build();

    CompleteMultipartUploadRequest completeRequest =
        CompleteMultipartUploadRequest.builder()
            .bucket(bucketName)
            .key(objectName)
            .uploadId(uploadId)
            .multipartUpload(completedMultipartUpload)
            .build();

    CompleteMultipartUploadResponse completeResponse =
        mpuClient.completeMultipartUpload(completeRequest);

    System.out.println(
        "Upload complete for "
            + completeResponse.key()
            + " in bucket "
            + completeResponse.bucket());
    System.out.println("Final ETag: " + completeResponse.etag());
  }
}

Node.js

Untuk mengetahui informasi selengkapnya, lihatDokumentasi referensi Cloud Storage Node.js API.

Untuk melakukan autentikasi ke Cloud Storage, siapkan Kredensial Default Aplikasi. Untuk mengetahui informasi selengkapnya, lihat Menyiapkan autentikasi untuk library klien.

Anda dapat melakukan upload multibagian XML API menggunakan metode uploadFileInChunks. Contoh:

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of your GCS bucket
// const bucketName = 'your-unique-bucket-name';

// The path of file to upload
// const filePath = 'path/to/your/file';

// The size of each chunk to be uploaded
// const chunkSize = 32 * 1024 * 1024;

// Imports the Google Cloud client library
const {Storage, TransferManager} = require('@google-cloud/storage');

// Creates a client
const storage = new Storage();

// Creates a transfer manager client
const transferManager = new TransferManager(storage.bucket(bucketName));

async function uploadFileInChunksWithTransferManager() {
  // Uploads the files
  await transferManager.uploadFileInChunks(filePath, {
    chunkSizeBytes: chunkSize,
  });

  console.log(`${filePath} uploaded to ${bucketName}.`);
}

uploadFileInChunksWithTransferManager().catch(console.error);

Python

Untuk mengetahui informasi selengkapnya, lihatDokumentasi referensi Cloud Storage Python API.

Untuk melakukan autentikasi ke Cloud Storage, siapkan Kredensial Default Aplikasi. Untuk mengetahui informasi selengkapnya, lihat Menyiapkan autentikasi untuk library klien.

Anda dapat melakukan upload multibagian XML API menggunakan metode upload_chunks_concurrently. Contoh:

def upload_chunks_concurrently(
    bucket_name,
    source_filename,
    destination_blob_name,
    chunk_size=32 * 1024 * 1024,
    workers=8,
):
    """Upload a single file, in chunks, concurrently in a process pool."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The path to your file to upload
    # source_filename = "local/path/to/file"

    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"

    # The size of each chunk. The performance impact of this value depends on
    # the use case. The remote service has a minimum of 5 MiB and a maximum of
    # 5 GiB.
    # chunk_size = 32 * 1024 * 1024 (32 MiB)

    # The maximum number of processes to use for the operation. The performance
    # impact of this value depends on the use case. Each additional process
    # occupies some CPU and memory resources until finished. Threads can be used
    # instead of processes by passing `worker_type=transfer_manager.THREAD`.
    # workers=8

    from google.cloud.storage import Client, transfer_manager

    storage_client = Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    transfer_manager.upload_chunks_concurrently(
        source_filename, blob, chunk_size=chunk_size, max_workers=workers
    )

    print(f"File {source_filename} uploaded to {destination_blob_name}.")


if __name__ == "__main__":
    argparse = argparse.ArgumentParser(
        description="Upload a file to GCS in chunks concurrently."
    )
    argparse.add_argument(
        "--bucket_name", help="The name of the GCS bucket to upload to."
    )
    argparse.add_argument(
        "--source_filename", help="The local path to the file to upload."
    )
    argparse.add_argument(
        "--destination_blob_name", help="The name of the object in GCS."
    )
    argparse.add_argument(
        "--chunk_size",
        type=int,
        default=32 * 1024 * 1024,
        help="The size of each chunk in bytes (default: 32 MiB). The remote\
              service has a minimum of 5 MiB and a maximum of 5 GiB",
    )
    argparse.add_argument(
        "--workers",
        type=int,
        default=8,
        help="The number of worker processes to use (default: 8).",
    )
    args = argparse.parse_args()
    upload_chunks_concurrently(
        args.bucket_name,
        args.source_filename,
        args.destination_blob_name,
        args.chunk_size,
        args.workers,
    )

Langkah berikutnya