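Multimodal datasets

You can create, manage, share, and use multimodal datasets for generative AI on Agent Platform. Multimodal datasets provide the following key features:

You can load datasets from BigQuery, DataFrames, or JSONL files in Cloud Storage.
Create a dataset once and use it for different job types, such as supervised fine-tuning and batch prediction, which avoids data duplication and formatting issues.
Keep all of your generative AI datasets in a single managed location.
Validate your schema and structure and quantify the resources needed for downstream tasks, which helps you catch errors and estimate costs before you start a job.

You can use multimodal datasets through the Agent Platform SDK or the REST API.

Multimodal datasets are a type of managed dataset on Agent Platform. They differ from other types of managed datasets in the following ways:

Multimodal datasets can include data of any modality (text, image, audio, video). Other types of managed datasets are for a single modality only.
Multimodal datasets can only be used for generative AI services on Agent Platform, such as tuning and batch prediction with generative models. Other managed dataset types can only be used for Agent Platform predictive models.
Multimodal datasets support additional methods, such as assemble and assess, which are used to preview data, validate requests, and estimate costs.
Multimodal datasets are stored in BigQuery, which is optimized for large datasets.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.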

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Agent Platform, BigQuery, and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the APIs

Install and initialize the Agent Platform SDK for Python.

Import the following libraries:

from google.cloud.aiplatform.preview import datasets

# To use related features, you may also need to import some of the following features:
from vertexai.preview.tuning import sft
from vertexai.batch_prediction import BatchPredictionJob

from vertexai.generative_models import Content, Part, Tool, ToolConfig, SafetySetting, GenerationConfig, FunctionDeclaration

Create a dataset

You can create a multimodal dataset from different sources:

From a pandas DataFrame:

my_dataset = datasets.MultimodalDataset.from_pandas(
    dataframe=my_dataframe,
    target_table_id=table_id    # optional
)

From a BigQuery DataFrame:

my_dataset = datasets.MultimodalDataset.from_bigframes(
    dataframe=my_dataframe,
    target_table_id=table_id    # optional
)

From a BigQuery table:

my_dataset_from_bigquery = datasets.MultimodalDataset.from_bigquery(
    bigquery_uri="bq://projectId.datasetId.tableId"
)
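
The bigquery_uri value in the sample has the form bq://PROJECT.DATASET.TABLE. As a minimal sketch, a small helper can build that string and catch obviously malformed parts before you call from_bigquery. The helper name make_bigquery_uri and its character check are assumptions made for illustration; they are not part of the SDK, and the check is deliberately simpler than the real BigQuery naming rules.

```python
import re

# Hypothetical helper, not part of the SDK.
# Simplified check: letters, digits, underscores, and hyphens only.
_PART = re.compile(r"^[A-Za-z0-9_\-]+$")

def make_bigquery_uri(project_id: str, dataset_id: str, table_id: str) -> str:
    """Return a URI of the form bq://project.dataset.table."""
    for name, value in (
        ("project_id", project_id),
        ("dataset_id", dataset_id),
        ("table_id", table_id),
    ):
        # Reject empty parts and separators such as '.' or '/'.
        if not _PART.match(value):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"bq://{project_id}.{dataset_id}.{table_id}"

uri = make_bigquery_uri("my-project", "my_dataset", "flowers")
print(uri)
```

You could then pass uri as the bigquery_uri argument.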

From a BigQuery table, using the REST API:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/datasets" \
-d '{
  "display_name": "TestDataset",
  "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/dataset/metadata/multimodal_1.0.0.yaml",
  "metadata": {
    "inputConfig": {
      "bigquery_source": {
        "uri": "bq://projectId.datasetId.tableId"
      }
    }
  }
}'

From a JSONL file in Cloud Storage. In the following example, the JSONL file contains requests that are already formatted for Gemini, so no assembly is required.
```
my_dataset = datasets.MultimodalDataset.from_gemini_request_jsonl(
  gcs_uri = gcs_uri_of_jsonl_file,
)
```
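
To make the expected file shape concrete, the following sketch writes a local JSONL file with one Gemini-style request per line. The field names mirror the assembled request body shown later on this page; the exact schema that from_gemini_request_jsonl accepts is an assumption here, and make_request is a hypothetical helper.

```python
import json
import os
import tempfile

def make_request(image_uri: str, label: str) -> dict:
    """Build one request object (assumed schema, mirroring the assembled output)."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": "This is the image: "},
                    {"fileData": {"fileUri": image_uri, "mimeType": "image/jpeg"}},
                ],
            },
            {"role": "model", "parts": [{"text": label}]},
        ]
    }

rows = [
    (
        "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
        "daisy",
    ),
]

path = os.path.join(tempfile.mkdtemp(), "requests.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for image_uri, label in rows:
        # JSONL: exactly one JSON object per line.
        f.write(json.dumps(make_request(image_uri, label)) + "\n")

# Read the file back to confirm that every line parses on its own.
with open(path, encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f]
print(len(parsed), "request(s) written to", path)
```

After uploading the file to Cloud Storage, its gs:// URI is the value to pass as gcs_uri.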

From an existing multimodal dataset:

# Get the most recently created dataset
first_dataset = datasets.MultimodalDataset.list()[0]

# Load dataset based on its name
same_dataset = datasets.MultimodalDataset(first_dataset.name)

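Construct and attach a template

A template defines how to transform the multimodal dataset into a format that can be passed to the model. A template is required to run a tuning or batch prediction job.

Agent Platform SDK

Construct a template. There are two ways to construct a template:

Use the construct_single_turn_template helper method: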

template_config = datasets.construct_single_turn_template(
        prompt="This is the image: {image_uris}",
        response="{labels}",
        system_instruction='You are a botanical image classifier. Analyze the provided image '
                'and determine the most accurate classification of the flower.'
                'These are the only flower categories: [\'daisy\', \'dandelion\', \'roses\', \'sunflowers\', \'tulips\'].'
                'Return only one category per image.'
)

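Manually construct a template from a GeminiExample, which allows for finer-grained control, such as multi-turn conversations. The following code sample also includes optional commented-out code that specifies a field_mapping, which lets you use placeholder names that differ from the dataset's column names. For example: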

# Define a GeminiExample
gemini_example = datasets.GeminiExample(
    contents=[
        Content(role="user", parts=[Part.from_text("This is the image: {image_uris}")]),
        Content(role="model", parts=[Part.from_text("This is the flower class: {labels}.")]),
        Content(role="user", parts=[Part.from_text("Your response should only contain the class label.")]),
        Content(role="model", parts=[Part.from_text("{labels}")]),

        # Optional: If you specify a field_mapping, you can use different placeholder values. For example:
        # Content(role="user", parts=[Part.from_text("This is the image: {uri_placeholder}")]),
        # Content(role="model", parts=[Part.from_text("This is the flower class: {flower_placeholder}.")]),
        # Content(role="user", parts=[Part.from_text("Your response should only contain the class label.")]),
        # Content(role="model", parts=[Part.from_text("{flower_placeholder}")]),
    ],
    system_instruction=Content(
        parts=[
            Part.from_text(
                'You are a botanical image classifier. Analyze the provided image '
                'and determine the most accurate classification of the flower.'
                'These are the only flower categories: [\'daisy\', \'dandelion\', \'roses\', \'sunflowers\', \'tulips\'].'
                'Return only one category per image.'
            )
        ]
    ),
)

# construct the template, specifying a map for the placeholder
template_config = datasets.GeminiTemplateConfig(
    gemini_example=gemini_example,

    # Optional: Map the template placeholders to the column names of your dataset.
    # Not required if the template placeholders are column names of the dataset.
    # field_mapping={"uri_placeholder": "image_uris", "flower_placeholder": "labels"},
)

Attach it to the dataset:

my_dataset.attach_template_config(template_config=template_config)
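
Because every placeholder has to resolve to a dataset column, either directly or through field_mapping, it can help to check a template locally first. The following is a minimal sketch using only the standard library; placeholders and unresolved_placeholders are hypothetical helpers, and the resolution rule (field_mapping first, then the placeholder's own name) is an assumption about how the mapping behaves, not the service's implementation.

```python
from string import Formatter

def placeholders(template: str) -> set:
    """Return the {name} placeholders that appear in a template string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}

def unresolved_placeholders(templates, columns, field_mapping=None) -> set:
    """Return placeholders that resolve to no dataset column."""
    field_mapping = field_mapping or {}
    missing = set()
    for template in templates:
        for name in placeholders(template):
            # Assumed rule: use field_mapping if present, else the name itself.
            column = field_mapping.get(name, name)
            if column not in columns:
                missing.add(name)
    return missing

columns = {"image_uris", "labels"}
templates = ["This is the image: {uri_placeholder}", "{flower_placeholder}"]

# Without a mapping, neither placeholder matches a column name.
print(unresolved_placeholders(templates, columns))

# With the mapping from the sample above, both placeholders resolve.
mapping = {"uri_placeholder": "image_uris", "flower_placeholder": "labels"}
print(unresolved_placeholders(templates, columns, field_mapping=mapping))
```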

REST

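Call the patch method and update the metadata field with the following:

The URI of the BigQuery table. For datasets created from a BigQuery table, this is your source bigquery_uri. For datasets created from other sources, such as JSONL or a DataFrame, this is the BigQuery table that your data was copied to.
A gemini_template_config.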

curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-d $'{
  "metadata": {
    "input_config": {
      "bigquery_source": {
        "uri": "bq://projectId.datasetId.tableId"
      }
    },
    "gemini_template_config_source": {
      "gemini_template_config": {
        "gemini_example": {
          "contents": [
            {
              "role": "user",
              "parts": [
                {
                  "text": "This is the image: {image_uris}"

                }
              ]
            },
            {
              "role": "model",
              "parts": [
                {
                  "text": "response"
                }
              ]
            }
          ],
          "systemInstruction": {
            "parts": [
              {
                "text": "You are a botanical image classifier."
              }
            ]
          }
        }
      }
    }
  }
}' \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID?updateMask=metadata"

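Hand-written JSON in a shell command is easy to get wrong, for example with a missing comma between fields. As a sketch, the same PATCH body can be built as a Python dictionary and serialized with json.dumps, which always produces valid JSON. The LOCATION, PROJECT_ID, and DATASET_ID values are placeholders, and text_content is a hypothetical helper; the snippet only builds the URL and payload and sends nothing.

```python
import json
from urllib.parse import urlencode

# Placeholder values; replace them with your own.
LOCATION = "us-central1"
PROJECT_ID = "my-project"
DATASET_ID = "1234567890"

def text_content(role: str, text: str) -> dict:
    """Build one single-part text turn."""
    return {"role": role, "parts": [{"text": text}]}

body = {
    "metadata": {
        "input_config": {
            "bigquery_source": {"uri": "bq://projectId.datasetId.tableId"}
        },
        "gemini_template_config_source": {
            "gemini_template_config": {
                "gemini_example": {
                    "contents": [
                        text_content("user", "This is the image: {image_uris}"),
                        text_content("model", "response"),
                    ],
                    "systemInstruction": {
                        "parts": [{"text": "You are a botanical image classifier."}]
                    },
                }
            }
        },
    }
}

url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/"
    f"projects/{PROJECT_ID}/locations/{LOCATION}/datasets/{DATASET_ID}"
    f"?{urlencode({'updateMask': 'metadata'})}"
)

payload = json.dumps(body, indent=2)
print(url)
print(payload)
```

The payload string can be saved to a file and passed to curl with -d @FILE, which also avoids shell quoting issues.
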
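(Optional) Assemble the dataset

The assemble method applies the template to transform your dataset and stores the output in a new BigQuery table. This lets you preview the data before it's sent to the model.

By default, the dataset's attached template_config is used, but you can specify a template to override the default behavior.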

Agent Platform SDK

table_id, assembly = my_dataset.assemble(template_config=template_config)

# Inspect the results
assembly.head()

REST

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assemble" \
-d '{}'

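For example, assume that your multimodal dataset contains the following data:

Row	image_uris	labels
1	gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg	daisy

Then, the assemble method creates a new BigQuery table with the name table_id, in which each row contains the request body. For example: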

{
  "contents": [
    {
      "parts": [
        {
          "text": "This is the image: "
        },
        {
          "fileData": {
            "fileUri": "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
            "mimeType": "image/jpeg"
          }
        }
      ],
      "role": "user"
    },
    {
      "parts": [
        {
          "text": "daisy"
        }
      ],
      "role": "model"
    }
  ],
  "systemInstruction": {
    "parts": [
      {
        "text": "You are a botanical image classifier. Analyze the provided image and determine the most accurate classification of the flower.These are the only flower categories: ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips'].Return only one category per image."
      }
    ]
  }
}
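
To illustrate the transformation, the following sketch renders one dataset row into the contents portion of a request body locally. It is an approximation written for this page, not the service's implementation: the helpers add_text and render_parts are hypothetical, and the rules that gs:// values become fileData parts and that the MIME type is guessed from the file extension are assumptions inferred from the example output above.

```python
import mimetypes
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")

def add_text(parts: list, text: str) -> None:
    """Append text, merging it into the previous part if that is also text."""
    if not text:
        return
    if parts and "text" in parts[-1]:
        parts[-1]["text"] += text
    else:
        parts.append({"text": text})

def render_parts(template: str, row: dict) -> list:
    """Replace each {column} placeholder with the row's value."""
    parts = []
    position = 0
    for match in PLACEHOLDER.finditer(template):
        add_text(parts, template[position:match.start()])
        value = str(row[match.group(1)])
        if value.startswith("gs://"):
            # Assumption: Cloud Storage URIs become fileData parts.
            mime_type, _ = mimetypes.guess_type(value)
            parts.append({"fileData": {"fileUri": value, "mimeType": mime_type}})
        else:
            add_text(parts, value)
        position = match.end()
    add_text(parts, template[position:])
    return parts

row = {
    "image_uris": "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
    "labels": "daisy",
}

request = {
    "contents": [
        {"parts": render_parts("This is the image: {image_uris}", row), "role": "user"},
        {"parts": render_parts("{labels}", row), "role": "model"},
    ]
}
print(request)
```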

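Tune a model

You can tune Gemini models using a multimodal dataset.

(Optional) Validate the dataset

Assess the dataset to check whether it contains errors, such as dataset formatting errors or model errors.

Agent Platform SDK

Call assess_tuning_validity(). By default, the dataset's attached template_config is used, but you can specify a template to override the default behavior.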

# Attach template
my_dataset.attach_template_config(template_config=template_config)

# Validation for tuning
validation = my_dataset.assess_tuning_validity(
    model_name="gemini-2.5-flash",
    dataset_usage="SFT_TRAINING"
)

# Inspect validation result
validation.errors

REST

Call the assess method and provide a TuningValidationAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "tuningValidationAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash",
    "datasetUsage": "SFT_TRAINING"
  }
}'

(Optional) Estimate resource usage

Assess the dataset to get the token count and billable character count for your tuning job.

Agent Platform SDK

Call assess_tuning_resources():

# Resource estimation for tuning.
tuning_resources = my_dataset.assess_tuning_resources(
    model_name="gemini-2.5-flash"
)

print(tuning_resources)
# For example, TuningResourceUsageAssessmentResult(token_count=362688, billable_character_count=122000)

REST

Call the assess method and provide a TuningResourceUsageAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "tuningResourceUsageAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'

Run the tuning job

Agent Platform SDK

from vertexai.tuning import sft

sft_tuning_job = sft.train(
  source_model="gemini-2.5-flash",
  # Pass the Vertex Multimodal Datasets directly
  train_dataset=my_multimodal_dataset,
  validation_dataset=my_multimodal_validation_dataset,
)

Google Gen AI SDK

from google import genai
from google.genai.types import HttpOptions, CreateTuningJobConfig

client = genai.Client(http_options=HttpOptions(api_version="v1"))

tuning_job = client.tunings.tune(
  base_model="gemini-2.5-flash",
  # Pass the resource name of the Multimodal Dataset, not the dataset object
  training_dataset={
      "vertex_dataset_resource": my_multimodal_dataset.resource_name
  },
  # Optional
  config=CreateTuningJobConfig(
      tuned_model_display_name="Example tuning job"),
)

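For more information, see Create a tuning job.

Batch prediction

You can get batch predictions using a multimodal dataset.

(Optional) Validate the dataset

Assess the dataset to check whether it contains errors, such as dataset formatting errors or model errors.

Agent Platform SDK

Call assess_batch_prediction_validity(). By default, the dataset's attached template_config is used, but you can specify a template to override the default behavior.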

# Attach template
my_dataset.attach_template_config(template_config=template_config)

# Validation for batch prediction
validation = my_dataset.assess_batch_prediction_validity(
    model_name="gemini-2.5-flash"
)

# Inspect validation result
validation.errors

REST

Call the assess method and provide a batchPredictionValidationAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "batchPredictionValidationAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'

(Optional) Estimate resource usage

Assess the dataset to get the token count for your job.

Agent Platform SDK

Call assess_batch_prediction_resources():

batch_prediction_resources = my_dataset.assess_batch_prediction_resources(
    model_name="gemini-2.5-flash"
)

print(batch_prediction_resources)
# For example, BatchPredictionResourceUsageAssessmentResult(token_count=362688, audio_token_count=122000)

REST

Call the assess method and provide a batchPredictionResourceUsageAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "batchPredictionResourceUsageAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'
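
The four assess requests on this page differ only in the top-level config key and in whether datasetUsage is present. As a sketch, a single helper can build any of the four bodies. The function assess_body and the short kind names are hypothetical; the config keys and the modelName format are copied from the REST examples above.

```python
import json

# Config keys taken from the REST examples on this page.
ASSESS_CONFIGS = {
    "tuning_validation": "tuningValidationAssessmentConfig",
    "tuning_resources": "tuningResourceUsageAssessmentConfig",
    "batch_prediction_validation": "batchPredictionValidationAssessmentConfig",
    "batch_prediction_resources": "batchPredictionResourceUsageAssessmentConfig",
}

def assess_body(kind, project_id, location, model, dataset_usage=None) -> dict:
    """Build the JSON body for a datasets/DATASET_ID:assess request."""
    config = {
        "modelName": f"projects/{project_id}/locations/{location}/models/{model}"
    }
    if dataset_usage is not None:
        # Only the tuning validation example on this page sets datasetUsage.
        if kind != "tuning_validation":
            raise ValueError("datasetUsage is only shown for tuning validation")
        config["datasetUsage"] = dataset_usage
    return {ASSESS_CONFIGS[kind]: config}

tuning_body = assess_body(
    "tuning_validation", "PROJECT_ID", "LOCATION", "gemini-2.5-flash",
    dataset_usage="SFT_TRAINING",
)
batch_body = assess_body(
    "batch_prediction_resources", "PROJECT_ID", "LOCATION", "gemini-2.5-flash",
)
print(json.dumps(tuning_body, indent=2))
print(json.dumps(batch_body, indent=2))
```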

Run the batch prediction job

You can run a batch prediction job with a multimodal dataset by passing the BigQuery table_id of the assembled output:

Agent Platform SDK

from vertexai.batch_prediction import BatchPredictionJob

# The dataset needs an attached template_config for batch prediction
my_dataset.attach_template_config(template_config=template_config)

# assemble dataset to get assembly table id
assembly_table_id, _ = my_dataset.assemble()

batch_prediction_job = BatchPredictionJob.submit(
    source_model="gemini-2.5-flash",
    input_dataset=assembly_table_id,
)

Google Gen AI SDK

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Attach template_config and assemble dataset
my_dataset.attach_template_config(template_config=template_config)
assembly_table_id, _ = my_dataset.assemble()

job = client.batches.create(
    model="gemini-2.5-flash",
    src=assembly_table_id,
)

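For more information, see Request a batch prediction job.

Limitations

Multimodal datasets can only be used with generative AI features. They can't be used with non-generative AI features, such as AutoML training and custom training.
Multimodal datasets can only be used with Google models, such as Gemini. They can't be used with third-party models.

Pricing

When you tune a model or run a batch prediction job, you're billed for generative AI usage and for querying the dataset in BigQuery.

When you create, assemble, or assess a multimodal dataset, you're billed for storing and querying multimodal datasets in BigQuery. Specifically, the following operations use these underlying services:

Create dataset
- Datasets created from an existing BigQuery table or a DataFrame don't incur additional storage costs, because a logical view is used instead of storing another copy of the data.
- Datasets created from other sources copy the data into a new BigQuery table, which incurs BigQuery storage costs. For example, active logical storage costs $0.02 per GiB per month.
Assemble dataset
- This method creates a new BigQuery table that contains the complete dataset in model request format, which incurs BigQuery storage costs. For example, active logical storage costs $0.02 per GiB per month.
- This method also reads the dataset once, which incurs query costs in BigQuery. For example, on-demand compute pricing is $6.25 per TiB.
Assess dataset reads the dataset once, which incurs query costs in BigQuery. For example, on-demand compute pricing is $6.25 per TiB.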
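
As a rough worked example, the BigQuery portion of these costs can be estimated from the two example rates quoted above. This sketch ignores free tiers, regional price differences, and generative AI usage charges, and the function names are hypothetical; treat the result as a back-of-the-envelope figure only.

```python
GIB_PER_TIB = 1024
STORAGE_USD_PER_GIB_MONTH = 0.02  # active logical storage, example rate from this page
QUERY_USD_PER_TIB = 6.25          # on-demand compute, example rate from this page

def monthly_storage_cost(size_gib: float) -> float:
    """Cost of keeping size_gib in BigQuery for one month."""
    return size_gib * STORAGE_USD_PER_GIB_MONTH

def query_cost(scanned_gib: float, scans: int = 1) -> float:
    """Cost of reading scanned_gib from BigQuery the given number of times."""
    return scanned_gib / GIB_PER_TIB * QUERY_USD_PER_TIB * scans

def assemble_cost(dataset_gib: float, assembled_gib: float, months: float = 1.0) -> float:
    """assemble reads the dataset once and stores the assembled table."""
    return query_cost(dataset_gib) + monthly_storage_cost(assembled_gib) * months

# Example: a 200 GiB dataset whose assembled table is also about 200 GiB.
print(f"assemble, first month: ${assemble_cost(200, 200):.2f}")
print(f"one assess call:       ${query_cost(200):.2f}")
```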

Use the Pricing Calculator to generate a cost estimate based on your projected usage.
