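Multimodal datasets

Multimodal datasets on Agent Platform let you create, manage, share, and use multimodal datasets for generative AI. Multimodal datasets provide the following key features:

You can load datasets from BigQuery, DataFrames, or JSONL files in Cloud Storage.
Create a dataset once and use it across different job types, such as supervised fine-tuning and batch prediction. This prevents data duplication and formatting issues.
Keep all your generative AI datasets in a single managed location.
Validate the schema and structure and quantify the resources needed for downstream tasks, so that you can catch errors and estimate costs before you start a task.

You can use multimodal datasets through the Agent Platform SDK or the REST API.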

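Multimodal datasets are a type of managed dataset on Agent Platform. They differ from other types of managed datasets in the following ways:

Multimodal datasets can contain data of any modality (text, image, audio, video). Other types of managed datasets are for a single modality only.
Multimodal datasets can be used only with generative AI services on Agent Platform, such as tuning and batch prediction with generative models. Other managed dataset types can be used only with Agent Platform predictive models.
Multimodal datasets support additional methods, such as assemble and assess, that are used to preview data, validate requests, and estimate costs.
Multimodal datasets are stored in BigQuery, which is optimized for large datasets.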

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how Google products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Agent Platform, BigQuery, and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the serviceusage.services.enable permission. If you created the project, then you likely already have this permission through the Owner role (roles/owner). Otherwise, you can get this permission through the Service Usage Admin role (roles/serviceusage.serviceUsageAdmin). Learn how to grant roles.

Enable the APIs

Install and initialize the Agent Platform SDK for Python

Import the following libraries and create a client:

import agentplatform
from agentplatform.types import (
    GeminiExample,
    GeminiRequestReadConfig,
    GeminiTemplateConfig,
)

# To use related features, such as tuning and batch prediction, you may also
# need to import the Google Gen AI SDK:
from google import genai
from google.genai.types import Content, Part

# Create a client for multimodal dataset operations.
client = agentplatform.Client(project="PROJECT_ID", location="LOCATION")

Create a dataset

You can create a multimodal dataset from different sources.

From a pandas DataFrame

my_dataset = client.datasets.create_from_pandas(
    dataframe=my_dataframe,
    target_table_id=table_id    # optional
)

From a BigQuery DataFrame

my_dataset = client.datasets.create_from_bigframes(
    dataframe=my_dataframe,
    target_table_id=table_id    # optional
)

From a BigQuery table

my_dataset_from_bigquery = client.datasets.create_from_bigquery(
    bigquery_uri="bq://projectId.datasetId.tableId"
)
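
The bigquery_uri argument follows the bq://projectId.datasetId.tableId pattern shown above. As a minimal sketch, a small helper can catch a malformed URI before it reaches the API. Note that parse_bq_uri and BigQueryTable are hypothetical names introduced for illustration, not part of the SDK, and the helper assumes that none of the three components contains a dot.

```python
from typing import NamedTuple


class BigQueryTable(NamedTuple):
    """The three components of a bq:// table URI (hypothetical helper type)."""
    project: str
    dataset: str
    table: str


def parse_bq_uri(uri: str) -> BigQueryTable:
    """Split a bq://project.dataset.table URI into its three components."""
    prefix = "bq://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}, got {uri!r}")
    parts = uri[len(prefix):].split(".")
    # Assumption: exactly three non-empty, dot-free components.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected bq://project.dataset.table, got {uri!r}")
    return BigQueryTable(*parts)


print(parse_bq_uri("bq://projectId.datasetId.tableId"))
```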

From a BigQuery table using the REST API

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets" \
-d '{
  "display_name": "TestDataset",
  "metadataSchemaUri": "gs://google-cloud-aiplatform/schema/dataset/metadata/multimodal_1.0.0.yaml",
  "metadata": {
    "inputConfig": {
      "bigquery_source": {
        "uri": "bq://projectId.datasetId.tableId"
      }
    }
  }
}'

From a JSONL file in Cloud Storage. In the following example, the JSONL file contains requests that are already formatted for Gemini, so no assembly is needed.
```
my_dataset = client.datasets.create_from_gemini_request_jsonl(
  gcs_uri = gcs_uri_of_jsonl_file,
)
```
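
To make the JSONL input more concrete, the following sketch writes such a file locally with one JSON object per line. It assumes that each line is a request body in the same shape as the assembled request shown later on this page; the exact schema the service expects is not stated here, so treat the structure as an assumption. The helper build_request and the file name requests.jsonl are hypothetical, and uploading the file to Cloud Storage is a separate step that is not shown.

```python
import json
from pathlib import Path


def build_request(image_uri: str, label: str) -> dict:
    """Build one request: a user turn with an image and a model turn with the label."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": "This is the image: "},
                    {"fileData": {"fileUri": image_uri, "mimeType": "image/jpeg"}},
                ],
            },
            {"role": "model", "parts": [{"text": label}]},
        ]
    }


rows = [
    ("gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg", "daisy"),
]

# JSONL means exactly one JSON object per line, with no enclosing array.
output_path = Path("requests.jsonl")
with output_path.open("w", encoding="utf-8") as f:
    for image_uri, label in rows:
        f.write(json.dumps(build_request(image_uri, label)) + "\n")
```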

From an existing multimodal dataset

# Load dataset based on its name. This accepts a full resource name or a
# dataset ID.
same_dataset = client.datasets.get_multimodal_dataset(name=dataset_name)

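Create and apply a read configuration

A read configuration (GeminiRequestReadConfig) defines how to convert a multimodal dataset into a format that can be passed to a model. It contains a template with placeholders that are replaced at assembly time with the values of the corresponding dataset columns. A read configuration is required to run tuning jobs or batch prediction jobs.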
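
The placeholder mechanism can be pictured with plain Python string formatting. This is only a sketch of the idea, not the service's implementation: render_template is a hypothetical helper, and the real assembly step does more than textual substitution (for example, Cloud Storage URIs appear as file data parts in the assembled output).

```python
def render_template(template: str, row: dict) -> str:
    """Replace each {column} placeholder with the value from one dataset row."""
    return template.format_map(row)


# One dataset row, keyed by column name.
row = {
    "image_uris": "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
    "labels": "daisy",
}

prompt = render_template("This is the image: {image_uris}", row)
response = render_template("{labels}", row)

print(prompt)
print(response)  # daisy
```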

Agent Platform SDK

Create a read configuration. There are two ways to do this:

Use the GeminiRequestReadConfig.single_turn_template helper method.

read_config = GeminiRequestReadConfig.single_turn_template(
        prompt="This is the image: {image_uris}",
        response="{labels}",
        system_instruction='You are a botanical image classifier. Analyze the provided image '
                'and determine the most accurate classification of the flower. '
                'These are the only flower categories: [\'daisy\', \'dandelion\', \'roses\', \'sunflowers\', \'tulips\']. '
                'Return only one category per image.'
)

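Build the read configuration manually from a GeminiExample. This gives you finer-grained control, such as multi-turn conversations. The following code sample also includes commented-out code for specifying a field_mapping, which lets you use placeholder names that differ from the dataset's column names. For example: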

# Define a GeminiExample
gemini_example = GeminiExample(
  contents=[
      Content(role="user", parts=[Part.from_text(text="This is the image: {image_uris}")]),
      Content(role="model", parts=[Part.from_text(text="This is the flower class: {labels}.")]),
      Content(role="user", parts=[Part.from_text(text="Your response should only contain the class label.")]),
      Content(role="model", parts=[Part.from_text(text="{labels}")]),

      # Optional: If you specify a field_mapping, you can use different placeholder values. For example:
      # Content(role="user", parts=[Part.from_text(text="This is the image: {uri_placeholder}")]),
      # Content(role="model", parts=[Part.from_text(text="This is the flower class: {flower_placeholder}.")]),
      # Content(role="user", parts=[Part.from_text(text="Your response should only contain the class label.")]),
      # Content(role="model", parts=[Part.from_text(text="{flower_placeholder}")]),
  ],
  system_instruction=Content(
      parts=[
          Part.from_text(
              text='You are a botanical image classifier. Analyze the provided image '
              'and determine the most accurate classification of the flower. '
              'These are the only flower categories: [\'daisy\', \'dandelion\', \'roses\', \'sunflowers\', \'tulips\']. '
              'Return only one category per image.'
          )
      ]
  ),
)

# Construct the read config, specifying a map for the placeholders.
read_config = GeminiRequestReadConfig(
    template_config=GeminiTemplateConfig(
        gemini_example=gemini_example,

        # Optional: Map the template placeholders to the column names of your dataset.
        # Not required if the template placeholders are column names of the dataset.
        # field_mapping={"uri_placeholder": "image_uris", "flower_placeholder": "labels"},
    ),
)
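
Because every placeholder must resolve to a dataset column, either directly or through field_mapping, it can help to check a template locally before attaching it. The following sketch does that with the standard library only. The function names are hypothetical, and the resolution rule (use the mapping if the placeholder is listed there, otherwise treat the placeholder as a column name) is an assumption based on the comments in the sample above.

```python
from string import Formatter


def placeholders(template: str) -> set:
    """Return the placeholder names that appear as {name} in a template string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def unresolved_placeholders(templates, columns, field_mapping=None):
    """Return the placeholders that do not resolve to any dataset column."""
    field_mapping = field_mapping or {}
    missing = set()
    for template in templates:
        for name in placeholders(template):
            # Assumed rule: mapped placeholders use the mapped column,
            # unmapped placeholders must be column names themselves.
            column = field_mapping.get(name, name)
            if column not in columns:
                missing.add(name)
    return missing


columns = {"image_uris", "labels"}
templates = [
    "This is the image: {uri_placeholder}",
    "This is the flower class: {flower_placeholder}.",
]

# Without a mapping, neither placeholder is a column name.
print(unresolved_placeholders(templates, columns))

# With the mapping from the sample above, every placeholder resolves.
mapping = {"uri_placeholder": "image_uris", "flower_placeholder": "labels"}
print(unresolved_placeholders(templates, columns, mapping))  # set()
```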

Attach the read configuration to the dataset and persist the change.

my_dataset.set_read_config(read_config=read_config)
my_dataset = client.datasets.update_multimodal_dataset(multimodal_dataset=my_dataset)

REST

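Call the patch method and update the metadata field with the following:

The URI of the BigQuery table. For datasets created from a BigQuery table, this is the source bigquery_uri. For datasets created from other sources, such as JSONL or a DataFrame, this is the BigQuery table that the data was copied to.
A gemini_template_config.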
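
Deeply nested JSON is easy to get wrong when written by hand, so one option is to build the request body in Python and let the json module serialize it. The sketch below reproduces the body used by the curl command that follows; the field names are copied from that command, and the output file name patch_body.json is a hypothetical choice.

```python
import json

template_config = {
    "gemini_example": {
        "contents": [
            {"role": "user", "parts": [{"text": "This is the image: {image_uris}"}]},
            {"role": "model", "parts": [{"text": "response"}]},
        ],
        "systemInstruction": {
            "parts": [{"text": "You are a botanical image classifier."}]
        },
    }
}

patch_body = {
    "metadata": {
        "input_config": {
            "bigquery_source": {"uri": "bq://projectId.datasetId.tableId"}
        },
        "gemini_template_config_source": {
            "gemini_template_config": template_config
        },
    }
}

# json.dumps always emits syntactically valid JSON, which avoids mistakes
# such as a missing or trailing comma.
payload = json.dumps(patch_body, indent=2)
with open("patch_body.json", "w", encoding="utf-8") as f:
    f.write(payload)
```

The resulting file can then be passed to curl with -d @patch_body.json instead of an inline string.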

curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-d $'{
  "metadata": {
    "input_config": {
      "bigquery_source": {
        "uri": "bq://projectId.datasetId.tableId"
      }
    },
    "gemini_template_config_source": {
      "gemini_template_config": {
        "gemini_example": {
          "contents": [
            {
              "role": "user",
              "parts": [
                {
                  "text": "This is the image: {image_uris}"
                }
              ]
            },
            {
              "role": "model",
              "parts": [
                {
                  "text": "response"
                }
              ]
            }
          ],
          "systemInstruction": {
            "parts": [
              {
                "text": "You are a botanical image classifier."
              }
            ]
          }
        }
      }
    }
  }
}' \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID?updateMask=metadata"

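(Optional) Assemble the dataset

The assemble method applies the read configuration to transform the dataset and stores the output in a new BigQuery table. This lets you preview the data before it's passed to a model.

By default, the read configuration attached to the dataset is used, but you can pass gemini_request_read_config to override the default behavior.

Agent Platform SDK

The assemble method returns a (table_id, dataframe) tuple. To load the assembled table as a DataFrame for inspection, pass load_dataframe=True.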

table_id, assembly = client.datasets.assemble(
    name=my_dataset.name,
    gemini_request_read_config=read_config,    # optional if attached to the dataset
    load_dataframe=True,
)

# Inspect the results
assembly.head()

REST

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assemble" \
-d '{}'

For example, suppose a multimodal dataset contains the following data:

Row	image_uris	labels
1	gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg	daisy

The assemble method then creates a new BigQuery table, identified by table_id, in which each row contains a request body. For example:

{
  "contents": [
    {
      "parts": [
        {
          "text": "This is the image: "
        },
        {
          "fileData": {
            "fileUri": "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
            "mimeType": "image/jpeg"
          }
        }
      ],
      "role": "user"
    },
    {
      "parts": [
        {
          "text": "daisy"
        }
      ],
      "role": "model"
    }
  ],
  "systemInstruction": {
    "parts": [
      {
        "text": "You are a botanical image classifier. Analyze the provided image and determine the most accurate classification of the flower. These are the only flower categories: ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']. Return only one category per image."
      }
    ]
  }
}
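
The way a dataset row turns into the request body above can be approximated with a short, self-contained sketch. This is a hypothetical reimplementation for illustration, not the service's actual logic: the rule that values starting with gs:// become fileData parts is inferred from the example, and the MIME type is guessed from the file extension with the standard mimetypes module.

```python
import mimetypes
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def text_to_parts(template: str, row: dict) -> list:
    """Turn a template into request parts, emitting fileData for gs:// values."""
    parts = []
    buffer = ""
    position = 0
    for match in PLACEHOLDER.finditer(template):
        buffer += template[position:match.start()]
        position = match.end()
        value = str(row[match.group(1)])
        if value.startswith("gs://"):
            # Flush the text collected so far, then add the file reference.
            if buffer:
                parts.append({"text": buffer})
                buffer = ""
            mime_type, _ = mimetypes.guess_type(value)
            parts.append({"fileData": {"fileUri": value, "mimeType": mime_type}})
        else:
            buffer += value
    buffer += template[position:]
    if buffer:
        parts.append({"text": buffer})
    return parts


row = {
    "image_uris": "gs://cloud-samples-data/ai-platform/flowers/daisy/1396526833_fb867165be_n.jpg",
    "labels": "daisy",
}

request = {
    "contents": [
        {"role": "user", "parts": text_to_parts("This is the image: {image_uris}", row)},
        {"role": "model", "parts": text_to_parts("{labels}", row)},
    ]
}
print(request)
```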

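Tune a model

You can use a multimodal dataset to tune a Gemini model.

(Optional) Validate the dataset

Assess the dataset to check whether it contains errors, such as dataset formatting errors or model errors.

Agent Platform SDK

Call assess_tuning_validity(). By default, the read configuration attached to the dataset is used, but you can pass gemini_request_read_config to override the default behavior.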

# Attach the read configuration to the dataset.
my_dataset.set_read_config(read_config=read_config)
my_dataset = client.datasets.update_multimodal_dataset(multimodal_dataset=my_dataset)

# Validation for tuning
validation = client.datasets.assess_tuning_validity(
    dataset_name=my_dataset.name,
    model_name="gemini-2.5-flash",
    dataset_usage="SFT_TRAINING"
)

# Inspect validation result
validation.errors

REST

Call the assess method and specify a TuningValidationAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "tuningValidationAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash",
    "datasetUsage": "SFT_TRAINING"
  }
}'

(Optional) Estimate resource usage

Assess the dataset to get the token count and the billable character count for the tuning job.

Agent Platform SDK

Call assess_tuning_resources().

# Resource estimation for tuning.
tuning_resources = client.datasets.assess_tuning_resources(
    dataset_name=my_dataset.name,
    model_name="gemini-2.5-flash"
)

print(tuning_resources)
# For example, TuningResourceUsageAssessmentResult(token_count=362688, billable_character_count=122000)
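
Once a token count is available, a rough cost projection is simple arithmetic. The sketch below assumes that tuning is billed per training token per epoch, which this page does not state, and it uses a hypothetical price per million tokens and a hypothetical epoch count. Replace both with the values that apply to your model and job.

```python
def estimate_tuning_cost(token_count: int, epochs: int, price_per_million_tokens: float) -> float:
    """Estimate tuning cost in USD, assuming billing per training token per epoch."""
    return token_count * epochs * price_per_million_tokens / 1_000_000


token_count = 362_688  # token_count from the sample result above

cost = estimate_tuning_cost(
    token_count,
    epochs=3,                       # hypothetical epoch count
    price_per_million_tokens=5.00,  # hypothetical price, not a published rate
)
print(f"Estimated tuning cost: ${cost:.2f}")
```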

REST

Call the assess method and specify a TuningResourceUsageAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "tuningResourceUsageAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'

Run the tuning job

Start a tuning job using the Google Gen AI SDK and pass the resource name of the multimodal dataset. The dataset must have a read configuration attached.

Google Gen AI SDK

from google import genai
from google.genai.types import HttpOptions, CreateTuningJobConfig

genai_client = genai.Client(http_options=HttpOptions(api_version="v1"))

tuning_job = genai_client.tunings.tune(
  base_model="gemini-2.5-flash",
  # Pass the resource name of the Multimodal Dataset, not the dataset object
  training_dataset={
      "vertex_dataset_resource": my_multimodal_dataset.name
  },
  # Optional
  config=CreateTuningJobConfig(
      validation_dataset={
          "vertex_dataset_resource": my_multimodal_validation_dataset.name
      },
      tuned_model_display_name="Example tuning job"),
)

For more information, see Create a tuning job.

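Batch prediction

You can use a multimodal dataset to get batch predictions.

(Optional) Validate the dataset

Assess the dataset to check whether it contains errors, such as dataset formatting errors or model errors.

Agent Platform SDK

Call assess_batch_prediction_validity(). By default, the read configuration attached to the dataset is used, but you can pass gemini_request_read_config to override the default behavior.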

# Attach the read configuration to the dataset.
my_dataset.set_read_config(read_config=read_config)
my_dataset = client.datasets.update_multimodal_dataset(multimodal_dataset=my_dataset)

# Validation for batch prediction
validation = client.datasets.assess_batch_prediction_validity(
    dataset_name=my_dataset.name,
    model_name="gemini-2.5-flash"
)

# Inspect validation result
validation.errors

REST

Call the assess method and specify a batchPredictionValidationAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "batchPredictionValidationAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'

(Optional) Estimate resource usage

Assess the dataset to get the token count for the job.

Agent Platform SDK

Call assess_batch_prediction_resources().

batch_prediction_resources = client.datasets.assess_batch_prediction_resources(
    dataset_name=my_dataset.name,
    model_name="gemini-2.5-flash"
)

print(batch_prediction_resources)
# For example, BatchPredictionResourceUsageAssessmentResult(token_count=362688, audio_token_count=122000)

REST

Call the assess method and specify a batchPredictionResourceUsageAssessmentConfig.

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID:assess" \
-d '{
  "batchPredictionResourceUsageAssessmentConfig": {
    "modelName": "projects/PROJECT_ID/locations/LOCATION/models/gemini-2.5-flash"
  }
}'

Run the batch prediction job

To run batch prediction with a multimodal dataset, pass the BigQuery table_id of the assembled output.

Google Gen AI SDK

from google import genai
from google.genai.types import HttpOptions

# Attach the read configuration to the dataset.
my_dataset.set_read_config(read_config=read_config)
my_dataset = client.datasets.update_multimodal_dataset(multimodal_dataset=my_dataset)

# Assemble the dataset to get the assembled BigQuery table.
table_id, _ = client.datasets.assemble(name=my_dataset.name)

genai_client = genai.Client(http_options=HttpOptions(api_version="v1"))

job = genai_client.batches.create(
    model="gemini-2.5-flash",
    src=f"bq://{table_id}",
)

For more information, see Request a batch prediction job.

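Limitations

Multimodal datasets can be used only with generative AI features. They can't be used with non-generative AI features such as AutoML training and custom training.
Multimodal datasets can be used only with Google models such as Gemini. They can't be used with third-party models.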

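Pricing

When you tune a model or run a batch prediction job, you're billed for generative AI usage and for querying the dataset in BigQuery.

When you create, assemble, or assess a multimodal dataset, you're billed for storing and querying the multimodal dataset in BigQuery. Specifically, the following operations use these underlying services:

Create a dataset
- Datasets created from an existing BigQuery table or DataFrame incur no additional storage costs, because they use a logical view instead of storing another copy of the data.
- Datasets created from other sources copy the data into a new BigQuery table, which incurs storage costs in BigQuery. For example, active logical storage costs $0.02 per GiB per month.
Assemble a dataset
- This method creates a new BigQuery table that contains the complete dataset in model request format, which incurs storage costs in BigQuery. For example, active logical storage costs $0.02 per GiB per month.
- This method reads the dataset once, which incurs query costs in BigQuery. For example, on-demand compute pricing is $6.25 per TiB.
Assess a dataset
- This method reads the dataset once, which incurs query costs in BigQuery. For example, on-demand compute pricing is $6.25 per TiB.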
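
The example rates above translate into a quick back-of-the-envelope estimate. The sketch below uses the $0.02 per GiB per month and $6.25 per TiB figures quoted in this section; the 50 GiB dataset size is a hypothetical input, and the calculation ignores free tiers, minimum billing increments, and any size difference between the source data and the assembled table.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

STORAGE_USD_PER_GIB_MONTH = 0.02  # active logical storage, example rate from this page
QUERY_USD_PER_TIB = 6.25          # on-demand compute, example rate from this page


def monthly_storage_cost(table_bytes: int) -> float:
    """Monthly storage cost in USD for a table of the given size."""
    return table_bytes / GIB * STORAGE_USD_PER_GIB_MONTH


def query_cost(bytes_read: int, reads: int = 1) -> float:
    """Query cost in USD for reading the given number of bytes `reads` times."""
    return bytes_read / TIB * QUERY_USD_PER_TIB * reads


dataset_bytes = 50 * GIB  # hypothetical dataset size

print(f"Storage per month: ${monthly_storage_cost(dataset_bytes):.2f}")
print(f"One assemble or assess read: ${query_cost(dataset_bytes):.4f}")
```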

Use the pricing calculator to estimate costs based on your projected usage.
