
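Create a public endpoint

To deploy a model by using the gcloud CLI or the Gemini Enterprise API, you must first create a public endpoint.

If you already have a public endpoint, skip this step and proceed to Deploy a model by using the gcloud CLI or Gemini Enterprise API.

This document describes how to create a new public endpoint.

Create a dedicated public endpoint (recommended)

The default request timeout for a dedicated public endpoint is 10 minutes. In the Gemini Enterprise API and the Agent Platform SDK for Python, you can specify a different request timeout by adding a clientConnectionConfig object that contains a new inferenceTimeout value, as shown in the following examples. The maximum timeout is 3,600 seconds (1 hour).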

Google Cloud console

In the Google Cloud console, in the Agent Platform section, go to the Online prediction page.
Go to the Online prediction page
Click Create.
In the New endpoint pane, do the following:

Enter the Endpoint name.
Select Standard for the access type.
Select the Enable dedicated DNS checkbox.
Click Continue.

Click Done.

REST

Before using any of the request data, make the following replacements:

LOCATION_ID: Your region.
PROJECT_ID: Your [project ID](/resource-manager/docs/creating-managing-projects#identifiers).
ENDPOINT_NAME: The display name for the endpoint.
INFERENCE_TIMEOUT_SECS: (Optional) The number of seconds for the inferenceTimeout field.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

Request JSON body:

{
  "display_name": "ENDPOINT_NAME",
  "dedicatedEndpointEnabled": true,
  "clientConnectionConfig": {
    "inferenceTimeout": {
      "seconds": INFERENCE_TIMEOUT_SECS
    }
  }
}
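
As a minimal sketch, the request body above could be assembled in Python with only the standard library. The helper name build_endpoint_body and the lower bound of 1 second are illustrative assumptions, not part of the API; the 3,600-second ceiling is the maximum stated earlier in this document. The helper only builds the body and does not call the service.

```python
import json
from typing import Optional

MAX_INFERENCE_TIMEOUT_SECS = 3600  # Maximum stated in this document (1 hour).


def build_endpoint_body(display_name: str,
                        inference_timeout_secs: Optional[int] = None) -> dict:
    """Build the JSON body for creating a dedicated public endpoint.

    Hypothetical helper: it only assembles the fields shown in the request
    body above. The lower bound of 1 second is an assumption.
    """
    body = {
        "display_name": display_name,
        "dedicatedEndpointEnabled": True,
    }
    if inference_timeout_secs is not None:
        if not 1 <= inference_timeout_secs <= MAX_INFERENCE_TIMEOUT_SECS:
            raise ValueError(
                "inference timeout must be between 1 and "
                f"{MAX_INFERENCE_TIMEOUT_SECS} seconds"
            )
        # Only include the optional block when a timeout was requested.
        body["clientConnectionConfig"] = {
            "inferenceTimeout": {"seconds": inference_timeout_secs}
        }
    return body


if __name__ == "__main__":
    # Print the body; redirect the output to request.json to reuse it.
    print(json.dumps(build_endpoint_body("my-endpoint", 1800), indent=2))
```

Omitting the timeout argument leaves out clientConnectionConfig entirely, so the default timeout applies.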

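To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: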

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

You can poll for the status of the operation until the response includes "done": true.
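
The polling step can be sketched as a small loop. This is an illustration under stated assumptions: get_operation is a hypothetical caller-supplied function that fetches the operation resource (for example, with an authenticated GET on the operation name returned above) and returns it as a dict; the interval and attempt limit are arbitrary defaults, not values from the service.

```python
import time
from typing import Callable


def wait_for_operation(get_operation: Callable[[], dict],
                       interval_secs: float = 2.0,
                       max_attempts: int = 30) -> dict:
    """Call get_operation() until the returned dict has "done": true.

    get_operation is a hypothetical callable supplied by the caller; this
    function does no network I/O itself.
    """
    for attempt in range(max_attempts):
        operation = get_operation()
        if operation.get("done"):
            # A finished long-running operation carries either an error
            # or a response.
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if attempt < max_attempts - 1:
            time.sleep(interval_secs)
    raise TimeoutError(f"operation not done after {max_attempts} attempts")
```

Passing the fetch function in keeps the loop independent of any particular HTTP client or authentication method.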

Python

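Before trying this sample, follow the Python setup instructions in the Agent Platform quickstart using client libraries.

To authenticate to Agent Platform, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

Replace the following:

PROJECT_ID: Your project ID.
LOCATION_ID: The region where you are using Agent Platform.
ENDPOINT_NAME: The display name for the endpoint.
INFERENCE_TIMEOUT_SECS: (Optional) The number of seconds for the inference_timeout value.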

from google.cloud import aiplatform

PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION_ID"
ENDPOINT_NAME = "ENDPOINT_NAME"
# Optional. Replace with INFERENCE_TIMEOUT_SECS as an integer (at most 3600).
INFERENCE_TIMEOUT_SECS = 600

aiplatform.init(
    project=PROJECT_ID,
    location=LOCATION,
)

# Create a dedicated public endpoint and wait for the operation to finish.
dedicated_endpoint = aiplatform.Endpoint.create(
    display_name=ENDPOINT_NAME,
    dedicated_endpoint_enabled=True,
    sync=True,
    inference_timeout=INFERENCE_TIMEOUT_SECS,
)

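Inference timeout configuration

The default timeout for inference requests is 600 seconds (10 minutes). This timeout applies if you don't specify an explicit inference timeout when you create the endpoint. The maximum allowed timeout is 1 hour (3,600 seconds).

To configure the inference timeout when you create an endpoint, use the inference_timeout parameter, as shown in the following code snippet: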

timeout_endpoint = aiplatform.Endpoint.create(
    display_name="dedicated-endpoint-with-timeout",
    dedicated_endpoint_enabled=True,
    inference_timeout=1800,  # Unit: Seconds
)

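To change the inference timeout setting after the endpoint is created, use the EndpointService.UpdateEndpointLongRunning method. The EndpointService.UpdateEndpoint method doesn't support this change.

Request-response logging

The request-response logging feature captures API interactions. However, to comply with BigQuery limits, payloads larger than 10 MB are excluded from the logs.

To enable and configure request-response logging when you create an endpoint, use the following parameters, as shown in the following code snippet: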

logging_endpoint = aiplatform.Endpoint.create(
    display_name="dedicated-endpoint-with-logging",
    dedicated_endpoint_enabled=True,
    enable_request_response_logging=True,
    request_response_logging_sampling_rate=1.0,  # Default: 0.0
    request_response_logging_bq_destination_table="bq://test_logging",
    # If not set, a new BigQuery table will be created with the name:
    # bq://{project_id}.logging_{endpoint_display_name}_{endpoint_id}.request_response_logging
)
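
To reason locally about which requests would end up in the log, the size limit and sampling rate can be modeled as follows. This is a sketch under assumptions, not a description of the service: the document doesn't say how payload size is measured or how sampling is performed, so the code treats "10 MB" as 10 MiB of UTF-8 encoded JSON and models sampling as an independent random draw per request. Both function names are hypothetical.

```python
import json
import random

# Assumption: "10 MB" is read as 10 MiB of UTF-8 encoded JSON.
MAX_LOGGED_PAYLOAD_BYTES = 10 * 1024 * 1024


def payload_fits_in_log(payload: dict) -> bool:
    """Return True if the JSON encoding of payload is within the size limit."""
    encoded = json.dumps(payload).encode("utf-8")
    return len(encoded) <= MAX_LOGGED_PAYLOAD_BYTES


def is_sampled(sampling_rate: float, rng: random.Random) -> bool:
    """Model a sampling rate in [0.0, 1.0] as an independent random draw."""
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0.0 and 1.0")
    # rng.random() is in [0.0, 1.0), so a rate of 1.0 always samples
    # and a rate of 0.0 never does.
    return rng.random() < sampling_rate
```

With a sampling rate of 1.0, as in the snippet above, every request is a logging candidate, and only the size limit excludes payloads.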

To change the request-response logging settings after the endpoint is created, use the EndpointService.UpdateEndpointLongRunning method. The EndpointService.UpdateEndpoint method doesn't support this change.

Create a shared public endpoint

Google Cloud console

In the Google Cloud console, in the Agent Platform section, go to the Online prediction page.
Go to the Online prediction page
Click Create.
In the New endpoint pane, do the following:

Enter the Endpoint name.
Select Standard for the access type.
Click Continue.

Click Done.

gcloud

The following example uses the gcloud ai endpoints create command:

gcloud ai endpoints create \
    --region=LOCATION_ID \
    --display-name=ENDPOINT_NAME

Replace the following:

LOCATION_ID: The region where you are using Agent Platform.
ENDPOINT_NAME: The display name for the endpoint.

The Google Cloud CLI tool might take a few seconds to create the endpoint.
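
If the gcloud command is driven from a script, it can help to build the argument list in one place. The following sketch assembles the same command shown above; the function name build_create_endpoint_command is hypothetical, and the sample region and display name are placeholders.

```python
import shlex
from typing import List


def build_create_endpoint_command(region: str, display_name: str) -> List[str]:
    """Return the argument list for `gcloud ai endpoints create`."""
    if not region or not display_name:
        raise ValueError("region and display_name are required")
    return [
        "gcloud", "ai", "endpoints", "create",
        f"--region={region}",
        f"--display-name={display_name}",
    ]


if __name__ == "__main__":
    command = build_create_endpoint_command("us-central1", "sample-endpoint")
    # Print a shell-safe form of the command. To execute it instead, pass
    # the list to subprocess.run(command, check=True).
    print(shlex.join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the display name contains spaces.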

REST

Before using any of the request data, make the following replacements:

LOCATION_ID: Your region.
PROJECT_ID: Your [project ID](/resource-manager/docs/creating-managing-projects#identifiers).
ENDPOINT_NAME: The display name for the endpoint.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints

Request JSON body:

{
  "display_name": "ENDPOINT_NAME"
}
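
The method, URL, and body above can also be assembled from Python's standard library. This is a minimal sketch, assuming the access token is obtained separately (for example, from gcloud auth print-access-token); the function name build_create_endpoint_request is hypothetical, and the function only constructs the request without sending it.

```python
import json
import urllib.request


def build_create_endpoint_request(project_id: str, location_id: str,
                                  display_name: str,
                                  access_token: str) -> urllib.request.Request:
    """Assemble (but do not send) the POST request described above."""
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/endpoints"
    )
    body = json.dumps({"display_name": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen and parse the response body as JSON.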

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints"

PowerShell (Windows)

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateEndpointOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-05T17:45:42.812656Z",
      "updateTime": "2020-11-05T17:45:42.812656Z"
    }
  }
}

You can poll for the status of the operation until the response includes "done": true.
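
The name field in the sample response encodes the project, location, endpoint ID, and operation ID. As a small sketch, those parts can be extracted with a regular expression; the pattern is based only on the format of the sample response above, and the function name parse_operation_name is hypothetical.

```python
import re
from typing import Dict

# Pattern derived from the "name" field of the sample response above.
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/endpoints/(?P<endpoint_id>[^/]+)"
    r"/operations/(?P<operation_id>[^/]+)$"
)


def parse_operation_name(name: str) -> Dict[str, str]:
    """Split an operation name into its named components."""
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name: {name!r}")
    return match.groupdict()
```

The endpoint_id component is the ID of the endpoint being created, which is useful later when a model is deployed to it.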

Terraform

# Endpoint name must be unique for the project
resource "random_id" "endpoint_id" {
  byte_length = 4
}

resource "google_vertex_ai_endpoint" "default" {
  name         = substr(random_id.endpoint_id.dec, 0, 10)
  display_name = "sample-endpoint"
  description  = "A sample Vertex AI endpoint"
  location     = "us-central1"
  labels = {
    label-one = "value-one"
  }
}

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.CreateEndpointOperationMetadata;
import com.google.cloud.aiplatform.v1.Endpoint;
import com.google.cloud.aiplatform.v1.EndpointServiceClient;
import com.google.cloud.aiplatform.v1.EndpointServiceSettings;
import com.google.cloud.aiplatform.v1.LocationName;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateEndpointSample {

  public static void main(String[] args)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String endpointDisplayName = "YOUR_ENDPOINT_DISPLAY_NAME";
    createEndpointSample(project, endpointDisplayName);
  }

  static void createEndpointSample(String project, String endpointDisplayName)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    EndpointServiceSettings endpointServiceSettings =
        EndpointServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (EndpointServiceClient endpointServiceClient =
        EndpointServiceClient.create(endpointServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);
      Endpoint endpoint = Endpoint.newBuilder().setDisplayName(endpointDisplayName).build();

      OperationFuture<Endpoint, CreateEndpointOperationMetadata> endpointFuture =
          endpointServiceClient.createEndpointAsync(locationName, endpoint);
      System.out.format("Operation name: %s\n", endpointFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      Endpoint endpointResponse = endpointFuture.get(300, TimeUnit.SECONDS);

      System.out.println("Create Endpoint Response");
      System.out.format("Name: %s\n", endpointResponse.getName());
      System.out.format("Display Name: %s\n", endpointResponse.getDisplayName());
      System.out.format("Description: %s\n", endpointResponse.getDescription());
      System.out.format("Labels: %s\n", endpointResponse.getLabelsMap());
      System.out.format("Create Time: %s\n", endpointResponse.getCreateTime());
      System.out.format("Update Time: %s\n", endpointResponse.getUpdateTime());
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */
// const endpointDisplayName = 'YOUR_ENDPOINT_DISPLAY_NAME';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Endpoint Service Client library
const {EndpointServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const endpointServiceClient = new EndpointServiceClient(clientOptions);

async function createEndpoint() {
  // Configure the parent resource
  const parent = `projects/${project}/locations/${location}`;
  const endpoint = {
    displayName: endpointDisplayName,
  };
  const request = {
    parent,
    endpoint,
  };

  // Start the create-endpoint request; it returns a long-running operation
  const [response] = await endpointServiceClient.createEndpoint(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Create endpoint response');
  console.log(`\tName : ${result.name}`);
  console.log(`\tDisplay name : ${result.displayName}`);
  console.log(`\tDescription : ${result.description}`);
  console.log(`\tLabels : ${JSON.stringify(result.labels)}`);
  console.log(`\tCreate time : ${JSON.stringify(result.createTime)}`);
  console.log(`\tUpdate time : ${JSON.stringify(result.updateTime)}`);
}
createEndpoint();

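Python

from google.cloud import aiplatform


def create_endpoint_sample(
    project: str,
    display_name: str,
    location: str,
):
    # Set the default project and region for subsequent SDK calls.
    aiplatform.init(project=project, location=location)

    # Create a shared public endpoint with the given display name.
    endpoint = aiplatform.Endpoint.create(
        display_name=display_name,
        project=project,
        location=location,
    )

    print(endpoint.display_name)
    print(endpoint.resource_name)
    return endpoint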

What's next

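Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-06-06 UTC.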
