Call Gemini with the Chat Completions API

The following sample shows how to send a non-streaming request.

REST

  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/'"${MODEL_ID}"'",
    "messages": [{
      "role": "user",
      "content": "Write a story about a magic backpack."
    }]
  }'
  

Python

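Before trying this sample, follow the Python setup instructions in the Agent Platform quickstart using client libraries. For more information, see the Agent Platform Python API reference documentation.

To authenticate to Agent Platform, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.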

from google.auth import default
import google.auth.transport.requests

import openai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response)
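
Printing the whole response object is useful for debugging, but most callers only need the assistant text. The following is a minimal sketch, assuming the response has the usual OpenAI chat completion shape and has been converted to a plain dictionary (for example with `response.model_dump()` in the SDK, or by parsing the REST body). The helper name `first_message_text` and the sample payload are hypothetical.

```python
from typing import Any, Optional


def first_message_text(completion: dict[str, Any]) -> Optional[str]:
    """Return the text of the first choice, or None when there is no choice."""
    choices = completion.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


# Hypothetical response body shaped like an OpenAI chat completion.
sample_completion = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Because of scattering."},
        }
    ]
}

print(first_message_text(sample_completion))
```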

The following sample shows how to send a streaming request to a Gemini model by using the Chat Completions API.

REST

  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/'"${MODEL_ID}"'",
    "stream": true,
    "messages": [{
      "role": "user",
      "content": "Write a story about a magic backpack."
    }]
  }'
  

Python

from google.auth import default
import google.auth.transport.requests

import openai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in response:
    print(chunk)
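
Each streamed chunk carries only a fragment of the answer, so printing chunks one by one shows the raw envelope rather than the text. The sketch below joins the fragments into one string. It assumes each chunk exposes `choices[i].delta.content` as the OpenAI SDK chunk objects do; the helper `collect_stream_text` and the fake chunks built with `SimpleNamespace` are hypothetical stand-ins so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream_text(chunks: Iterable[Any]) -> str:
    """Concatenate the text fragments found in a stream of chat completion chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.choices:
            text = getattr(choice.delta, "content", None)
            if text:  # skip chunks that carry no text (role-only or usage-only)
                parts.append(text)
    return "".join(parts)


def _fake_chunk(text):
    """Build a stand-in object with the same attribute path as an SDK chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_chunks = [_fake_chunk("The sky "), _fake_chunk(None), _fake_chunk("is blue.")]
print(collect_stream_text(demo_chunks))
```

With the real client, the same helper could be called as `collect_stream_text(response)` in place of the `for chunk in response` loop.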

Send a prompt and an image to the Gemini API in Gemini Enterprise Agent Platform

Python

from google.auth import default
import google.auth.transport.requests

import openai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the following image:"},
                {
                    "type": "image_url",
                    "image_url": "gs://cloud-samples-data/generative-ai/image/scones.jpg",
                },
            ],
        }
    ],
)

print(response)

Call a self-deployed model with the Chat Completions API

The following sample shows how to send a non-streaming request.

REST

  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/endpoints/${ENDPOINT}/chat/completions \
  -d '{
    "messages": [{
      "role": "user",
      "content": "Write a story about a magic backpack."
    }]
  }'

Python

from google.auth import default
import google.auth.transport.requests

import openai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"
# model_id = "gemma-2-9b-it"
# endpoint_id = "YOUR_ENDPOINT_ID"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/{endpoint_id}",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response)

The following sample shows how to send a streaming request to a self-deployed model by using the Chat Completions API.

REST

    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
    https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/endpoints/${ENDPOINT}/chat/completions \
    -d '{
      "stream": true,
      "messages": [{
        "role": "user",
        "content": "Write a story about a magic backpack."
      }]
    }'
  

Python

from google.auth import default
import google.auth.transport.requests

import openai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"
# model_id = "gemma-2-9b-it"
# endpoint_id = "YOUR_ENDPOINT_ID"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/{endpoint_id}",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in response:
    print(chunk)

extra_body examples

You can pass extra_body by using either the SDK or the REST API.

Add thought_tag_marker

{
  ...,
  "extra_body": {
     "google": {
       ...,
       "thought_tag_marker": "..."
     }
   }
}

Add extra_body using the SDK

client.chat.completions.create(
  ...,
  extra_body = {
    'extra_body': { 'google': { ... } }
  },
)
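
The SDK form above nests the Google-specific fields two levels deep, which is easy to get wrong by hand. A small helper can build that wrapper. This is a minimal sketch that only mirrors the nesting shown above; the function name `google_extra_body` is hypothetical, and the field passed in the demo is the thought_tag_marker field from the REST example.

```python
from typing import Any


def google_extra_body(**google_fields: Any) -> dict[str, Any]:
    """Wrap Google-specific fields in the nesting the SDK example above uses."""
    return {"extra_body": {"google": dict(google_fields)}}


sdk_extra_body = google_extra_body(thought_tag_marker="think")
print(sdk_extra_body)
```

The result would then be passed as the `extra_body` keyword argument of `client.chat.completions.create`.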

extra_content examples

You can populate this field by using the REST API directly.

extra_content with string content

{
  "messages": [
    { "role": "...", "content": "...", "extra_content": { "google": { ... } } }
  ]
}

Per-message extra_content

{
  "messages": [
    {
      "role": "...",
      "content": [
        { "type": "...", ..., "extra_content": { "google": { ... } } }
      ]
    }
  ]
}

Per-tool-call extra_content

{
  "messages": [
    {
      "role": "...",
      "tool_calls": [
        {
          ...,
          "extra_content": { "google": { ... } }
        }
      ]
    }
  ]
}

Sample curl requests

You can use these curl requests directly, without going through the SDK.

Use thinking_config with extra_body

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/gemini-2.5-flash-preview-04-17",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "text",
            "text": "Are there any prime numbers of the form n*ceil(log(n))?"
          }] }],
    "extra_body": {
      "google": {
        "thinking_config": {
          "include_thoughts": true, "thinking_budget": 10000
        },
        "thought_tag_marker": "think" } },
    "stream": true }'

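Once thoughts are included in the output, a client usually wants to separate them from the final answer. The sketch below assumes that setting thought_tag_marker to "think" makes the model's thoughts arrive wrapped in `<think>` and `</think>` tags inside the text; that tag syntax is an assumption, as are the helper name `split_thoughts` and the sample string, so check real output before relying on it.

```python
import re


def split_thoughts(text: str, marker: str = "think") -> tuple[list[str], str]:
    """Return (thoughts, answer), assuming thoughts are wrapped in <marker>...</marker>."""
    tag = re.escape(marker)
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
    thoughts = [match.strip() for match in pattern.findall(text)]
    answer = pattern.sub("", text).strip()
    return thoughts, answer


# Hypothetical model output using the assumed tag syntax.
sample_text = "<think>Check small values of n first.</think>Here is the answer."
thoughts, answer = split_thoughts(sample_text)
print(thoughts)
print(answer)
```

For a streaming response, the text fragments would need to be joined before calling this helper, because a tag can be split across chunks.
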
Use stream_function_call_arguments

Sample request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/global/endpoints/openapi/chat/completions \
  -d '{
  "model": "google/gemini-3-pro-preview",
  "messages": [
    { "role": "user", "content": "What is the weather like in Boston and New Delhi today?" } ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": [
                "celsius",
                "fahrenheit"
              ]
            }
          },
          "required": [
            "location",
            "unit"
          ]
        }
      }
    }
  ],
  "extra_body": {
    "google": {
      "stream_function_call_arguments": true
    }
  },
  "stream": true
}'

Sample response:

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"extra_content":{"google":{"thought_signature":"..."}},"function":{"arguments":"","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":1,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"{\"location\":\"Boston, MA","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"\"","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":",\"unit\":\"celsius","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"\"","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"}","name":"get_current_weather"},"id":"function-call-c855348a-459a-46a4-a8ad-aa0a4e7c3563","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":0,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"{\"location\":\"New Delhi, India","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":1,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"\"","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":1,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":",\"unit\":\"celsius","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":1,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"\"","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":1,"type":"function"}]},"index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":""}

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"function":{"arguments":"}","name":"get_current_weather"},"id":"function-call-df0d087c-ad74-46f1-ba4a-9353cbf288a8","index":1,"type":"function"}]},"finish_reason":"tool_calls","index":0,"logprobs":null}],"created":1770850461,"id":"nQiNafGyF5rw998PstqooAY","model":"google/gemini-3-pro-preview","object":"chat.completion.chunk","system_fingerprint":"","usage":{"completion_tokens":45,"completion_tokens_details":{"reasoning_tokens":504},"extra_properties":{"google":{"traffic_type":"PROVISIONED_THROUGHPUT"}},"prompt_tokens":27,"total_tokens":576}}

data: [DONE]
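
With streamed arguments, each chunk holds only a fragment of the JSON argument string, so the client has to join the fragments before parsing. The sketch below groups fragments by tool call `id`, because every fragment in the sample response above repeats the `id` and the function name. That grouping key is an assumption based on this sample only; streams that send the `id` once and identify later fragments by `index` would need a different key. The helper names and the short call ID are hypothetical.

```python
import json
from typing import Any, Iterable


def assemble_tool_calls(chunks: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Join streamed argument fragments per tool call id and parse them as JSON."""
    calls: dict[str, dict[str, str]] = {}  # insertion order follows first appearance
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            for call in choice.get("delta", {}).get("tool_calls") or []:
                function = call.get("function", {})
                entry = calls.setdefault(
                    call["id"], {"name": function.get("name", ""), "arguments": ""}
                )
                entry["arguments"] += function.get("arguments") or ""
    return [
        {"id": call_id, "name": entry["name"], "arguments": json.loads(entry["arguments"])}
        for call_id, entry in calls.items()
    ]


def _chunk(call_id: str, fragment: str) -> dict[str, Any]:
    """Build a trimmed-down chunk with only the fields the helper reads."""
    function = {"name": "get_current_weather", "arguments": fragment}
    return {"choices": [{"delta": {"tool_calls": [{"id": call_id, "function": function}]}}]}


# Argument fragments copied from the first tool call in the sample response.
fragments = ["", '{"location":"Boston, MA', '"', ',"unit":"celsius', '"', "}"]
demo_calls = assemble_tool_calls(_chunk("call-1", f) for f in fragments)
print(demo_calls)
```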

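Image generation

To stay compatible with the OpenAI response format, the audio field of the response is explicitly populated, with extra_content.google.mime_type indicating the MIME type of the result.

Sample request: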

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/global/endpoints/openapi/chat/completions \
  -d '{"model":"google/gemini-3-pro-image-preview", "messages":[{ "role": "user", "content": "Generate an image of a cat." }], "modalities": ["image"] }'

Sample response:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "audio": {
          "data": "<BASE64_BYTES>",
          "extra_content": {
            "google": {
              "mime_type": "image/png"
            }
          }
        },
        "content": null,
        "extra_content": {
          "google": {
            "thought_signature": "..."
          }
        },
        "role": "assistant"
      }
    }
  ],
  "created": 1770850692,
  "id": "hAmNaZb8BZOX4_UPlNXoEA",
  "model": "google/gemini-3-pro-image-preview",
  "object": "chat.completion",
  "system_fingerprint": "",
  "usage": {
    "completion_tokens": 1120,
    "completion_tokens_details": {
      "reasoning_tokens": 251
    },
    "extra_properties": {
      "google": {
        "traffic_type": "PROVISIONED_THROUGHPUT"
      }
    },
    "prompt_tokens": 7,
    "total_tokens": 1378
  }
}
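
Because the image arrives as base64 text in `message.audio.data`, the client has to decode it and read the MIME type from `extra_content.google.mime_type` before saving or displaying it. The following is a minimal sketch over a parsed response dictionary shaped like the sample above; the helper name `extract_generated_image`, the fallback MIME type, and the placeholder bytes are hypothetical.

```python
import base64
from typing import Any, Optional


def extract_generated_image(completion: dict[str, Any]) -> Optional[tuple[bytes, str]]:
    """Return (image_bytes, mime_type) from the first choice, or None if absent."""
    message = completion["choices"][0]["message"]
    payload = message.get("audio")
    if not payload or not payload.get("data"):
        return None
    google = payload.get("extra_content", {}).get("google", {})
    # Fallback type is an assumption for responses that omit mime_type.
    mime_type = google.get("mime_type", "application/octet-stream")
    return base64.b64decode(payload["data"]), mime_type


# Hypothetical response: placeholder bytes stand in for real PNG data.
sample_image_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "audio": {
                    "data": base64.b64encode(b"placeholder-bytes").decode("ascii"),
                    "extra_content": {"google": {"mime_type": "image/png"}},
                },
            }
        }
    ]
}

image_bytes, image_mime_type = extract_generated_image(sample_image_response)
print(len(image_bytes), image_mime_type)
```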

Multimodal requests

The Chat Completions API supports a variety of multimodal inputs, including both audio and video.

Pass image data with image_url

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/gemini-2.0-flash-001",
    "messages": [{ "role": "user", "content": [
      { "type": "text", "text": "Describe this image" },
      { "type": "image_url", "image_url": "gs://cloud-samples-data/generative-ai/image/scones.jpg" }] }] }'

Pass audio data with input_audio

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/gemini-2.0-flash-001",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "text", "text": "Describe this: " },
          { "type": "input_audio", "input_audio": {
            "format": "audio/mp3",
            "data": "gs://cloud-samples-data/generative-ai/audio/pixel.mp3" } }] }] }'

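The two requests above differ only in the content parts they send, so the parts can be assembled with small builders and reused from Python. This sketch copies the part shapes from the curl samples in this section (a plain string for image_url, and format plus data for input_audio); the function names are hypothetical.

```python
from typing import Any


def text_part(text: str) -> dict[str, Any]:
    """Text content part."""
    return {"type": "text", "text": text}


def image_part(uri: str) -> dict[str, Any]:
    """Image content part, using the plain-string image_url form shown above."""
    return {"type": "image_url", "image_url": uri}


def audio_part(uri: str, audio_format: str) -> dict[str, Any]:
    """Audio content part with the format and data fields shown above."""
    return {"type": "input_audio", "input_audio": {"format": audio_format, "data": uri}}


def user_message(*parts: dict[str, Any]) -> dict[str, Any]:
    """Wrap content parts in a single user message."""
    return {"role": "user", "content": list(parts)}


audio_request_message = user_message(
    text_part("Describe this: "),
    audio_part("gs://cloud-samples-data/generative-ai/audio/pixel.mp3", "audio/mp3"),
)
print(audio_request_message)
```

The resulting dictionary could be passed in the `messages` list of `client.chat.completions.create`.
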
Multimodal function responses

Sample request:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/global/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/gemini-3-pro-preview",
    "messages": [
      { "role": "user", "content": "Show me the green shirt I ordered last month." },
      {
        "role": "assistant",
        "tool_calls": [
          {
            "extra_content": {
              "google": {
                "thought_signature": "<THOUGHT_SIGNATURE>"
              }
            },
            "function": {
              "arguments": "{\"item_name\":\"green shirt\"}",
              "name": "get_image"
            },
            "id": "function-call-a350228d-0283-4792-8bfa-40da064fb959",
            "type": "function"
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "function-call-a350228d-0283-4792-8bfa-40da064fb959",
        "content": "{\"image_ref\":{\"$ref\":\"dress.jpg\"}}",
        "extra_content": {
          "google": {
            "parts": [
              {
                "file_data": {
                  "mime_type": "image/jpg",
                  "display_name": "dress.jpg",
                  "file_uri": "gs://cloud-samples-data/generative-ai/image/dress.jpg"
                }
              }
            ]
          }
        }
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_image",
          "description": "Retrieves the image file reference for a specific order item.",
          "parameters": {
            "type": "object",
            "properties": {
              "item_name": {
                "type": "string",
                "description": "The name or description of the item ordered (e.g., \"green shirt\")."
              }
            },
            "required": [
              "item_name"
            ]
          }
        }
      }
    ]
  }'

Sample response:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Here is the image of the green shirt you ordered.",
        "role": "assistant"
      }
    }
  ],
  "created": 1770852204,
  "id": "bA-NacCPKoae_9MPsNCn6Qc",
  "model": "google/gemini-3-pro-preview",
  "object": "chat.completion",
  "system_fingerprint": "",
  "usage": {
    "completion_tokens": 16,
    "extra_properties": {
      "google": {
        "traffic_type": "ON_DEMAND"
      }
    },
    "prompt_tokens": 1139,
    "total_tokens": 1155
  }
}
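
The tool message in the request is the part that carries the file back to the model: its content references the file by display name, and extra_content.google.parts holds the matching file_data entry. The sketch below builds such a message with the same field names as the sample request; the helper name `tool_message_with_file` and the short call ID are hypothetical.

```python
import json
from typing import Any


def tool_message_with_file(
    tool_call_id: str, file_uri: str, mime_type: str, display_name: str
) -> dict[str, Any]:
    """Build a tool message whose result refers to a file attached as file_data."""
    # The "$ref" value matches display_name, as in the sample request.
    result = {"image_ref": {"$ref": display_name}}
    file_data = {"mime_type": mime_type, "display_name": display_name, "file_uri": file_uri}
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps(result, separators=(",", ":")),
        "extra_content": {"google": {"parts": [{"file_data": file_data}]}},
    }


demo_tool_message = tool_message_with_file(
    tool_call_id="call-1",  # hypothetical ID; reuse the ID from the model's tool call
    file_uri="gs://cloud-samples-data/generative-ai/image/dress.jpg",
    mime_type="image/jpg",
    display_name="dress.jpg",
)
print(demo_tool_message["content"])
```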

Structured output

Use the response_format parameter to get structured output.

Example using the SDK

from pydantic import BaseModel
from openai import OpenAI

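# OpenAI() with no arguments reads OPENAI_BASE_URL and OPENAI_API_KEY from the
# environment. Set them to the endpoint URL and access token used in the earlier
# samples, or pass base_url and api_key explicitly.
client = OpenAI()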

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="google/gemini-2.5-flash-preview-04-17",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)
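
When the request goes through REST instead of the SDK's parse helper, the structured result has to be validated by the caller. The sketch below assumes the model returns a JSON object with the same three fields as CalendarEvent in `message.content`; it uses a standard-library dataclass rather than Pydantic, and the helper name `parse_calendar_event` and the sample JSON are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class CalendarEvent:
    name: str
    date: str
    participants: list[str]


def parse_calendar_event(content: str) -> CalendarEvent:
    """Parse JSON text into a CalendarEvent, raising on missing or mistyped fields."""
    data = json.loads(content)
    participants = data["participants"]
    if not isinstance(participants, list):
        raise ValueError("participants must be a list")
    return CalendarEvent(
        name=str(data["name"]),
        date=str(data["date"]),
        participants=[str(person) for person in participants],
    )


# Hypothetical message content returned for the prompt above.
sample_content = '{"name": "Science fair", "date": "Friday", "participants": ["Alice", "Bob"]}'
event = parse_calendar_event(sample_content)
print(event)
```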

Use the global endpoint in OpenAI-compatible mode

The following sample shows how to use the global endpoint in OpenAI-compatible mode.

REST

  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/global/endpoints/openapi/chat/completions \
  -d '{
    "model": "google/gemini-2.0-flash-001",
    "messages": [
      {"role": "user",
       "content": "Hello World"
      }]
  }'
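
The regional and global URLs used on this page differ only in the host name and the location segment, so a Python client can derive the base URL from the location. This is a minimal sketch built from the URL patterns in the samples on this page; the function name `openai_base_url`, its default arguments, and the project ID are hypothetical.

```python
def openai_base_url(project_id: str, location: str = "global", api_version: str = "v1") -> str:
    """Build the OpenAI-compatible base URL for a regional or the global location."""
    if location == "global":
        host = "aiplatform.googleapis.com"  # global endpoint has no region prefix
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/{api_version}/projects/{project_id}"
        f"/locations/{location}/endpoints/openapi"
    )


global_url = openai_base_url("my-project")
regional_url = openai_base_url("my-project", location="us-central1")
print(global_url)
print(regional_url)
```

The returned string would be passed as `base_url` when constructing `openai.OpenAI`, as in the Python samples earlier on this page.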
  

What's next