OpenAI 호환성

Gemini 모델은 REST API와 함께 OpenAI 라이브러리(Python 및 TypeScript/JavaScript)를 사용하여 액세스할 수 있습니다. Vertex AI에서 OpenAI 라이브러리를 사용하려면 Google Cloud Auth만 지원됩니다. 아직 OpenAI 라이브러리를 사용하고 있지 않다면 Gemini API를 직접 호출하는 것이 좋습니다.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.0-flash-001",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain to me how AI works"}
  ]
)

print(response.choices[0].message)

변경사항

api_key=credentials.token: Google Cloud 인증을 사용하려면 샘플 코드를 사용하여Google Cloud 인증 토큰을 가져옵니다.
base_url: OpenAI 라이브러리에 기본 URL 대신 Google Cloud에 요청을 전송하도록 지시합니다.
model="google/gemini-2.0-flash-001": Vertex에서 호스팅하는 모델 중 호환되는 Gemini 모델을 선택합니다.

사고

Gemini 2.5 모델은 복잡한 문제를 해결하도록 학습되어 추론 능력이 크게 향상되었습니다. Gemini API에는 모델이 얼마나 사고할지 세부적으로 제어할 수 있는 '사고 예산' 파라미터가 제공됩니다.

Gemini API와 달리 OpenAI API는 '낮음', '중간', '높음'의 세 가지 사고 제어 수준을 제공하며, 이들은 백그라운드에서 1,000, 8,000, 24,000 사고 토큰 예산에 매핑됩니다.

추론 노력을 전혀 지정하지 않는 것은 사고 예산을 지정하지 않는 것과 같습니다.

OpenAI 호환 API에서 사고 예산 및 기타 사고 관련 구성을 보다 직접적으로 제어하려면 extra_body.google.thinking_config를 활용하세요.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# # Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.5-flash",
  reasoning_effort="low",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {
          "role": "user",
          "content": "Explain to me how AI works"
      }
  ]
)
print(response.choices[0].message)

스트리밍

Gemini API는 스트리밍 응답을 지원합니다.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)
response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  stream=True
)

for chunk in response:
  print(chunk.choices[0].delta)

함수 호출

함수 호출을 사용하면 생성형 모델에서 구조화된 데이터 출력을 더 쉽게 가져올 수 있는데, 이는 Gemini API에서 지원됩니다.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. Chicago, IL",
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
      },
    }
  }
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=messages,
  tools=tools,
  tool_choice="auto"
)

print(response)

이미지 이해

Gemini 모델은 네이티브 멀티모달이며 다양한 일반적인 비전 작업에서 동급 최고의 성능을 제공합니다.

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
# base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url":  f"data:image/jpeg;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)

print(response.choices[0])

이미지 생성

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
base64_image = encode_image("/content/wayfairsofa.jpg")

response = client.chat.completions.create(
  model="google/gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url":  f"data:image/jpeg;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)

print(response.choices[0])

오디오 이해

오디오 입력 분석:

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
  model="gemini-2.0-flash",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Transcribe this audio",
        },
        {
              "type": "input_audio",
              "input_audio": {
                "data": base64_audio,
                "format": "wav"
          }
        }
      ],
    }
  ],
)

print(response.choices[0].message.content)

구조화된 출력

Gemini 모델은 내가 정의한 구조로 JSON 객체를 출력할 수 있습니다.

Python

from google.auth import default
import google.auth.transport.requests

from pydantic import BaseModel
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "global"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

class CalendarEvent(BaseModel):
  name: str
  date: str
  participants: list[str]

completion = client.beta.chat.completions.parse(
  model="google/gemini-2.0-flash",
  messages=[
      {"role": "system", "content": "Extract the event information."},
      {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
  ],
  response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

현재 제한사항

액세스 토큰은 기본적으로 1시간 동안 유효합니다. 만료 후에는 다시 인증해야 합니다. 자세한 내용은 이 코드 예시를 참고하세요.
기능 지원을 확대하고는 있지만, OpenAI 라이브러리 지원은 아직 프리뷰 버전입니다. 질문이나 문제가 있으면 Google Cloud 커뮤니티에 게시하세요.

다음 단계

Google 생성형 AI 라이브러리를 사용하여 Gemini의 잠재력을 활용하기
OpenAI 호환 문법으로 Chat Completions API를 사용하는 추가 예시 참고하기
개요 페이지에서 지원되는 Gemini 모델 및 파라미터 확인하기

OpenAI 호환성 컬렉션을 사용해 정리하기 내 환경설정을 기준으로 콘텐츠를 저장하고 분류하세요.

Python

사고

Python

스트리밍

Python

함수 호출

Python

이미지 이해

Python

이미지 생성

Python

오디오 이해

Python

구조화된 출력

Python

현재 제한사항

다음 단계

OpenAI 호환성