To use the OpenAI Python libraries with Vertex AI, you need to authenticate using Google credentials and configure your client to use a Vertex AI endpoint. This document shows you how to set up your environment and authenticate using two different methods.
Before you begin
Prerequisites
1. Install SDKs
Install the OpenAI and Google Auth SDKs:
```shell
pip install openai google-auth requests
```
2. Set up authentication
To authenticate to Vertex AI, set up Application Default Credentials (ADC). For more information, see Set up Application Default Credentials.
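For local development, ADC is typically configured with the Google Cloud CLI; the following command is one common way to do so (see the linked guide for other environments, such as service accounts on production hosts):

```shell
# Opens a browser to authenticate as your user account and writes
# Application Default Credentials to a well-known location on disk.
gcloud auth application-default login
```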
3. Identify your endpoint
Your endpoint depends on the type of model you are calling:
- Gemini models: Use `openapi` as the endpoint ID.
- Self-deployed models: Certain models in Model Garden and supported Hugging Face models must be deployed before they can serve requests. When calling these models, specify the unique endpoint ID of your deployment.
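In both cases, the endpoint ID becomes the last path segment of the client's base URL. A minimal sketch of how the URL is assembled (the `vertex_base_url` helper and the project, location, and endpoint ID values are illustrative placeholders, not part of any SDK):

```python
project_id = "PROJECT_ID"  # placeholder: your Google Cloud project ID
location = "us-central1"   # placeholder: a supported region


def vertex_base_url(project_id: str, location: str, endpoint_id: str) -> str:
    """Build the OpenAI-compatible base URL for a Vertex AI endpoint."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/endpoints/{endpoint_id}"
    )


# Gemini models share the fixed "openapi" endpoint ID.
print(vertex_base_url(project_id, location, "openapi"))

# Self-deployed models use the endpoint ID of your deployment (hypothetical value).
print(vertex_base_url(project_id, location, "1234567890"))
```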
Authentication methods
You can authenticate by either configuring the client object directly in your code or by setting environment variables. Choose the method that best suits your use case.
| Method | Advantages | Disadvantages | Best for |
|---|---|---|---|
| Client setup | Programmatic and flexible. Allows for dynamic credential management within the application. | Requires more code to manage credentials and endpoints directly. | Applications that need to manage multiple clients or refresh credentials dynamically. |
| Environment variables | Simple setup that separates configuration from code. Ideal for local development and testing. | Less secure if not managed properly. Less flexible for dynamic credential changes. | Quickstarts, local development, and containerized deployments where environment variables are standard. |
Client setup
You can programmatically get Google credentials and configure the OpenAI client in your Python code. By default, access tokens expire after one hour. For long-running applications, see how to refresh your credentials.
```python
import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())
# Note: the credential lives for 1 hour by default
# (https://cloud.google.com/docs/authentication/token-types#at-lifetime);
# after expiration, it must be refreshed.

##############################
# Choose one of the following:
##############################

# If you are calling a Gemini model, set the ENDPOINT_ID variable to use openapi.
ENDPOINT_ID = "openapi"

# If you are calling a self-deployed model from Model Garden, set the
# ENDPOINT_ID variable and set the client's base URL to use your endpoint.
# ENDPOINT_ID = "YOUR_ENDPOINT_ID"

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/{ENDPOINT_ID}",
    api_key=credentials.token,
)
```
Environment variables
You can use the Google Cloud CLI to get an access token. The OpenAI library automatically reads the `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables to configure the default client.
1. Set common environment variables:

   ```shell
   export PROJECT_ID=PROJECT_ID
   export LOCATION=LOCATION
   export OPENAI_API_KEY="$(gcloud auth application-default print-access-token)"
   ```

2. Set the base URL for your model type:

   - For a Gemini model:

     ```shell
     export MODEL_ID=MODEL_ID
     export OPENAI_BASE_URL="https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/openapi"
     ```

   - For a self-deployed model from Model Garden:

     ```shell
     export ENDPOINT=ENDPOINT_ID
     export OPENAI_BASE_URL="https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${ENDPOINT}"
     ```

3. Initialize the client. The client uses the environment variables you set:

   ```python
   client = openai.OpenAI()
   ```
By default, access tokens expire after one hour. You will need to periodically refresh your token and update the `OPENAI_API_KEY` environment variable.
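If your script tracks the token's expiry, it can re-export the variable before requests start failing. A minimal sketch of the expiry check (the `token_is_fresh` helper and the 60-second safety margin are illustrative assumptions, not part of google-auth; google-auth reports expiry as a naive UTC datetime, e.g. `credentials.expiry`):

```python
from datetime import datetime, timedelta, timezone


def token_is_fresh(expiry: datetime, margin_seconds: int = 60) -> bool:
    """Return True if the token is still valid, with a safety margin.

    `expiry` is a naive UTC datetime, as reported by google-auth's
    `credentials.expiry` attribute.
    """
    now = datetime.now(timezone.utc).replace(tzinfo=None)
    return now + timedelta(seconds=margin_seconds) < expiry


# Example: a token issued for one hour, checked immediately, is still fresh.
one_hour_out = datetime.now(timezone.utc).replace(tzinfo=None) + timedelta(hours=1)
print(token_is_fresh(one_hour_out))
```

When the check returns `False`, fetch a new token (for example, by re-running `gcloud auth application-default print-access-token`) and update `OPENAI_API_KEY` before the next request.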
Refresh your credentials
Access tokens obtained from Application Default Credentials expire after one hour. For long-running services or applications, you should implement a mechanism to refresh the token. The following example shows a wrapper class that automatically refreshes the token when it expires.
```python
from typing import Any

import google.auth
import google.auth.transport.requests
import openai


class OpenAICredentialsRefresher:
    def __init__(self, **kwargs: Any) -> None:
        # Set a placeholder key here
        self.client = openai.OpenAI(**kwargs, api_key="PLACEHOLDER")
        self.creds, self.project = google.auth.default(
            scopes=["https://www.googleapis.com/auth/cloud-platform"]
        )

    def __getattr__(self, name: str) -> Any:
        if not self.creds.valid:
            self.creds.refresh(google.auth.transport.requests.Request())

            if not self.creds.valid:
                raise RuntimeError("Unable to refresh auth")

            self.client.api_key = self.creds.token

        return getattr(self.client, name)


# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

client = OpenAICredentialsRefresher(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response)
```
What's next
- See examples of calling the Chat Completions API with the OpenAI-compatible syntax.
- See examples of calling the Inference API with the OpenAI-compatible syntax.
- See examples of calling the Function Calling API with the OpenAI-compatible syntax.
- Learn more about the Gemini API.