To use the OpenAI Python libraries with Vertex AI, you need to authenticate using Google credentials and configure your client to use a Vertex AI endpoint. This document shows you how to set up your environment and authenticate using two different methods.
Before you begin
Prerequisites
1. Install SDKs
Install the OpenAI and Google Auth SDKs:
```shell
pip install openai google-auth requests
```
2. Set up authentication
To authenticate to Vertex AI, set up Application Default Credentials (ADC). For more information, see Set up Application Default Credentials.
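For local development, ADC is typically configured with the Google Cloud CLI; the following command is one common way to do so (see the linked guide for other environments, such as service accounts on production hosts):

```shell
# Opens a browser to authenticate as your user account and writes
# Application Default Credentials to a well-known location on disk.
gcloud auth application-default login
```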
3. Identify your endpoint
Your endpoint depends on the type of model you are calling:
- Gemini models: Use `openapi` as the endpoint ID.
- Self-deployed models: Certain models in Model Garden and supported Hugging Face models must be deployed before they can serve requests. When calling these models, specify the unique endpoint ID of your deployment.
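In both cases, the endpoint ID becomes the last path segment of the client's base URL. A minimal sketch of how the URL is assembled (the `vertex_base_url` helper and the project, location, and endpoint ID values are illustrative placeholders, not part of any SDK):

```python
project_id = "PROJECT_ID"  # placeholder: your Google Cloud project ID
location = "us-central1"   # placeholder: a supported region


def vertex_base_url(project_id: str, location: str, endpoint_id: str) -> str:
    """Build the OpenAI-compatible base URL for a Vertex AI endpoint."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/endpoints/{endpoint_id}"
    )


# Gemini models share the fixed "openapi" endpoint ID.
print(vertex_base_url(project_id, location, "openapi"))

# Self-deployed models use the endpoint ID of your deployment (hypothetical value).
print(vertex_base_url(project_id, location, "1234567890"))
```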
Authentication methods
You can authenticate by either configuring the client object directly in your code or by setting environment variables. Choose the method that best suits your use case.
| Method | Advantages | Disadvantages | Best for |
|---|---|---|---|
| Client setup | Programmatic and flexible. Allows for dynamic credential management within the application. | Requires more code to manage credentials and endpoints directly. | Applications that need to manage multiple clients or refresh credentials dynamically. |
| Environment variables | Simple setup that separates configuration from code. Ideal for local development and testing. | Less secure if not managed properly. Less flexible for dynamic credential changes. | Quickstarts, local development, and containerized deployments where environment variables are standard. |
Client setup
You can programmatically get Google credentials and configure the OpenAI client in your Python code. By default, access tokens expire after one hour. For long-running applications, see how to refresh your credentials.
```python
import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())
# Note: the credential lives for 1 hour by default
# (https://cloud.google.com/docs/authentication/token-types#at-lifetime);
# after expiration, it must be refreshed.

##############################
# Choose one of the following:
##############################

# If you are calling a Gemini model, set the ENDPOINT_ID variable to use openapi.
ENDPOINT_ID = "openapi"

# If you are calling a self-deployed model from Model Garden, set the
# ENDPOINT_ID variable and set the client's base URL to use your endpoint.
# ENDPOINT_ID = "YOUR_ENDPOINT_ID"

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/{ENDPOINT_ID}",
    api_key=credentials.token,
)
```
Environment variables
You can use the Google Cloud CLI to get an access token. The OpenAI library automatically reads the `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables to configure the default client.
1. Set common environment variables:

   ```shell
   export PROJECT_ID=PROJECT_ID
   export LOCATION=LOCATION
   export OPENAI_API_KEY="$(gcloud auth application-default print-access-token)"
   ```

2. Set the base URL for your model type:

   - For a Gemini model:

     ```shell
     export MODEL_ID=MODEL_ID
     export OPENAI_BASE_URL="https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/openapi"
     ```

   - For a self-deployed model from Model Garden:

     ```shell
     export ENDPOINT=ENDPOINT_ID
     export OPENAI_BASE_URL="https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/${ENDPOINT}"
     ```

3. Initialize the client. The client uses the environment variables you set:

   ```python
   client = openai.OpenAI()
   ```
By default, access tokens expire after one hour. You will need to periodically refresh your token and update the `OPENAI_API_KEY` environment variable.
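If your script tracks the token's expiry, it can re-export the variable before requests start failing. A minimal sketch of the expiry check (the `token_is_fresh` helper and the 60-second safety margin are illustrative assumptions, not part of google-auth; google-auth reports expiry as a naive UTC datetime, e.g. `credentials.expiry`):

```python
from datetime import datetime, timedelta, timezone


def token_is_fresh(expiry: datetime, margin_seconds: int = 60) -> bool:
    """Return True if the token is still valid, with a safety margin.

    `expiry` is a naive UTC datetime, as reported by google-auth's
    `credentials.expiry` attribute.
    """
    now = datetime.now(timezone.utc).replace(tzinfo=None)
    return now + timedelta(seconds=margin_seconds) < expiry


# Example: a token issued for one hour, checked immediately, is still fresh.
one_hour_out = datetime.now(timezone.utc).replace(tzinfo=None) + timedelta(hours=1)
print(token_is_fresh(one_hour_out))
```

When the check returns `False`, fetch a new token (for example, by re-running `gcloud auth application-default print-access-token`) and update `OPENAI_API_KEY` before the next request.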
Refresh your credentials
Access tokens obtained from Application Default Credentials expire after one hour. For long-running services or applications, you should implement a mechanism to refresh the token. The following example shows a wrapper class that automatically refreshes the token when it expires.
```python
from typing import Any

import google.auth
import google.auth.transport.requests
import openai


class OpenAICredentialsRefresher:
    def __init__(self, **kwargs: Any) -> None:
        # Set a placeholder key here
        self.client = openai.OpenAI(**kwargs, api_key="PLACEHOLDER")
        self.creds, self.project = google.auth.default(
            scopes=["https://www.googleapis.com/auth/cloud-platform"]
        )

    def __getattr__(self, name: str) -> Any:
        if not self.creds.valid:
            self.creds.refresh(google.auth.transport.requests.Request())

            if not self.creds.valid:
                raise RuntimeError("Unable to refresh auth")

            self.client.api_key = self.creds.token

        return getattr(self.client, name)


# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# location = "us-central1"

client = OpenAICredentialsRefresher(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
)

response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response)
```
What's next
- See examples of calling the Chat Completions API with the OpenAI-compatible syntax.
- See examples of calling the Inference API with the OpenAI-compatible syntax.
- See examples of calling the Function Calling API with the OpenAI-compatible syntax.
- Learn more about the Gemini API.