This tutorial shows you how to connect to the Live API using WebSockets. You'll set up a Google Cloud project to use the Live API with WebSockets, stream audio from a file to the model, and receive audio in response.
For more information about WebSockets, see the WebSockets API documentation.
Before you begin
Before you can send requests, you need to set up authentication with Vertex AI. You can authenticate with either an API key or application default credentials (ADC).
For this tutorial, the quickest way to get started is with an API key:
If you're new to Google Cloud, get an express mode API key.
If you already have a Google Cloud project, get a Google Cloud API key that's bound to a service account. Binding an API key to a service account is possible only if it's enabled in the organization policy settings. If you can't enable this setting, use application default credentials instead.
For instructions on setting up authentication with ADC instead, see our quickstart. Note that the code sample in this tutorial fetches its access token from ADC, so ADC must be configured before you run it.
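If you haven't configured ADC yet, one common way to do so on a local machine (assuming the gcloud CLI is installed) is:
gcloud auth application-default login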
Install the WebSockets library
Run the following command to install the websockets library:
pip install websockets
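The code sample passes request headers through the additional_headers argument of websockets.connect. That argument name is used by recent releases of the library; older releases call it extra_headers instead. You can check which version you have installed:
python -c "import websockets; print(websockets.__version__)"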
Set up environment variables
Set environment variables for your project ID and location. Replace PROJECT_ID with your
Google Cloud project ID.
export GOOGLE_CLOUD_PROJECT=PROJECT_ID
export GOOGLE_CLOUD_LOCATION=global
Start an audio session
The following example establishes a session, streams audio from a file, and prints the size of the audio chunks received in the response.
import asyncio
import base64
import json
import os
import subprocess
import urllib.request

import websockets
# The project ID and location are read from the environment variables you set earlier.
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION")
# Authentication: fetch a short-lived access token from your application
# default credentials via the gcloud CLI.
result = subprocess.run(
    ["gcloud", "auth", "application-default", "print-access-token"],
    capture_output=True, text=True, check=True)
ACCESS_TOKEN = result.stdout.strip()
# Configuration
MODEL_ID = "gemini-live-2.5-flash-preview-native-audio-09-2025"
# Construct the WSS URL
HOST = f"{LOCATION}-aiplatform.googleapis.com"
path = "google.cloud.aiplatform.v1.LlmBidiService/BidiGenerateContent"
URI = f"wss://{HOST}/ws/{path}"
MODEL_RESOURCE = f"projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}"
async def main():
headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
async with websockets.connect(URI, additional_headers=headers) as ws:
print("Session established.")
# Send Setup (Handshake)
await ws.send(json.dumps({
"setup": {
"model": MODEL_RESOURCE,
"generation_config": { "response_modalities": ["AUDIO"] }
}
}))
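        # The server replies to this setup message with a `setupComplete`
        # acknowledgement before streaming content. The receive loop below
        # skips messages that don't contain model audio, so no explicit
        # wait is needed here.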
# Define Tasks
async def send_audio():
            # Download the 16 kHz sample audio file if it isn't present.
            if not os.path.exists("input.wav"):
                urllib.request.urlretrieve(
                    "https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/where_the_nearest_train_station_is.wav",
                    "input.wav",
                )
with open("input.wav", "rb") as f:
while chunk := f.read(1024):
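                    # JSON can't carry raw bytes, so each audio chunk is
                    # base64-encoded before it's sent as realtime input.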
msg = {
"realtime_input": {
"media_chunks": [{
"mime_type": "audio/pcm;rate=16000",
"data": base64.b64encode(chunk).decode("utf-8")
}]
}
}
await ws.send(json.dumps(msg))
await asyncio.sleep(0.01)
print("Done sending audio.")
async def receive_audio():
async for msg in ws:
data = json.loads(msg)
try:
parts = data["serverContent"]["modelTurn"]["parts"]
for part in parts:
if "inlineData" in part:
b64_audio = part["inlineData"]["data"]
print(f"Received chunk: {len(b64_audio)} bytes")
except KeyError:
pass
if data.get("serverContent", {}).get("turnComplete"):
print("Turn complete. Exiting.")
break
        # Run the sender and receiver concurrently; the session ends when
        # receive_audio() exits on turnComplete.
await asyncio.gather(send_audio(), receive_audio())
if __name__ == "__main__":
    # asyncio.run() starts a fresh event loop. In Jupyter or Colab, where a
    # loop is already running, call `await main()` in a cell instead.
    asyncio.run(main())
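The example above only prints the size of each audio chunk. To actually hear the reply, collect the decoded bytes and write them to a WAV file. The sketch below is a minimal illustration, not part of the API: the save_wav helper is hypothetical, and it assumes the response audio is 16-bit mono PCM at 24 kHz; adjust the parameters if your model's output format differs.
import wave

def save_wav(chunks, path="output.wav"):
    # Assumes 16-bit mono PCM at 24 kHz; adjust if your output differs.
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 16-bit samples
        wf.setframerate(24000)  # assumed output sample rate
        wf.writeframes(b"".join(chunks))
To use it, change receive_audio() to append base64.b64decode(part["inlineData"]["data"]) to a list instead of printing, then call save_wav() on that list once the turn completes.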