Get started with Live API using WebSockets

This tutorial shows you how to connect to the Live API using WebSockets. In this tutorial, you'll set up a Google Cloud project to use the Live API with WebSockets, send an audio file to the model, and receive audio in response.

For more information about WebSockets, see the WebSockets API documentation.

Before you begin

Before you can send requests, you need to set up authentication with Vertex AI. You can either set up authentication using an API key or using application default credentials (ADC).

For this tutorial, the quickest way to get started is by using an API key:

For instructions on setting up authentication using ADC instead, see our quickstart.

Install the WebSockets library

Run the following to install the websockets library:

pip install websockets

Set up environment variables

Set environment variables for your project ID and location. Replace PROJECT_ID with your Google Cloud project ID.

export GOOGLE_CLOUD_PROJECT=PROJECT_ID
export GOOGLE_CLOUD_LOCATION=global

Start an audio session

The following example establishes a session, streams audio from a file, and prints the size of the audio chunks received in the response.

import asyncio
import websockets
import json
import base64
import os
import sys

# Replace the [PROJECT_ID] and [LOCATION] with your Google Cloud Project ID and location.
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION")

# Authentication
token_list = !gcloud auth application-default print-access-token
ACCESS_TOKEN = token_list[0]

# Configuration
MODEL_ID = "gemini-live-2.5-flash-preview-native-audio-09-2025"

# Construct the WSS URL
HOST = f"{LOCATION}-aiplatform.googleapis.com"
path = "google.cloud.aiplatform.v1.LlmBidiService/BidiGenerateContent"
URI = f"wss://{HOST}/ws/{path}"
MODEL_RESOURCE = f"projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}"


async def main():
   headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
  
   async with websockets.connect(URI, additional_headers=headers) as ws:
       print("Session established.")

       # Send Setup (Handshake)
       await ws.send(json.dumps({
           "setup": {
               "model": MODEL_RESOURCE,
               "generation_config": { "response_modalities": ["AUDIO"] }
           }
       }))

       # Define Tasks
       async def send_audio():
           # Download sample if missing
           if not os.path.exists("input.wav"):
               !wget -q https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/where_the_nearest_train_station_is.wav -O input.wav

           with open("input.wav", "rb") as f:
               while chunk := f.read(1024):
                   msg = {
                       "realtime_input": {
                           "media_chunks": [{
                               "mime_type": "audio/pcm;rate=16000",
                               "data": base64.b64encode(chunk).decode("utf-8")
                           }]
                       }
                   }
                   await ws.send(json.dumps(msg))
                   await asyncio.sleep(0.01)
           print("Done sending audio.")

       async def receive_audio():
           async for msg in ws:
               data = json.loads(msg)
               try:
                   parts = data["serverContent"]["modelTurn"]["parts"]
                   for part in parts:
                       if "inlineData" in part:
                           b64_audio = part["inlineData"]["data"]
                           print(f"Received chunk: {len(b64_audio)} bytes")
               except KeyError:
                   pass

               if data.get("serverContent", {}).get("turnComplete"):
                   print("Turn complete. Exiting.")
                   break

       # Run Together
       await asyncio.gather(send_audio(), receive_audio())

if __name__ == "__main__":
   # Check if running in Jupyter/Colab
   if "ipykernel" in sys.modules:
       await main()
   else:
       asyncio.run(main())

What's next