The following tools are built into models that support the Live API:

- Function calling
- Code execution
- Grounding with Google Search
- Grounding with Vertex AI RAG Engine (Preview)

To enable a specific tool for use in returned responses, include the tool's name in the `tools` list when you initialize the model. The following sections provide code examples for each of the built-in tools.
Supported models
You can use the Live API with the following models:
Model version | Availability level
---|---
gemini-live-2.5-flash | Private GA*
gemini-live-2.5-flash-preview-native-audio | Public Preview
* Contact your Google account team representative to request access.
Function calling
Use function calling to create a description of a function, then pass that description to the model in a request. The response from the model includes the name of a function that matches the description and the arguments to call it with.

All functions must be declared at the start of the session by sending tool definitions as part of the `LiveConnectConfig` message.

To enable function calling, include `function_declarations` in the `tools` list:
Gen AI SDK for Python
import asyncio

from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

# Simple function definitions
turn_on_the_lights = {"name": "turn_on_the_lights"}
turn_off_the_lights = {"name": "turn_off_the_lights"}

tools = [{"function_declarations": [turn_on_the_lights, turn_off_the_lights]}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Turn on the lights please"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)
            elif chunk.tool_call:
                function_responses = []
                for fc in chunk.tool_call.function_calls:
                    function_response = types.FunctionResponse(
                        name=fc.name,
                        response={"result": "ok"},  # simple, hard-coded function response
                    )
                    function_responses.append(function_response)
                await session.send_tool_response(function_responses=function_responses)

if __name__ == "__main__":
    asyncio.run(main())
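The simple declarations above take no arguments. When a function accepts parameters, you describe them with an OpenAPI-style schema in the declaration so the model can populate the arguments (`fc.args`) in its tool call. A minimal sketch follows; the function name and parameter are hypothetical, not part of the example above:

# Hypothetical declaration with a parameter schema; the model fills in
# matching arguments on each tool call.
set_light_brightness = {
    "name": "set_light_brightness",
    "description": "Set the brightness of the room lights.",
    "parameters": {
        "type": "OBJECT",
        "properties": {
            "brightness": {
                "type": "NUMBER",
                "description": "Target brightness from 0 to 100.",
            },
        },
        "required": ["brightness"],
    },
}

# Inside the receive loop, read the populated arguments:
# for fc in chunk.tool_call.function_calls:
#     brightness = fc.args["brightness"]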
Code execution
You can use code execution with the Live API to generate and execute Python code directly. To enable code execution for your responses, include `code_execution` in the `tools` list:
Gen AI SDK for Python
import asyncio

from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

tools = [{"code_execution": {}}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Compute the largest prime palindrome under 100000."
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)

                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)
                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

if __name__ == "__main__":
    asyncio.run(main())
Grounding with Google Search

You can use Grounding with Google Search with the Live API by including `google_search` in the `tools` list:
Gen AI SDK for Python
import asyncio

from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

tools = [{"google_search": {}}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "When did the last Brazil vs. Argentina soccer match happen?"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)

                # The model might generate and execute Python code to use Search
                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)
                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

if __name__ == "__main__":
    asyncio.run(main())
Grounding with Vertex AI RAG Engine (Preview)

You can use Vertex AI RAG Engine with the Live API for grounding, storing, and retrieving contexts:
from google import genai
from google.genai import types
from google.genai.types import (Content, LiveConnectConfig, HttpOptions, Modality, Part)
from IPython import display

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_LOCATION
TEXT_INPUT = YOUR_TEXT_INPUT
MODEL_NAME = "gemini-live-2.5-flash"

client = genai.Client(
    vertexai=True,
    project=PROJECT_ID,
    location=LOCATION,
)

rag_store = types.VertexRagStore(
    rag_resources=[
        types.VertexRagStoreRagResource(
            rag_corpus=<Your corpus resource name>  # Use a memory corpus if you want to store context.
        )
    ],
    # Set `store_context` to true to allow the Live API to sink context into your memory corpus.
    store_context=True,
)

async with client.aio.live.connect(
    model=MODEL_NAME,
    config=LiveConnectConfig(
        response_modalities=[Modality.TEXT],
        tools=[types.Tool(
            retrieval=types.Retrieval(
                vertex_rag_store=rag_store))],
    ),
) as session:
    text_input = TEXT_INPUT
    print("> ", text_input, "\n")
    await session.send_client_content(
        turns=Content(role="user", parts=[Part(text=text_input)])
    )

    async for message in session.receive():
        if message.text:
            display.display(display.Markdown(message.text))
            continue
For more information, see Use Vertex AI RAG Engine in the Gemini Live API.
Native audio (Public preview)

Gemini 2.5 Flash with Live API introduces native audio, which enhances the standard Live API capabilities. Native audio delivers richer, more natural voice interactions through 30 HD voices in 24 languages. It also includes two new features exclusive to native audio: proactive audio and affective dialog.
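As a starting point, the following is a minimal sketch (not from the official samples) of opening a native-audio session with a specific HD voice selected through `speech_config`; the voice name, language code, and placeholder project variables are illustrative assumptions:

import asyncio

from google import genai
from google.genai.types import (
    LiveConnectConfig,
    Modality,
    PrebuiltVoiceConfig,
    SpeechConfig,
    VoiceConfig,
)

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,  # placeholder
    location=GOOGLE_CLOUD_LOCATION,  # placeholder
)
model = "gemini-live-2.5-flash-preview-native-audio"

config = LiveConnectConfig(
    response_modalities=[Modality.AUDIO],
    speech_config=SpeechConfig(
        voice_config=VoiceConfig(
            # "Aoede" is an illustrative voice name; use any supported HD voice.
            prebuilt_voice_config=PrebuiltVoiceConfig(voice_name="Aoede")
        ),
        language_code="en-US",
    ),
)

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        await session.send_client_content(
            turns={"parts": [{"text": "Say hello in one sentence."}]}
        )
        async for chunk in session.receive():
            if chunk.data is not None:
                ...  # raw audio bytes; feed them to your audio player

if __name__ == "__main__":
    asyncio.run(main())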
Use proactive audio

Proactive audio lets the model respond only when it's relevant. When enabled, the model proactively generates text transcripts and audio responses, but only for queries directed at the device. Queries that aren't directed at the device are ignored.

To use proactive audio, configure the `proactivity` field in the setup message and set `proactive_audio` to `true`:
Gen AI SDK for Python
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)
Use affective dialog

Affective dialog lets models that use Live API native audio better understand and respond appropriately to users' emotional expressions, resulting in more nuanced conversations.

To enable affective dialog, set `enable_affective_dialog` to `true` in the setup message:
Gen AI SDK for Python
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,
)
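Both native-audio-only features can be enabled on the same session. A minimal combined sketch, assuming the native-audio preview model from the table above and the same placeholder project variables as the earlier examples:

import asyncio

from google import genai
from google.genai.types import LiveConnectConfig, ProactivityConfig

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,  # placeholder
    location=GOOGLE_CLOUD_LOCATION,  # placeholder
)
model = "gemini-live-2.5-flash-preview-native-audio"

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
    enable_affective_dialog=True,
)

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        # Stream microphone audio in and play model audio out here; with
        # proactive audio enabled, the model stays silent for input that
        # isn't directed at it.
        ...

if __name__ == "__main__":
    asyncio.run(main())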
More information

To learn more about using the Live API, see: