Built-in tools with the Live API

Models that support the Live API have built-in support for the following tools: function calling, code execution, grounding with Google Search, and grounding with Vertex AI RAG Engine.

To enable a specific tool for use in returned responses, include the name of the tool in the tools list when you initialize the model. The following sections provide examples of how to use each of the built-in tools in your code.

Supported models

You can use the Live API with the following models:

Model version                                 Availability level
gemini-live-2.5-flash                         Private GA*
gemini-live-2.5-flash-preview-native-audio    Public preview

*Reach out to your Google account team representative to request access.

Function calling

Use function calling to create a description of a function, then pass that description to the model in a request. The response from the model includes the name of a function that matches the description and the arguments to call it with.

All functions must be declared at the start of the session by sending tool definitions as part of the LiveConnectConfig message.

To enable function calling, include function_declarations in the tools list:

Gen AI SDK for Python

import asyncio
from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

# Simple function definitions
turn_on_the_lights = {"name": "turn_on_the_lights"}
turn_off_the_lights = {"name": "turn_off_the_lights"}

tools = [{"function_declarations": [turn_on_the_lights, turn_off_the_lights]}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Turn on the lights please"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)
            elif chunk.tool_call:
                function_responses = []
                for fc in chunk.tool_call.function_calls:
                    function_response = types.FunctionResponse(
                        name=fc.name,
                        response={ "result": "ok" } # simple, hard-coded function response
                    )
                    function_responses.append(function_response)

                await session.send_tool_response(function_responses=function_responses)


if __name__ == "__main__":
    asyncio.run(main())
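
The example above declares two no-argument functions and returns a hard-coded result for each call. For functions that take arguments, the same declarations can be written with the SDK's typed classes, including a parameter schema that the model fills in when it calls the function. A minimal sketch, assuming a hypothetical set_light_brightness tool (the name, description, and schema are illustrative, not part of the original example):

from google.genai import types

# Hypothetical tool: the model supplies `level` based on this schema.
set_light_brightness = types.FunctionDeclaration(
    name="set_light_brightness",
    description="Set the brightness of the lights.",
    parameters=types.Schema(
        type="OBJECT",
        properties={
            "level": types.Schema(
                type="INTEGER",
                description="Brightness from 0 to 100",
            ),
        },
        required=["level"],
    ),
)

config = types.LiveConnectConfig(
    response_modalities=["TEXT"],
    tools=[types.Tool(function_declarations=[set_light_brightness])],
)

When the corresponding tool_call arrives, run your real implementation and return its result in the FunctionResponse rather than the hard-coded {"result": "ok"}.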
  


Code execution

You can use code execution with the Live API to generate and execute Python code directly. To enable code execution for your responses, include code_execution in the tools list:

Gen AI SDK for Python

import asyncio
from google import genai
from google.genai import types


client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

tools = [{'code_execution': {}}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Compute the largest prime palindrome under 100000."
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)
            
                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)

                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

if __name__ == "__main__":
    asyncio.run(main())
  

Grounding with Google Search

You can use Grounding with Google Search with the Live API by including google_search in the tools list:

Gen AI SDK for Python

import asyncio
from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"


tools = [{'google_search': {}}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "When did the last Brazil vs. Argentina soccer match happen?"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)

                # The model might generate and execute Python code to use Search
                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)

                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

if __name__ == "__main__":
    asyncio.run(main())
  

Grounding with Vertex AI RAG Engine (Preview)

You can use Vertex AI RAG Engine with the Live API to ground, store, and retrieve contexts:

from google import genai
from google.genai import types
from google.genai.types import (Content, LiveConnectConfig, HttpOptions, Modality, Part)
from IPython import display

PROJECT_ID=YOUR_PROJECT_ID
LOCATION=YOUR_LOCATION
TEXT_INPUT=YOUR_TEXT_INPUT
MODEL_NAME="gemini-live-2.5-flash"

client = genai.Client(
   vertexai=True,
   project=PROJECT_ID,
   location=LOCATION,
)

rag_store=types.VertexRagStore(
   rag_resources=[
       types.VertexRagStoreRagResource(
           rag_corpus=<Your corpus resource name>  # Use memory corpus if you want to store context.
       )
   ],
   # Set `store_context` to true to allow the Live API to sink context into your memory corpus.
   store_context=True
)

async with client.aio.live.connect(
   model=MODEL_NAME,
   config=LiveConnectConfig(response_modalities=[Modality.TEXT],
                            tools=[types.Tool(
                                retrieval=types.Retrieval(
                                    vertex_rag_store=rag_store))]),
) as session:
   text_input=TEXT_INPUT
   print("> ", text_input, "\n")
   await session.send_client_content(
       turns=Content(role="user", parts=[Part(text=text_input)])
   )

   async for message in session.receive():
       if message.text:
           display.display(display.Markdown(message.text))
           continue

For more information, see Use Vertex AI RAG Engine in the Gemini Live API.

Native audio (Public preview)

Gemini 2.5 Flash with Live API introduces native audio, which enhances the standard Live API features. Native audio provides richer and more natural voice interactions through 30 HD voices in 24 languages. It also includes two new features exclusive to native audio: proactive audio and affective dialog.

Use proactive audio

Proactive audio allows the model to respond only when it's relevant. When enabled, the model proactively generates text transcripts and audio responses, but only for queries directed at the device. Non-device-directed queries are ignored.

To use proactive audio, configure the proactivity field in the setup message and set proactive_audio to true:

Gen AI SDK for Python

from google.genai.types import LiveConnectConfig, ProactivityConfig

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)
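
The config can then be used when opening a session. A minimal sketch, assuming the same client setup as the earlier examples and the native audio preview model from the table above; reading the returned audio bytes from message.data is also an assumption of this sketch:

import asyncio
from google import genai
from google.genai.types import LiveConnectConfig, ProactivityConfig

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
# Proactive audio is exclusive to native audio, so use the native audio model.
model = "gemini-live-2.5-flash-preview-native-audio"

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        await session.send_client_content(
            turns={"parts": [{"text": "Hello, are you there?"}]}
        )
        audio_chunks = []
        async for message in session.receive():
            # Inline audio bytes, if any, arrive on the message data field.
            if message.data is not None:
                audio_chunks.append(message.data)

if __name__ == "__main__":
    asyncio.run(main())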
  

Use affective dialog

Affective dialog allows models that use Live API native audio to better understand and respond appropriately to users' emotional expressions, leading to more nuanced conversations.

To enable affective dialog, set enable_affective_dialog to true in the setup message:

Gen AI SDK for Python

from google.genai.types import LiveConnectConfig

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,
)
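
Since both native audio features are fields of the same LiveConnectConfig, they can presumably be combined in a single setup message; the snippet below is a sketch under that assumption:

from google.genai.types import LiveConnectConfig, ProactivityConfig

# Assumed combination of the two native-audio features in one config.
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
    enable_affective_dialog=True,
)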
  

More information

To learn more about using the Live API, see: