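When you build real-time voice agents, some function calls can block model execution, which silences the audio stream and leaves the user waiting in silence. With Gemini Live API, all function calls are non-blocking by default, so you can run functions in parallel with the main conversation flow. This process is called asynchronous function calling. Your backend can handle heavy tasks in the background (for example, searching live flight prices or querying a complex external API) while the model continues to listen, speak, and converse naturally with the user.
With asynchronous function calling, you can complete tasks such as booking appointments, setting reminders, or fetching data without pausing the conversation. For example, a user can ask to book a flight and then immediately ask about the weather while the booking is processed in the background.
Asynchronous function calling example
This example shows a user booking a flight and then asking for the time in New York while the book_ticket function runs asynchronously in the background: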
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client.)
//(Right after the "book_ticket" function call is received, the client sends a text message to the model: "repeat this sentence 'I'm booking your ticket now, please wait.'")
//(The client runs the function call asynchronously in the background.)
Model: I'm booking your ticket now, please wait.
User: What is the current time in New York?
Model: The current time in New York is 12:00pm.
//(Once the book_ticket function finishes, the client sends the result.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.
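Implement asynchronous function calling
This section provides a series of examples that use the Python version of the Agent Platform SDK to build a highly responsive, concurrent architecture on top of the asynchronous function calling feature of Gemini Live API. The examples are organized by task in the sections that follow.
Define the tools
Asynchronous function calling is enabled at the model level, so you specify the tools to use in the request configuration, the same way you would for any standard Gemini API call in Gemini Enterprise Agent Platform. This lets the model continue the conversation while your tools execute: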
from google import genai
from google.genai import types

# 1. A tool that takes a long time to execute
search_live_flights = {
    "name": "search_live_flights",
    "description": "Searches airlines for current flight prices. Can take up to 10 seconds.",
}

# 2. A tool that executes instantly
get_current_weather = {
    "name": "get_current_weather",
    "description": "Gets the current weather for a given city.",
}

tools = [{"function_declarations": [search_live_flights, get_current_weather]}]
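Handle function calls from the message stream
When the model decides that one or more functions should be called, Gemini Live API sends a tool_call event through the real-time message stream.
Your backend must not block this stream, because the model needs to keep running. When you receive a call to a slow function (for example, search_live_flights), you must hand it off to a background worker. If you await a 10-second task directly in the main message loop, you freeze the connection. Fast tasks (for example, get_current_weather) can safely be awaited.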
import asyncio

async def handle_stream(session):
    async for response in session.receive():
        # Check if the model is asking to use a tool
        if response.tool_call is not None:
            for fc in response.tool_call.function_calls:
                if fc.name == "search_live_flights":
                    # Pass to a background task so we don't block the receive loop!
                    asyncio.create_task(background_flight_search(fc.id, fc.args, session))
                elif fc.name == "get_current_weather":
                    # Instant lookups can be safely awaited directly
                    await instant_weather_lookup(fc.id, fc.args, session)
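To make the difference between awaiting and offloading concrete, the following is a minimal, self-contained sketch that uses only asyncio. It does not call Gemini Live API: fake_stream is a hypothetical stand-in for session.receive(), and slow_tool is a hypothetical stand-in for a slow backend call. It measures how long the loop takes to reach the message that follows a tool call.

```python
import asyncio
import time

async def slow_tool():
    # Hypothetical stand-in for a slow backend call such as search_live_flights.
    await asyncio.sleep(0.3)
    return "flights"

async def fake_stream():
    # Hypothetical stand-in for session.receive(): a tool call, then two audio messages.
    for message in ["tool_call", "audio_1", "audio_2"]:
        yield message
        await asyncio.sleep(0.01)

async def receive_loop(offload):
    start = time.monotonic()
    background = set()   # keep references so tasks are not garbage collected
    handled_at = {}      # message -> seconds elapsed when the loop reached it
    async for message in fake_stream():
        if message == "tool_call":
            if offload:
                # Schedule the slow call and return to the loop immediately.
                task = asyncio.create_task(slow_tool())
                background.add(task)
                task.add_done_callback(background.discard)
            else:
                # Awaiting here stalls the loop until the slow call finishes.
                await slow_tool()
        else:
            handled_at[message] = time.monotonic() - start
    # Let any background work finish before returning.
    await asyncio.gather(*background)
    return handled_at
```

With offload=True the loop reaches audio_1 almost immediately, whereas with offload=False it reaches audio_1 only after the slow call has finished. Keeping the tasks in a set is a common asyncio precaution, because the event loop holds only weak references to tasks.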
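Manage user expectations
To manage expectations during long-running asynchronous function calls, we recommend that the client send a text message to the model. The message should prompt the model to tell the user that the request is being processed and to ask the user to wait. For example, right after the client receives the function call, it can send the model a text message such as: "Repeat this sentence: 'I'm booking your ticket now, please wait.'"
The following example conversation shows this exchange: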
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client.)
//(Right after the "book_ticket" function call is received, the client sends a text message to the model: "repeat this sentence 'I'm booking your ticket now, please wait.'")
//(The client runs the function call asynchronously in the background.)
Model: I'm booking your ticket now, please wait.
User: What is the current time in New York?
Model: The current time in New York is 12:00pm.
//(Once the "book_ticket" function call finishes, the client sends in the response.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.
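This proactive messaging strategy has the following benefits:
- It tells the user what the system is currently doing, which manages expectations during long-running function calls.
- It reduces the frequency of redundant short user prompts (such as "Hello?" or "Is anyone there?"), which typically occur when the system stays silent for a long time while an asynchronous function call is processed. This minimizes the risk that these repeated user queries trigger duplicate function calls.
- The additional prompt gives the model more context, which lowers the likelihood of duplicate calls in subsequent interactions.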
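The flow described above (send a holding prompt first, then run the function in the background) could be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's API: RecordingSession and its send_text method are hypothetical stand-ins for a live session, and HOLDING_PROMPTS and run_tool are made-up names. In a real client, you would replace send_text with whichever session method you use to send text to the model.

```python
import asyncio

# Hypothetical table of holding prompts, keyed by function name.
HOLDING_PROMPTS = {
    "book_ticket": "Repeat this sentence: 'I'm booking your ticket now, please wait.'",
}

class RecordingSession:
    """Hypothetical stand-in for a live session; it only records sent text."""

    def __init__(self):
        self.sent = []

    async def send_text(self, text):
        self.sent.append(text)

async def run_tool(name, args):
    # Hypothetical stand-in for the real booking backend.
    await asyncio.sleep(0.05)
    return {"booking_status": "booked"}

async def on_function_call(session, name, args):
    # 1. Ask the model to tell the user that work is in progress.
    prompt = HOLDING_PROMPTS.get(name)
    if prompt is not None:
        await session.send_text(prompt)
    # 2. Run the function in the background and hand back the task.
    return asyncio.create_task(run_tool(name, args))

async def demo():
    session = RecordingSession()
    task = await on_function_call(session, "book_ticket", {"time": "2:00 PM"})
    sent_before_result = list(session.sent)  # prompt goes out before the result exists
    result = await task
    return sent_before_result, result
```

Sending the prompt before creating the task keeps the ordering explicit: the model hears the holding instruction first, and the function result arrives later as a separate function response.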
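Handle duplicate function calls
There is a small chance that the model generates a duplicate function call before it receives the response to the first call. If your use case allows it, your application can ignore a duplicate function call while the response to the same function call is still pending.
The following example shows how the client can ignore a duplicate function call: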
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client. It is running asynchronously in the background.)
User: What is the current time in New York?
Model: The current time in New York is 12:00pm. + function_call: {name: "book_ticket"}
//(The duplicated "book_ticket" can be ignored by the client since the response for the first "book_ticket" has not been sent to the model yet.)
//(The first "book_ticket" function call finishes, and client sends in the response.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.
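One way a client might implement this is to track the calls whose responses have not yet been sent and skip any call that matches a pending one. The sketch below is a minimal illustration. The PendingCalls class is hypothetical, and treating two calls as "the same" when their name and arguments match is an assumption that you should adapt to your use case.

```python
import json

class PendingCalls:
    """Tracks function calls whose responses have not yet been sent to the model."""

    def __init__(self):
        self._pending = set()

    @staticmethod
    def _key(name, args):
        # Assumption: two calls are duplicates if name and arguments match.
        return (name, json.dumps(args or {}, sort_keys=True))

    def try_start(self, name, args=None):
        """Return True if the call should run, False if it duplicates a pending call."""
        key = self._key(name, args)
        if key in self._pending:
            return False
        self._pending.add(key)
        return True

    def finish(self, name, args=None):
        """Call this after the function response has been sent to the model."""
        self._pending.discard(self._key(name, args))

# Usage, mirroring the conversation above.
pending = PendingCalls()
first = pending.try_start("book_ticket", {"destination": "New York"})
duplicate = pending.try_start("book_ticket", {"destination": "New York"})
pending.finish("book_ticket", {"destination": "New York"})
after_finish = pending.try_start("book_ticket", {"destination": "New York"})
```

Here the first call is accepted, the duplicate is rejected while the first is pending, and an identical call is accepted again once the first response has been sent.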
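Handle asynchronous function responses
When an asynchronous function call completes, your application sends the result to the model in a function_response. While your backend is processing a function call (for example, searching for flights), the user might ask the model a completely different question, such as "What's the weather like in London?". The model responds to that request in real time, in parallel with the function call execution. Because the user might be interacting with the model when the function finishes, you can specify a policy that defines how the model handles the incoming response. You can specify one of the following policies: SILENT, WHEN_IDLE, or INTERRUPT.
To specify a policy, add the scheduling field to the function_response payload: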
{
  "name": "book_ticket",
  "scheduling": "WHEN_IDLE",
  "response": {
    "booking_status": "booked"
  }
}
If you omit the scheduling field, Gemini Live API handles the function response with its original method, for backward compatibility.
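As a rough illustration of the payload shape described above, a small helper could build the dictionary and add scheduling only when a policy is given. The build_function_response helper is hypothetical and produces a plain dictionary that mirrors the JSON above. It is not an SDK call.

```python
# The three policies named in this document.
VALID_SCHEDULING = {"SILENT", "WHEN_IDLE", "INTERRUPT"}

def build_function_response(name, response, scheduling=None):
    """Build a function_response payload as a plain dict (hypothetical helper)."""
    payload = {"name": name, "response": response}
    if scheduling is not None:
        if scheduling not in VALID_SCHEDULING:
            raise ValueError(f"Unknown scheduling policy: {scheduling!r}")
        payload["scheduling"] = scheduling
    # When scheduling is None, the field is omitted entirely.
    return payload

with_policy = build_function_response(
    "book_ticket", {"booking_status": "booked"}, scheduling="WHEN_IDLE"
)
without_policy = build_function_response("book_ticket", {"booking_status": "booked"})
```

Leaving the key out when no policy is given, rather than setting it to a null value, matches the "omit the scheduling field" wording above.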
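The following Python example shows how to format and send a function_response with scheduling="WHEN_IDLE", which waits for a natural pause in the conversation before announcing the result: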
async def background_flight_search(call_id, args, session):
    # 1. Simulate a slow API call taking 5 seconds
    await asyncio.sleep(5)
    flight_data = ["Air Canada AC758: $350", "WestJet WS12: $290"]

    # 2. Format the response
    function_response = types.FunctionResponse(
        id=call_id,
        name="search_live_flights",
        response={"status": "success", "flights": flight_data},
        scheduling="WHEN_IDLE",  # Wait for a pause in the conversation to tell the user
    )

    # 3. Send it back into the live session
    await session.send_tool_response(function_responses=[function_response])
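You can specify the following policies in the scheduling field to manage function responses:
SILENT response policy
With the SILENT policy, the function response is added to the model's context, but the model doesn't generate a response for it, and any ongoing user interaction isn't interrupted.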
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)
User: What is the current time in New York?
Model: The current time in New York is 12:00pm.
//(The book_ticket function finishes, and client sends the result with scheduling: "SILENT".)
Function_response: {name: "book_ticket", scheduling: "SILENT", response: {booking_status: "booked"}}
//(The model doesn't generate a response for the function response.)
User: Is my flight ticket booked?
Model: Yes. Your flight has been booked.
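WHEN_IDLE response policy
With the WHEN_IDLE policy, the model generates a response to the function response only when there is no active user interaction. If a user interaction is in progress, the model waits for it to finish before it generates the response, to avoid interrupting.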
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)
User: What is the current time in New York?
//(The book_ticket function finishes, and client sends the result with scheduling: "WHEN_IDLE".)
Function_response: {name: "book_ticket", scheduling: "WHEN_IDLE", response: {booking_status: "booked"}}
//(The ongoing interaction about the time is not interrupted.)
Model: The current time in New York is 12:00pm.
//(After responding to the user's time query, the model issues the response for the book_ticket function.)
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.
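INTERRUPT response policy
With the INTERRUPT policy, the model generates a response to the function response immediately, interrupting any ongoing user interaction.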
User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)
User: What is the current time in New York?
//(The book_ticket function finishes, and client sends the result with scheduling: "INTERRUPT".)
Function_response: {name: "book_ticket", scheduling: "INTERRUPT", response: {booking_status: "booked"}}
//(The ongoing interaction about the time is interrupted, and model skips responding to it.)
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.
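Best practices
- Design for concurrency: Always offload slow tools (for example, querying an external API or running a RAG pipeline) to background tasks in your backend. Let the model keep handling the active audio stream.
- Avoid INTERRUPT unless necessary: Reserve INTERRUPT for important alerts. For background tasks, SILENT or WHEN_IDLE provides a smoother, more polite user experience.
- Independent chat turns: In Gemini Live API, tool execution is completely independent of chat turns. While your tool runs in the background, the conversation can branch, continue, and flow naturally.
- The "silent" caveat: Even when a response is scheduled as SILENT, the model might sometimes still try to narrate the tool's execution out loud. To enforce true silence, add explicit guardrails to the system instructions (for example, "When using [tool name], execute silently and do not speak"), or use a fire-and-forget backend pattern in which you don't send a FunctionResponse to the model at all.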
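The practices above could be combined in a small per-tool policy table, sketched below. Everything in it is hypothetical: the tool names other than search_live_flights, the TOOL_SCHEDULING table, the choice of WHEN_IDLE as the default for unlisted tools, and the send callback, which stands in for whatever sends a function response to the model. A value of None marks a fire-and-forget tool for which no response is sent.

```python
import asyncio

# Hypothetical policy table: tool name -> scheduling policy, or None for fire-and-forget.
TOOL_SCHEDULING = {
    "search_live_flights": "WHEN_IDLE",   # background task: wait for a pause
    "prefetch_user_profile": "SILENT",    # add to context without speaking
    "fraud_alert": "INTERRUPT",           # important alert: speak immediately
    "log_analytics_event": None,          # fire-and-forget: send nothing
}

async def finish_tool(name, result, send):
    """Send the result with the tool's policy, or skip sending entirely."""
    # Assumption: unlisted tools default to WHEN_IDLE.
    policy = TOOL_SCHEDULING.get(name, "WHEN_IDLE")
    if policy is None:
        return None  # fire-and-forget: the model never sees a response
    payload = {"name": name, "scheduling": policy, "response": result}
    await send(payload)
    return payload

async def demo():
    sent = []

    async def send(payload):
        # Hypothetical stand-in for sending a function response to the model.
        sent.append(payload)

    await finish_tool("search_live_flights", {"status": "success"}, send)
    await finish_tool("log_analytics_event", {"status": "logged"}, send)
    await finish_tool("fraud_alert", {"status": "flagged"}, send)
    return sent
```

Keeping the policy choice in one table makes it easy to review which tools are allowed to interrupt the user and which never report back to the model.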