使用 Gemini Live API 進行非同步函式呼叫

注意： gemini-live-2.5-flash-preview-native-audio-09-2025 將於 2026 年 3 月 19 日淘汰並移除。將任何工作流程遷移至 gemini-live-2.5-flash-native-audio。

建構即時語音代理程式時，部分函式呼叫可能會封鎖模型執行作業，導致音訊串流靜音，使用者只能在無聲狀態下等待。使用 Gemini Live API 時，所有函式呼叫預設都是非封鎖，因此您可以執行函式，與主要對話流程並行。這項程序稱為「非同步函式呼叫」。後端可以在背景處理耗用大量資源的工作，例如搜尋即時航班價格或查詢複雜的外部 API，而模型會繼續聆聽、說話，並與使用者自然對話。Gemini Live API 可在背景處理函式呼叫，不會中斷使用者與模型的互動，讓互動更流暢即時。

非同步函式呼叫功能可讓您完成預約、設定提醒或擷取資料等工作，不必暫停對話。舉例來說，使用者可以要求預訂航班，並在系統於背景處理預訂作業時，立即詢問天氣資訊。

非同步函式呼叫範例

這個範例說明使用者預訂航班，並要求提供紐約時間，而 book_ticket 函式則在背景中非同步執行：

User: Please book the 2:00 PM flight to New York for me.

Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client.)
//(Right after the "book_ticket" function call is received, the client sends a text message to the model: "repeat this sentence 'I'm booking your ticket now, please wait.'")
//(The client runs the function call asynchronously in the background.)
Model: I'm booking your ticket now, please wait.

User: What is the current time in New York?

Model: The current time in New York is 12:00pm.

//(Once the book_ticket function finishes, the client sends the result.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}

Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.

實作非同步函式呼叫

本節提供一系列範例，說明如何使用 Agent Platform SDK 的 Python 版本，建構高回應速度的並行架構，並運用 Gemini Live API 的非同步函式呼叫功能。這些範例分為下列工作：

定義工具
處理訊息串流中的函式呼叫
管理使用者期望

定義工具

非同步函式呼叫功能是在模型層級啟用，因此您可以在要求設定中指定要使用的工具，就像在 Gemini Enterprise Agent Platform 呼叫中，對任何標準 Gemini API 執行這項操作一樣。這樣一來，模型就能在工具執行時繼續對話：

from google import genai
from google.genai import types

# 1. A tool that takes a long time to execute
search_live_flights = {
    "name": "search_live_flights",
    "description": "Searches airlines for current flight prices. Can take up to 10 seconds."
}

# 2. A tool that executes instantly
get_current_weather = {
    "name": "get_current_weather",
    "description": "Gets the current weather for a given city."
}

tools = [{"function_declarations": [search_live_flights, get_current_weather]}]

處理訊息串流中的函式呼叫

當模型應呼叫一或多個函式時，Gemini Live API 會透過即時訊息串流傳送 tool_call 事件。

後端不得封鎖串流，因為模型預期會持續執行。收到慢速函式 (例如 search_live_flights) 的呼叫時，您必須將其傳遞至背景工作者。如果您在 10 秒任務的主要訊息迴圈中直接使用 await，連線就會凍結。可以安全地等待快速工作 (例如 get_current_weather)。

import asyncio

async def handle_stream(session):
    async for response in session.receive():
        # Check if the model is asking to use a tool
        if response.tool_call is not None:
            for fc in response.tool_call.function_calls:

                if fc.name == "search_live_flights":
                    # Pass to a background task so we don't block the receive loop!
                    asyncio.create_task(background_flight_search(fc.id, fc.args, session))

                elif fc.name == "get_current_weather":
                    # Instant lookups can be safely awaited directly
                    await instant_weather_lookup(fc.id, fc.args, session)

管理使用者期望

為管理長時間執行的非同步函式呼叫期間的預期情況，建議用戶端發起簡訊。這則訊息應提示系統通知使用者要求正在處理中，並請他們耐心等候。舉例來說，用戶端收到函式呼叫後，可以傳送文字訊息給模型，例如：「repeat this sentence: 'I'm booking your ticket now, please wait.'」(重複這句話：「我現在正在為你訂票，請稍候。」)。

以下範例對話方塊顯示這項交換作業：

User: Please book the 2:00 PM flight to New York for me.
Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client.)
//(Right after the "book_ticket" function call is received, the client sends a text message to the model: "repeat this sentence 'I'm booking your ticket now, please wait.'")
//(The client runs the function call asynchronously in the background.)
Model: I'm booking your ticket now, please wait.
User: What is the current time in New York?
Model: The current time in New York is 12:00pm.
//(Once the "book_ticket" function call finishes, the client sends in the response.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.

這種主動傳訊策略有以下優點：

向使用者說明目前的系統作業，以便在長時間執行的函式呼叫期間管理預期行為。
減少重複的簡短使用者提示，例如「你好？」或「你在嗎？」。這類情況通常發生在系統長時間處於靜止狀態，同時處理非同步函式呼叫時。這有助於盡量避免因使用者重複查詢而觸發重複的函式呼叫。
提供額外的系統提示，可降低後續互動中建立重複通話的機率。

處理重複的函式呼叫

模型在收到第一次呼叫的回應前，可能會重複呼叫函式。如果您的用途允許，應用程式可以忽略重複的函式呼叫，前提是相同函式呼叫的回應仍在等待中。

以下範例說明用戶端如何忽略重複的函式呼叫：

User: Please book the 2:00 PM flight to New York for me.

Model: function_call: {name: "book_ticket"}
//(The "book_ticket" function call is sent to the client. It is running asynchronously in the background.)

User: What is the current time in New York?
Model: The current time in New York is 12:00pm. + function_call: {name: "book_ticket"}
//(The duplicated "book_ticket" can be ignored by the client since the response for the first "book_ticket" has not been sent to the model yet.)

//(The first "book_ticket" function call finishes, and client sends in the response.)
Function_response: {name: "book_ticket", response: {booking_status: "booked"}}

Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.

處理非同步函式回應

非同步函式呼叫完成後，應用程式會透過 function_response 將結果傳送至模型。後端處理函式呼叫 (例如搜尋航班) 時，使用者可能會向模型提出完全不同的問題，例如「倫敦天氣如何？」。模型會即時回應要求，並同時執行函式呼叫。由於使用者可能在函式執行完成時與模型互動，您可以指定政策，定義模型應如何處理這項傳入的回應。您可以指定下列其中一項政策：

SILENT
WHEN_IDLE
INTERRUPT

如要指定政策，請在 function_response 酬載中加入 scheduling 欄位：

{
  "name": "book_ticket",
  "scheduling": "WHEN_IDLE",
  "response": {
    "booking_status": "booked"
  }
}

如果省略 scheduling 欄位，Gemini Live API 會使用原始方法處理函式回應，以確保回溯相容性。

以下 Python 範例說明如何格式化及傳送 function_response，並使用 scheduling="WHEN_IDLE" 在對話自然停頓時宣布結果：

aearcync def background_flight_search(call_id, args, session):
    # 1. Simulate a slow API call taking 5 seconds
    await asyncio.sleep(5)
    flight_data = ["Air Canada AC758: $350", "WestJet WS12: $290"]

    # 2. Format the response
    function_response = types.FunctionResponse(
        id=call_id,
        name="search_live_flights",
        response={ "status": "success", "flights": flight_data },
        scheduling="WHEN_IDLE" # Wait for a moment to tell the user
    )

    # 3. Send it back into the live session
    await session.send_tool_response(function_responses=[function_response])

您可以在 scheduling 欄位中指定下列政策，管理函式回應：

SILENT 回應政策

使用 SILENT 政策時，函式回應會新增至模型的脈絡，但模型不會為此生成回應，也不會中斷任何進行中的使用者互動。

User: Please book the 2:00 PM flight to New York for me.

Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)

User: What is the current time in New York?
Model: The current time in New York is 12:00pm.

//(The book_ticket function finishes, and client sends the result with scheduling: "SILENT".)
Function_response: {name: "book_ticket", scheduling: "SILENT", response: {booking_status: "booked"}}
//(The model doesn't generate a response for the function response.)

User: Is my flight ticket booked?
Model: Yes. Your flight has been booked.

WHEN_IDLE 回應政策

如果使用 WHEN_IDLE 政策，模型只會在沒有進行中的使用者互動時，產生函式回應的回覆。如果使用者正在互動，模型會等待互動完成再生成回覆，以免中斷互動。

User: Please book the 2:00 PM flight to New York for me.

Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)

User: What is the current time in New York?

//(The book_ticket function finishes, and client sends the result with scheduling: "WHEN_IDLE".)
Function_response: {name: "book_ticket", scheduling: "WHEN_IDLE", response: {booking_status: "booked"}}
//(The ongoing interaction about the time is not interrupted.)

Model: The current time in New York is 12:00pm.
//(After responding to the user's time query, the model issues the response for the book_ticket function.)
Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.

INTERRUPT 回應政策

使用 INTERRUPT 政策時，模型會立即產生函式回應的回覆，並中斷任何進行中的使用者互動。

User: Please book the 2:00 PM flight to New York for me.

Model: function_call: {name: "book_ticket"}
//(The book_ticket function call is sent to the client and starts running asynchronously in the background.)

User: What is the current time in New York?

//(The book_ticket function finishes, and client sends the result with scheduling: "INTERRUPT".)
Function_response: {name: "book_ticket", scheduling: "INTERRUPT", response: {booking_status: "booked"}}
//(The ongoing interaction about the time is interrupted, and model skips responding to it.)

Model: Your flight has been booked. Expect a confirmation text on your phone within 5 minutes.

最佳做法

設計並行作業：一律將緩慢的工具 (例如查詢外部 API 或執行 RAG 管道) 卸載至後端的背景工作。讓模型繼續處理主動音訊串流。
除非必要，否則請避免使用 INTERRUPT：針對重大快訊使用 INTERRUPT。對於背景工作，SILENT 或 WHEN_IDLE 可提供更流暢、更友善的使用者體驗。
獨立的對話回合：在 Gemini Live API 中，工具執行作業完全獨立於對話回合。在工具於背景處理時，對話可以分支、繼續，並自然流暢地進行。
「無聲」注意事項：即使排定為 SILENT，模型有時仍可能會嘗試口頭敘述工具的執行情況。如要強制執行真正的靜音，請在系統指令中加入明確的防護措施 (例如「使用 [工具名稱] 時，請執行無聲執行作業，且不發出任何語音」)，或使用「fire-and-forget」後端模式，完全不將 FunctionResponse 傳回模型。

後續步驟

總覽

使用 Gemini Live API 進行非同步函式呼叫

非同步函式呼叫範例

實作非同步函式呼叫

定義工具

處理訊息串流中的函式呼叫

管理使用者期望

處理重複的函式呼叫

處理非同步函式回應

SILENT 回應政策

WHEN_IDLE 回應政策

INTERRUPT 回應政策

最佳做法

後續步驟

Live API 總覽

即時 API 參考指南

開始及管理直播工作階段

設定 Gemini 功能

使用 Gemini Live API 進行非同步函式呼叫 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

非同步函式呼叫範例

實作非同步函式呼叫

定義工具

處理訊息串流中的函式呼叫

管理使用者期望

處理重複的函式呼叫

處理非同步函式回應

SILENT 回應政策

WHEN_IDLE 回應政策

INTERRUPT 回應政策

最佳做法

後續步驟

Live API 總覽

即時 API 參考指南

開始及管理直播工作階段

設定 Gemini 功能

使用 Gemini Live API 進行非同步函式呼叫