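Configure Gemini capabilities

This document shows how to configure the capabilities of Gemini models when you use Gemini Live API. You can configure tool usage, such as function calling and grounding, and native audio capabilities, such as Affective Dialog and Proactive Audio.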

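Configure tool usage

Several tools are compatible with the various models that Gemini Live API supports, including:

Function calling
Grounding with Google Search
Grounding with Vertex AI RAG Engine (Preview)

To enable a tool for use in returned responses, include it in the tools list when you initialize the model. The following sections show how to use each built-in tool in your code.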

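Function calling

Use function calling when you want the model to interact with external systems or APIs that you manage, for tasks such as checking a database, sending an email, or calling a custom API.

The model generates a function call; your application executes the code and sends the result back to the model.

All functions must be declared at the start of the session by sending the tool definitions as part of the LiveConnectConfig message.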
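
A tool definition can also be written out explicitly rather than derived from a Python function. The following is a minimal sketch of what such a declaration might look like as a plain dictionary, assuming the SDK accepts dictionary-form configuration; the description text and the schema type strings are illustrative assumptions, not values taken from this guide.

```python
# Explicit declaration for the weather function used in this guide.
# Assumption: the name / description / parameters layout and the
# "OBJECT" / "STRING" type strings follow the function-declaration schema.
get_current_weather_declaration = {
    "name": "get_current_weather",
    "description": "Returns the current weather for a location.",
    "parameters": {
        "type": "OBJECT",
        "properties": {
            "location": {
                "type": "STRING",
                "description": "The city and state, e.g. San Francisco, CA",
            },
        },
        "required": ["location"],
    },
}

# The declaration is placed in the `tools` list of the session configuration,
# so the model knows about the function before the first turn.
config = {
    "response_modalities": ["AUDIO"],
    "tools": [{"function_declarations": [get_current_weather_declaration]}],
}
```

Writing the declaration by hand is mostly useful when the function lives in another service or language and there is no local Python callable to introspect.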

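To enable function calling, include function declarations in the tools list of the setup message. The following example passes the Python function itself in tools, and the SDK converts it to a function declaration: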

Python

import asyncio

from google import genai
from google.genai.types import (
    Content,
    LiveConnectConfig,
    Part,
)

# Initialize the client.
client = genai.Client(
    vertexai=True,
    project="GOOGLE_CLOUD_PROJECT",  # Replace with your project ID
    location="LOCATION",  # Replace with your location
)

MODEL_ID = "gemini-live-2.5-flash-native-audio"


def get_current_weather(location: str) -> str:
    """Example method. Returns the current weather.

    Args:
        location: The city and state, e.g. San Francisco, CA
    """
    weather_map: dict[str, str] = {
        "Boston, MA": "snowing",
        "San Francisco, CA": "foggy",
        "Seattle, WA": "raining",
        "Austin, TX": "hot",
        "Chicago, IL": "windy",
    }
    return weather_map.get(location, "unknown")


async def main():
    config = LiveConnectConfig(
        response_modalities=["AUDIO"],
        tools=[get_current_weather],
    )

    async with client.aio.live.connect(
        model=MODEL_ID,
        config=config,
    ) as session:
        text_input = "Get the current weather in Boston."
        print(f"Input: {text_input}")

        await session.send_client_content(
            turns=Content(role="user", parts=[Part(text=text_input)])
        )

        async for message in session.receive():
            if message.tool_call:
                function_responses = []
                for function_call in message.tool_call.function_calls:
                    print(f"FunctionCall > {function_call}")
                    # Execute the tool and send the response back to the model.
                    result = get_current_weather(**function_call.args)
                    function_responses.append(
                        {
                            "name": function_call.name,
                            "response": {"result": result},
                            "id": function_call.id,
                        }
                    )
                if function_responses:
                    await session.send_tool_response(function_responses=function_responses)


if __name__ == "__main__":
    asyncio.run(main())
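
The receive loop above calls get_current_weather directly, which works for a single tool. With several tools, one option is a small dispatch table that looks up the function by the name the model requested. The sketch below is one possible shape for that, not part of the SDK: TOOL_REGISTRY and build_function_response are hypothetical names, and reporting failures under an "error" key is an illustrative choice.

```python
from typing import Any, Callable, Dict, Optional


def get_current_weather(location: str) -> str:
    """Stand-in for the weather function from the example above."""
    return {"Boston, MA": "snowing"}.get(location, "unknown")


# Hypothetical registry: maps the tool name the model uses to a local callable.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_current_weather": get_current_weather,
}


def build_function_response(
    name: str, args: Optional[Dict[str, Any]], call_id: Optional[str]
) -> Dict[str, Any]:
    """Runs the named tool and returns a response dict in the shape used above."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        payload = {"error": f"unknown function: {name}"}
    else:
        try:
            payload = {"result": handler(**(args or {}))}
        except Exception as exc:  # Report tool failures instead of crashing the session.
            payload = {"error": str(exc)}
    return {"name": name, "response": payload, "id": call_id}
```

Inside the loop, each function_call would then become build_function_response(function_call.name, function_call.args, function_call.id), and the collected list would be passed to send_tool_response as before.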

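For examples of using function calling in system instructions, see the best practices example.

Grounding with Google Search

Use Grounding with Google Search when you want the model to give more accurate, factual responses that are grounded in verifiable sources of information, for tasks such as searching the web.

Unlike function calling, the server-side integration handles information retrieval automatically.

To enable Grounding with Google Search, include google_search in the tools list of the setup message: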

Python

import asyncio

from google import genai
from google.genai.types import (
    Content,
    LiveConnectConfig,
    Part,
)

# Initialize the client.
client = genai.Client(
    vertexai=True,
    project="GOOGLE_CLOUD_PROJECT",  # Replace with your project ID
    location="LOCATION",  # Replace with your location
)

MODEL_ID = "gemini-live-2.5-flash-native-audio"


async def main():
    config = LiveConnectConfig(
        response_modalities=["AUDIO"],
        tools=[{"google_search": {}}],
    )

    async with client.aio.live.connect(
        model=MODEL_ID,
        config=config,
    ) as session:
        text_input = "What is the current weather in Toronto, Canada?"
        print(f"Input: {text_input}")

        await session.send_client_content(
            turns=Content(role="user", parts=[Part(text=text_input)])
        )

        async for message in session.receive():
            # Consume the messages from the model.
            # In native audio, the model response is in audio format.
            pass


if __name__ == "__main__":
    asyncio.run(main())
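
The loop above discards the model's audio. If you want to keep it, one approach is to collect the raw audio bytes from each message and wrap them in a WAV container. The sketch below uses only the standard library and assumes the output is 16-bit mono PCM at 24 kHz; that format, and the helper name pcm_chunks_to_wav, are assumptions to verify against the model you use.

```python
import io
import wave
from typing import Iterable, Optional

# Assumed output format: 16-bit mono PCM at 24 kHz. Verify before relying on it.
SAMPLE_RATE_HZ = 24000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_chunks_to_wav(chunks: Iterable[Optional[bytes]]) -> bytes:
    """Wraps raw PCM chunks in a WAV container and returns the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            if chunk:  # Skip messages that carry no audio.
                wav_file.writeframes(chunk)
    return buffer.getvalue()
```

In the receive loop you would append each message's audio bytes to a list (assuming the message exposes them, for example through a data attribute) and call pcm_chunks_to_wav on that list once the turn is complete.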

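Grounding with Vertex AI RAG Engine

You can use Vertex AI RAG Engine with the Live API to ground responses and to store and retrieve context, for tasks such as retrieving information from a document corpus. Like Grounding with Google Search, RAG grounding is handled on the server side and automatically retrieves information from the corpus that you specify: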

Python

import asyncio

from google import genai
from google.genai.types import (
    Content,
    LiveConnectConfig,
    Part,
    Retrieval,
    Tool,
    VertexRagStore,
    VertexRagStoreRagResource,
)

# Initialize the client.
client = genai.Client(
    vertexai=True,
    project="GOOGLE_CLOUD_PROJECT",  # Replace with your project ID
    location="LOCATION",  # Replace with your location
)

MODEL_ID = "gemini-live-2.5-flash-native-audio"


async def main():
    rag_store = VertexRagStore(
        rag_resources=[
            VertexRagStoreRagResource(
                rag_corpus="RESOURCE_NAME"  # Replace with your corpus resource name
            )
        ],
        # Set `store_context` to True to allow the Live API to store context in your memory corpus.
        store_context=True,
    )

    config = LiveConnectConfig(
        response_modalities=["AUDIO"],
        tools=[Tool(retrieval=Retrieval(vertex_rag_store=rag_store))],
    )

    async with client.aio.live.connect(
        model=MODEL_ID,
        config=config,
    ) as session:
        text_input = "YOUR_TEXT_INPUT"
        print(f"Input: {text_input}")

        await session.send_client_content(
            turns=Content(role="user", parts=[Part(text=text_input)])
        )

        async for message in session.receive():
            # Consume the messages from the model.
            # In native audio, the model response is in audio format.
            pass


if __name__ == "__main__":
    asyncio.run(main())

For more information, see "Use Vertex AI RAG Engine in Gemini Live API".
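
The RESOURCE_NAME placeholder in the example stands for the full resource name of a RAG corpus. As a rough sketch, assuming corpus names follow the pattern projects/PROJECT/locations/LOCATION/ragCorpora/CORPUS_ID, a small helper can build the name and catch a malformed one before a session is opened. Both function names are hypothetical, and the pattern should be checked against the RAG Engine documentation.

```python
import re

# Assumed resource-name pattern for a RAG corpus; verify against the
# Vertex AI RAG Engine documentation before relying on it.
_RAG_CORPUS_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/ragCorpora/[^/]+$"
)


def rag_corpus_name(project: str, location: str, corpus_id: str) -> str:
    """Builds a corpus resource name from its parts."""
    return f"projects/{project}/locations/{location}/ragCorpora/{corpus_id}"


def is_rag_corpus_name(name: str) -> bool:
    """Returns True if the string matches the assumed corpus name pattern."""
    return bool(_RAG_CORPUS_PATTERN.match(name))
```

The resulting string is what would be passed as rag_corpus in VertexRagStoreRagResource.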

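Configure native audio capabilities

Models that have native audio capabilities support the following features:

Configure Affective Dialog

When Affective Dialog is enabled, the model attempts to understand the user's tone of voice and emotional expression and to respond accordingly.

To enable Affective Dialog, set enable_affective_dialog to true in the setup message: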

Python

from google.genai.types import LiveConnectConfig

# Enable Affective Dialog for the session.
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,
)

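Configure Proactive Audio

Proactive Audio lets you control when the model responds. For example, you can ask Gemini to respond only when prompted or when specific topics are discussed. For a video demonstration of Proactive Audio, see "Gemini Live API Native Audio Preview".

To enable Proactive Audio, configure the proactivity field in the setup message and set proactive_audio to true: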

Python

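from google.genai.types import LiveConnectConfig, ProactivityConfig

# Enable Proactive Audio for the session.
config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)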

Example conversation

The following is an example of a conversation with Gemini about cooking:

Prompt: "You are an AI assistant in Italian cooking; only chime in when the topic is about Italian cooking."

Speaker A: "I really love cooking!" (No response from Gemini.)

Speaker B: "Oh yes, me too! My favorite is French cuisine." (No response from Gemini.)

Speaker A: "I really like Italian food; do you know how to make a pizza?"

(Italian cooking topic will trigger response from Gemini.)
Gemini Live API: "I'd be happy to help! Here's a recipe for a pizza."
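
The prompt in this conversation would typically be supplied as a system instruction alongside the Proactive Audio setting. The following is a minimal sketch of such a configuration as a plain dictionary, assuming the SDK accepts dictionary-form configuration and a plain string for system_instruction; build_proactive_config is a hypothetical helper name.

```python
from typing import Any, Dict


def build_proactive_config(system_instruction: str) -> Dict[str, Any]:
    """Returns a session config that pairs Proactive Audio with a topic prompt."""
    return {
        "response_modalities": ["AUDIO"],
        # The prompt tells the model which topics it should respond to.
        "system_instruction": system_instruction,
        # Proactive Audio lets the model stay silent on other topics.
        "proactivity": {"proactive_audio": True},
    }


config = build_proactive_config(
    "You are an AI assistant in Italian cooking; "
    "only chime in when the topic is about Italian cooking."
)
```

The resulting dictionary would be passed as the config argument when the session is opened, in place of the LiveConnectConfig object shown earlier.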

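Common use cases

When you use Proactive Audio, Gemini behaves as follows:

Responds with minimal latency: Gemini responds after the user has finished speaking, which reduces interruptions and helps Gemini keep the conversation context if an interruption occurs.
Avoids interruptions: Proactive Audio helps Gemini ignore background noise and external chatter, and prevents Gemini from responding to external chatter that occurs during a conversation.
Handles interruptions: If the user needs to interrupt while Gemini is responding, Proactive Audio makes it easier for Gemini to recognize a genuine interruption and respond appropriately, rather than reacting to filler words such as "umm" or "uhh".
Co-listens to audio: Gemini can listen along to an audio file that is not the speaker's voice and answer questions about that audio later in the conversation.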

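Billing

While Gemini is listening to a conversation, input audio tokens are charged.

Output audio tokens are charged only when Gemini responds. If Gemini does not respond or stays silent, no output audio tokens are charged.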
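
To make the billing rule concrete, the sketch below estimates token counts for a session in which Gemini listens for some time and speaks for some time. The tokens_per_second value is a placeholder argument, not a published rate, and the function and class names are hypothetical; take real rates and prices from the pricing page.

```python
from dataclasses import dataclass


@dataclass
class AudioUsageEstimate:
    input_tokens: int
    output_tokens: int


def estimate_audio_tokens(
    listening_seconds: int, responding_seconds: int, tokens_per_second: int
) -> AudioUsageEstimate:
    """Estimates audio tokens: input accrues while listening, output only while responding."""
    return AudioUsageEstimate(
        input_tokens=listening_seconds * tokens_per_second,
        output_tokens=responding_seconds * tokens_per_second,
    )


# Ten minutes of listening during which Gemini stayed silent.
# The rate of 25 is a placeholder value chosen only for illustration.
silent_session = estimate_audio_tokens(600, 0, tokens_per_second=25)
```

In this sketch a silent session still accrues input tokens for the full listening time, while its output token count stays at zero.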

For more information, see "Vertex AI pricing".

Next steps

For more information about using Gemini Live API, see the Gemini Live API documentation.
