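Configure Gemini capabilities

This document describes how to configure the capabilities of Gemini models when you use the Gemini Live API. You can configure tool use, such as function calling and grounding, and native audio capabilities, such as affective dialog and proactive audio.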

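Configure tool use

Several tools are compatible with the models that the Gemini Live API supports, including:

Function calling
Grounding with Google Search
Grounding with Vertex AI RAG Engine (Preview)

To enable a particular tool for use in returned responses, include the name of the tool in the tools list when you initialize the model. The following sections show how to use each of the built-in tools in your code.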

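Function calling

Use function calling to create a description of a function, then pass that description to the model in a request. The model's response includes the name of a function that matches the description and the arguments to call it with.

All functions must be declared at the start of the session by sending tool definitions as part of the LiveConnectConfig message.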
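As a rough illustration of what a function description can contain beyond a bare name, the following sketch declares a function with a description and a parameter schema. The function name `set_light_brightness` and its `brightness` parameter are hypothetical, and the schema layout assumes the OpenAPI-style format that function declarations generally follow.

```python
# Hypothetical declaration: the name, description, and parameter are
# illustrative only and are not taken from the examples in this document.
set_light_brightness = {
    "name": "set_light_brightness",
    "description": "Set the brightness of the lights as a percentage.",
    "parameters": {
        "type": "OBJECT",
        "properties": {
            "brightness": {
                "type": "INTEGER",
                "description": "Brightness level from 0 to 100.",
            },
        },
        "required": ["brightness"],
    },
}

# Declarations are grouped under function_declarations in the tools list.
tools = [{"function_declarations": [set_light_brightness]}]
config = {"response_modalities": ["TEXT"], "tools": tools}
```

A clear description and a typed parameter schema give the model more to work with when it decides whether to call the function and which arguments to supply.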

To enable function calling, include function_declarations in the tools list in the setup message:

Python

import asyncio
from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"

# Simple function definitions
turn_on_the_lights = {"name": "turn_on_the_lights"}
turn_off_the_lights = {"name": "turn_off_the_lights"}

tools = [{"function_declarations": [turn_on_the_lights, turn_off_the_lights]}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "Turn on the lights please"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)
            elif chunk.tool_call:
                function_responses = []
                for fc in chunk.tool_call.function_calls:
                    function_response = types.FunctionResponse(
                        name=fc.name,
                        response={ "result": "ok" } # simple, hard-coded function response
                    )
                    function_responses.append(function_response)

                await session.send_tool_response(function_responses=function_responses)


if __name__ == "__main__":
    asyncio.run(main())

For an example of using function calling in system instructions, see the best practices example.
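The preceding example answers every tool call with a hard-coded result. One possible way to route tool calls to real Python functions is sketched below. The `HANDLERS` table, the handler bodies, and `run_function_calls` are hypothetical helpers, and `SimpleNamespace` objects stand in for the function call objects that the session returns (assumed here to expose `name` and `args`).

```python
from types import SimpleNamespace


def turn_on_the_lights():
    # Placeholder implementation; a real handler would control a device.
    return {"lights": "on"}


def turn_off_the_lights():
    return {"lights": "off"}


# Hypothetical lookup table from declared function names to local callables.
HANDLERS = {
    "turn_on_the_lights": turn_on_the_lights,
    "turn_off_the_lights": turn_off_the_lights,
}


def run_function_calls(function_calls):
    """Run each requested function and return (name, result) pairs."""
    results = []
    for fc in function_calls:
        handler = HANDLERS.get(fc.name)
        if handler is None:
            # Report unknown names back instead of raising.
            result = {"error": f"unknown function: {fc.name}"}
        else:
            result = handler(**(fc.args or {}))
        results.append((fc.name, result))
    return results


# Stand-in for chunk.tool_call.function_calls in a live session.
fake_calls = [SimpleNamespace(name="turn_on_the_lights", args={})]
print(run_function_calls(fake_calls))
```

In a live session, each pair could then be wrapped in types.FunctionResponse(name=..., response=...) and sent with session.send_tool_response, as in the example above.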

Grounding with Google Search

To use Grounding with Google Search with the Gemini Live API, include google_search in the tools list in the setup message:

Python

import asyncio
from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project=GOOGLE_CLOUD_PROJECT,
    location=GOOGLE_CLOUD_LOCATION,
)
model = "gemini-live-2.5-flash"


tools = [{'google_search': {}}]
config = {"response_modalities": ["TEXT"], "tools": tools}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        prompt = "When did the last Brazil vs. Argentina soccer match happen?"
        await session.send_client_content(turns={"parts": [{"text": prompt}]})

        async for chunk in session.receive():
            if chunk.server_content:
                if chunk.text is not None:
                    print(chunk.text)

                # The model might generate and execute Python code to use Search
                model_turn = chunk.server_content.model_turn
                if model_turn:
                    for part in model_turn.parts:
                        if part.executable_code is not None:
                            print(part.executable_code.code)

                        if part.code_execution_result is not None:
                            print(part.code_execution_result.output)

if __name__ == "__main__":
    asyncio.run(main())
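The example above prints each chunk as it arrives. If you would rather assemble the full answer before displaying it, a helper along the following lines might work. This is a simplified, synchronous sketch: `collect_turn_text` is a hypothetical helper, the chunks are simulated with `SimpleNamespace`, and it assumes that each chunk exposes `text` and that `server_content.turn_complete` marks the end of the model's turn.

```python
from types import SimpleNamespace


def collect_turn_text(chunks):
    """Join the text of streamed chunks until the turn is marked complete."""
    parts = []
    for chunk in chunks:
        if getattr(chunk, "text", None):
            parts.append(chunk.text)
        server_content = getattr(chunk, "server_content", None)
        if server_content is not None and getattr(server_content, "turn_complete", False):
            break  # Ignore anything that arrives after the turn ends.
    return "".join(parts)


# Simulated chunks standing in for what session.receive() yields.
fake_chunks = [
    SimpleNamespace(text="Hello, ", server_content=SimpleNamespace(turn_complete=False)),
    SimpleNamespace(text="world", server_content=SimpleNamespace(turn_complete=False)),
    SimpleNamespace(text=None, server_content=SimpleNamespace(turn_complete=True)),
]
print(collect_turn_text(fake_chunks))  # Hello, world
```

In an actual session the loop would use `async for` over session.receive() instead of iterating over a list.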

Grounding with Vertex AI RAG Engine

You can use Vertex AI RAG Engine with the Live API to ground responses and to store and retrieve context:

Python

from google import genai
from google.genai import types
from google.genai.types import (Content, LiveConnectConfig, HttpOptions, Modality, Part)
from IPython import display

PROJECT_ID=YOUR_PROJECT_ID
LOCATION=YOUR_LOCATION
TEXT_INPUT=YOUR_TEXT_INPUT
MODEL_NAME="gemini-live-2.5-flash"

client = genai.Client(
   vertexai=True,
   project=PROJECT_ID,
   location=LOCATION,
)

rag_store=types.VertexRagStore(
   rag_resources=[
       types.VertexRagStoreRagResource(
           # Placeholder: replace with your RAG corpus resource name.
           # Use a memory corpus if you want to store context.
           rag_corpus=YOUR_RAG_CORPUS
       )
   ],
   # Set `store_context` to True to let the Live API sink context into your memory corpus.
   store_context=True
)

async with client.aio.live.connect(
   model=MODEL_NAME,
   config=LiveConnectConfig(response_modalities=[Modality.TEXT],
                            tools=[types.Tool(
                                retrieval=types.Retrieval(
                                    vertex_rag_store=rag_store))]),
) as session:
   text_input=TEXT_INPUT
   print("> ", text_input, "\n")
   await session.send_client_content(
       turns=Content(role="user", parts=[Part(text=text_input)])
   )

   async for message in session.receive():
       if message.text:
           display.display(display.Markdown(message.text))
           continue

For more information, see "Use Vertex AI RAG Engine in Gemini Live API".
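The `rag_corpus` field expects the resource name of a corpus. As a small sketch, the following hypothetical helper builds that name, assuming it follows the usual Vertex AI resource name pattern of project, location, and corpus ID. The helper name and the sample values are illustrative only.

```python
def rag_corpus_name(project_id: str, location: str, corpus_id: str) -> str:
    """Build a RAG corpus resource name (assumed pattern, see lead-in)."""
    return f"projects/{project_id}/locations/{location}/ragCorpora/{corpus_id}"


# Sample values are placeholders, not real resources.
print(rag_corpus_name("my-project", "us-central1", "1234567890"))
```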

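Configure native audio capabilities

Models that have native audio capabilities support the following features:

Configure affective dialog

When affective dialog is enabled, the model attempts to understand the user's tone of voice and emotional expression and to respond accordingly.

To enable affective dialog, set enable_affective_dialog to true in the setup message: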

Python

from google.genai.types import LiveConnectConfig

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,
)

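Configure proactive audio

Proactive audio lets you control when the model responds. For example, you can ask Gemini to respond only when prompted or when specific topics are discussed. For a video demonstration of proactive audio, see "Gemini Live API Native Audio Preview".

To enable proactive audio, configure the proactivity field in the setup message and set proactive_audio to true: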

Python

from google.genai.types import LiveConnectConfig, ProactivityConfig

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    proactivity=ProactivityConfig(proactive_audio=True),
)

Example conversation

The following is an example of a conversation with Gemini about cooking:

Prompt: "You are an AI assistant in Italian cooking; only chime in when the topic is about Italian cooking."

Speaker A: "I really love cooking!" (No response from Gemini.)

Speaker B: "Oh yes, me too! My favorite is French cuisine." (No response from Gemini.)

Speaker A: "I really like Italian food; do you know how to make a pizza?"

(Italian cooking topic will trigger response from Gemini.)
Gemini Live API: "I'd be happy to help! Here's a recipe for a pizza."
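A configuration for a conversation like this one might look like the following sketch. It assumes that the prompt is supplied through a `system_instruction` entry and that the setup message is written as a plain dictionary, as in the tool examples earlier in this document. The constant name is hypothetical.

```python
# The prompt from the example conversation, supplied as a system instruction.
SYSTEM_INSTRUCTION = (
    "You are an AI assistant in Italian cooking; only chime in when the "
    "topic is about Italian cooking."
)

# Dictionary form of the setup message (assumed to mirror LiveConnectConfig).
config = {
    "response_modalities": ["AUDIO"],
    "system_instruction": SYSTEM_INSTRUCTION,
    "proactivity": {"proactive_audio": True},
}
```

With proactive audio enabled, the system instruction is what tells Gemini which topics warrant a reply.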

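Common use cases

When you use proactive audio, Gemini behaves as follows:

Responds with minimal latency: Gemini responds after the user has finished speaking, which reduces interruptions and preserves the conversation context if an interruption occurs.
Avoids interruptions: Proactive audio helps Gemini avoid being disrupted by background noise or outside chatter, and keeps Gemini from responding to outside chatter that occurs during a conversation.
Handles interruptions: If the user needs to interrupt while Gemini is responding, proactive audio makes it easier for Gemini to respond appropriately to a genuine interruption, rather than reacting as it would to filler words such as "um" or "uh".
Co-listens to audio: Gemini can listen along to an audio file (audio that isn't the speaker's voice) and answer questions about that audio file during the conversation.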

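Billing

While Gemini is listening to a conversation, you're charged for audio tokens.

For output audio tokens, you're charged only when Gemini responds. If Gemini doesn't respond or stays silent, you aren't charged for output audio tokens.

For more information, see "Vertex AI pricing".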
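The billing rule above can be sketched as simple arithmetic. The function name, the token counts, and the per-token prices below are all hypothetical placeholders chosen for illustration. They are not actual rates.

```python
def estimate_audio_cost(input_tokens, reply_output_tokens,
                        input_price_per_token, output_price_per_token):
    """Estimate cost: input is charged while listening, output only for replies."""
    input_cost = input_tokens * input_price_per_token
    # Silent turns produce no output tokens, so only actual replies are summed.
    output_cost = sum(reply_output_tokens) * output_price_per_token
    return input_cost + output_cost


# Gemini listened to 3,000 input tokens and replied once with 500 output tokens.
cost = estimate_audio_cost(
    input_tokens=3000,
    reply_output_tokens=[500],
    input_price_per_token=0.000002,   # hypothetical price
    output_price_per_token=0.00001,   # hypothetical price
)
print(f"Estimated cost: {cost:.4f}")
```

If Gemini stays silent for the whole conversation, `reply_output_tokens` is empty and only the input portion contributes to the estimate.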

What's next

To learn more about using the Gemini Live API, see the Gemini Live API documentation.
