Agents

The term agent can mean different things depending on context. In a multi-agent system, agents form a hierarchical tree, and the entire agent tree is referred to as the agent application.

An agent application is composed of one or more agents, where each agent can be either the root agent or a sub-agent.

A root agent (also known as a steering agent) acts as the primary entry point and orchestrator for the overall agent application. It typically handles the main interaction with the end-user, is responsible for understanding the overall goals, and delegates specific tasks to the appropriate sub-agents.

A sub-agent (also known as a child agent) is a more specialized agent designed to handle a specific task, domain, or capability. For example, a sub-agent could be tasked with searching a specific database or analyzing a particular type of data. Sub-agents promote modularity and reusability in your agent application.

Root agents can invoke sub-agents, and sub-agents can invoke other sub-agents.
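
The hierarchy described above can be pictured as a simple tree. The following is a minimal conceptual sketch in Python, not the product's API: the Agent class, its fields, the walk and outline helpers, and the example agent names are all hypothetical and only illustrate how a root agent relates to its sub-agents.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Agent:
    """Hypothetical node in an agent tree (illustrative only, not a product API)."""
    name: str
    description: str = ""
    sub_agents: List["Agent"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Agent"]]:
        """Yield (depth, agent) pairs in depth-first order, starting at this agent."""
        yield depth, self
        for child in self.sub_agents:
            yield from child.walk(depth + 1)


# Hypothetical agent application: one root agent with nested sub-agents.
root_agent = Agent(
    name="steering_agent",
    description="Entry point; delegates tasks to sub-agents.",
    sub_agents=[
        Agent(
            name="order_agent",
            description="Handles order questions.",
            # A sub-agent can have sub-agents of its own.
            sub_agents=[Agent(name="order_database_search")],
        ),
        Agent(name="data_analysis_agent"),
    ],
)


def outline(agent: Agent) -> List[str]:
    """Return one indented line per agent, showing the shape of the tree."""
    return [f"{'  ' * depth}{node.name}" for depth, node in agent.walk()]


if __name__ == "__main__":
    print("\n".join(outline(root_agent)))
```

In this picture, the agent application is the whole tree reachable from root_agent, and each nested list of sub_agents corresponds to one level of delegation.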

Figure: root agent and sub-agent diagram

Language support

Design your agents in English. Agents automatically detect the language of end-user input and respond in the same language. For the list of supported languages, see the languages reference.
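
As a rough illustration of this behavior, the sketch below models language matching together with the language controls described under Agent application settings (default language, additional languages, and unsupported language handling). It is an assumption-laden model, not the product's implementation: the class names, the returned action strings, and the choice to reset the retry counter after a supported input are all hypothetical, and language detection itself is assumed to have already happened.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageSettings:
    """Hypothetical mirror of the language controls (names are illustrative)."""
    default_language: str = "en"
    additional_languages: List[str] = field(default_factory=list)

    def is_supported(self, language: str) -> bool:
        return (
            language == self.default_language
            or language in self.additional_languages
        )


@dataclass
class Conversation:
    settings: LanguageSettings
    current_language: Optional[str] = None
    unsupported_attempts: int = 0

    def __post_init__(self) -> None:
        # Conversations start in the default language.
        if self.current_language is None:
            self.current_language = self.settings.default_language

    def handle_input_language(self, detected: str) -> str:
        """Return a label for what happens given the detected input language."""
        if self.settings.is_supported(detected):
            # Assumption: a supported input resets the retry counter.
            self.unsupported_attempts = 0
            self.current_language = detected
            return f"respond:{detected}"
        self.unsupported_attempts += 1
        if self.unsupported_attempts == 1:
            # First unsupported input: ask the user to repeat once.
            return "ask_to_repeat"
        # Still unsupported: run whatever action was configured.
        return "fallback_action"


if __name__ == "__main__":
    conversation = Conversation(LanguageSettings("en", ["es"]))
    for language in ["es", "xx", "xx"]:
        print(language, "->", conversation.handle_input_language(language))
```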

Create an agent application and root agent

To create an agent application and root agent:

  1. Open the Gemini Enterprise for CX console.
  2. Select your project.
  3. Click Create or Create agent.
  4. Provide an agent application name.
  5. Click Create. If this is the first agent application you have created for the project, creation may take 1-2 minutes. The agent builder is shown, and a root agent is created for you.
  6. Click the plus sign in the top right corner of the root agent.
  7. Click Add instructions to add instructions for the root agent.
  8. Click Add tool to add tools for the root agent.

Create a sub-agent

To create a sub-agent:

  1. Click the plus sign at the bottom of the root agent.
  2. Click Add sub-agent.

Manage agent applications

To manage agent applications for your project:

  1. Open the Gemini Enterprise for CX console.
  2. Select your project. The list of agent applications for your project is shown.

For each agent application, the following information and actions are available:

  • Click the agent application name to open the application in the agent builder.
  • The Deployed to column shows the number of channels the application is deployed to.
  • The Sessions column shows the number of sessions in the past 24 hours that use a deployment channel.
  • The Escalation column shows the number of escalations in the past 24 hours that use a deployment channel.
  • The latest update time for the agent application is shown.
  • You can click the context menu for a particular agent application, then select Import agent, Export agent, or Delete agent. For more information, see Export and import.

Agent application settings

To edit global agent application settings:

  1. Click the settings icon on the right side of the builder.

The following agent application settings are available:

  • Basic:
    • Interactions:
      • Global model: Default model used unless overridden by individual agents. Note that some models may be optimized for text or voice.
      • Language controls:
        • Default language: Start all conversations in this language.
        • Additional languages: If your agent application is multilingual, provide additional languages. Your agent application will automatically switch languages to match the user input.
        • Unsupported language handling: When user input is provided in an unsupported language, the agent application asks the user to repeat the input once. If the new input is also in an unsupported language, the action you select here takes place.
    • Behavior:
      • Voice: The voice used for speech synthesis.
      • Ambient sounds: Background sounds played by the agent.
      • Response length: Adjust how verbose the agent is.
      • Allow user interruptions: Allow the end-user to interrupt the agent.
      • Adapt when interrupted: When enabled, agents will try to adapt their response considering that the user may not have heard everything.
    • Agent details:
      • Display name: Display name for the agent application.
      • Lock agent: Prevent changes from being applied.
      • Notes: Human-readable description of the agent application. This is not sent to the model.
  • Advanced:
    • Speech:
      • No user input time out: How long to wait for user input before prompting the user to re-engage.
      • Ambient sound volume gain: Adjust the ambient sound volume.
      • Keypad input: Set up dual-tone multi-frequency (DTMF) for telephone calls.
    • Logging:
      • Enable conversation logging: Automatically log conversations and tracing data.
      • Enable redaction: Automatically find and remove sensitive data.
      • Enable Cloud Logging: Automatically stream logs to Cloud Logging.
      • Export logs to BigQuery: Export logs to BigQuery for custom analysis.
      • Audio recording: Output Cloud Storage bucket location for audio files.
    • Tools:
      • Execution mode: Execute tool calls in parallel or sequential order.
    • Global instruction: Instructions for all agents in the agent application. You can use these instructions to set up a stable identity or personality across agents.
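
To make the Execution mode setting above more concrete, the following sketch contrasts running tool calls in parallel with running them in sequential order. It uses Python's standard asyncio library as a stand-in; the run_tool_calls and make_tool helpers and the tool names are hypothetical and do not reflect how the product executes tools internally.

```python
import asyncio
from typing import Awaitable, Callable, List

# A hypothetical tool call: an async function that returns a result string.
ToolCall = Callable[[], Awaitable[str]]


async def run_tool_calls(calls: List[ToolCall], mode: str) -> List[str]:
    """Run tool calls in 'parallel' or 'sequential' mode and return results."""
    if mode == "parallel":
        # Start all calls together; gather returns results in input order.
        return list(await asyncio.gather(*(call() for call in calls)))
    if mode == "sequential":
        # Await each call before starting the next one.
        return [await call() for call in calls]
    raise ValueError(f"unknown execution mode: {mode}")


def make_tool(name: str, delay: float) -> ToolCall:
    """Build a fake tool that sleeps for `delay` seconds, then reports back."""
    async def tool() -> str:
        await asyncio.sleep(delay)
        return f"{name} done"
    return tool


if __name__ == "__main__":
    tools = [make_tool("lookup_order", 0.02), make_tool("check_stock", 0.01)]
    print(asyncio.run(run_tool_calls(tools, "parallel")))
    print(asyncio.run(run_tool_calls(tools, "sequential")))
```

In this sketch both modes return results in the same order. The difference is that parallel mode takes roughly as long as the slowest call, while sequential mode takes the sum of all calls but lets each call complete before the next one starts.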

Agent settings

To edit root or sub-agent specific settings:

  1. Click the context menu in the agent title box.
  2. Select Edit config.

The following root and sub-agent settings are available:

  • Agent name: Display name for the agent. Use snake case.
  • Model: The model used for the agent.
  • Description: A description of the agent. This description is provided to other agents in the agent application.
  • Custom code: Provide code for callbacks.
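
Because agent names should use snake case, it can help to check candidate names before entering them. The helper below is a small sketch under one assumption: that snake case here means lowercase letters and digits separated by single underscores, starting with a letter. The exact rules the console enforces are not stated on this page, and the function name and example agent names are hypothetical.

```python
import re

# Assumed definition of snake case: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(agent_name: str) -> bool:
    """Return True if the whole name matches the assumed snake case pattern."""
    return SNAKE_CASE.fullmatch(agent_name) is not None


if __name__ == "__main__":
    for candidate in ["order_lookup", "Order Lookup", "orderLookup", "billing_v2"]:
        print(candidate, "->", is_snake_case(candidate))
```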