Traffic splitting (A/B testing)

Traffic splitting lets you divide user conversations across different versions of your CX Agent Studio agent application within a single deployment channel. You can use traffic splitting to conduct A/B testing or safely deploy updates in stages by routing a specified percentage of traffic to each version.

This guide describes how to configure traffic splitting using the CX Agent Studio console or the CX Agent Studio REST API, simulate test traffic, and analyze version performance using BigQuery logs.

Workflow overview

  1. Create application versions: Create immutable snapshots (versions) of your agent application to compare.
  2. Configure traffic splitting on a deployment channel: Assign traffic percentages across multiple application versions using the console or REST API.
  3. Send test traffic: Interactively test your agent application in the console or initiate conversation turns using the runSession API.
  4. Analyze performance in BigQuery: Query exported conversation logs in BigQuery to evaluate metrics across versions.

Configure traffic splitting

You can configure traffic allocations across different application versions using either the console or the REST API. Each allocation must point to an existing agent application version, and the sum of all traffic percentages in the deployment must equal 100.
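The allocation rule above can be checked locally before you submit a configuration. The following is a minimal sketch, not part of CX Agent Studio: `validate_allocations` is a hypothetical helper, and it assumes whole-number percentages as used in the examples in this guide. It cannot confirm that a version actually exists; only the service can do that.

```python
from typing import Any, Dict, List


def validate_allocations(allocations: List[Dict[str, Any]]) -> None:
    """Raise ValueError if the allocations break the documented rules.

    Hypothetical local check: every entry needs an appVersion name and a
    non-negative integer trafficPercentage, and the percentages must sum to 100.
    """
    if not allocations:
        raise ValueError("at least one traffic allocation is required")
    total = 0
    for entry in allocations:
        version = entry.get("appVersion")
        percentage = entry.get("trafficPercentage")
        if not isinstance(version, str) or not version:
            raise ValueError("each allocation needs an appVersion resource name")
        # Assumption: percentages are whole numbers, as in this guide's examples.
        if isinstance(percentage, bool) or not isinstance(percentage, int) or percentage < 0:
            raise ValueError(f"invalid trafficPercentage for {version}: {percentage!r}")
        total += percentage
    if total != 100:
        raise ValueError(f"traffic percentages sum to {total}, expected 100")
```

Calling the helper with a 90/10 split returns without error, while a 90/20 split raises a ValueError that reports the actual sum.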

Using the console

To configure traffic splitting in the CX Agent Studio console:

  1. Open the CX Agent Studio console and select your agent application.
  2. Click the Deploy tab at the top of the page.
  3. Select an existing deployment channel or click New channel to create one (for example, API access).
  4. Under Agent version, click Add version and select the agent application versions you want to include in the experiment.
  5. Enter the Traffic percentage for each version (for example, 90% for Version A and 10% for Version B). Ensure the percentages total 100%.
  6. Click Create channel to apply the configuration.

Using the REST API

You can programmatically set up traffic splitting by calling the patch method on your deployment resource and defining experimentConfig.versionRelease.trafficAllocations.

Before calling the API, ensure you have the following identifiers:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION_ID: The region of your agent application (for example, us-east1).
  • APP_ID: The ID of your agent application.
  • DEPLOYMENT_ID: The ID of the deployment channel.
  • VERSION_A_UUID / VERSION_B_UUID: The unique identifiers of your application versions.

Send a PATCH request to update the deployment configuration:

curl -X PATCH \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://ces.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/deployments/DEPLOYMENT_ID?updateMask=experimentConfig" \
  -d '{
    "experimentConfig": {
      "versionRelease": {
        "trafficAllocations": [
          {
            "appVersion": "projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/versions/VERSION_A_UUID",
            "trafficPercentage": 90
          },
          {
            "appVersion": "projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/versions/VERSION_B_UUID",
            "trafficPercentage": 10
          }
        ]
      }
    }
  }'
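If you generate this request from a script, it can help to derive the URL and the JSON body from the identifiers listed earlier, so that the version resource names stay consistent. The sketch below only builds strings and sends nothing. `build_patch_request` and its parameters are hypothetical names; the endpoint, update mask, and field names are copied from the curl example above.

```python
import json
from typing import Dict, Tuple

# Base URL taken from the curl example in this guide.
API_ROOT = "https://ces.googleapis.com/v1beta"


def build_patch_request(
    project_id: str,
    location_id: str,
    app_id: str,
    deployment_id: str,
    split: Dict[str, int],
) -> Tuple[str, str]:
    """Return (url, json_body) for the deployment PATCH call.

    `split` maps a version UUID to its traffic percentage (hypothetical shape).
    """
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    app = f"projects/{project_id}/locations/{location_id}/apps/{app_id}"
    url = f"{API_ROOT}/{app}/deployments/{deployment_id}?updateMask=experimentConfig"
    body = {
        "experimentConfig": {
            "versionRelease": {
                "trafficAllocations": [
                    {
                        "appVersion": f"{app}/versions/{version_id}",
                        "trafficPercentage": percentage,
                    }
                    for version_id, percentage in split.items()
                ]
            }
        }
    }
    return url, json.dumps(body, indent=2)
```

You can pass the returned body to curl with `-d @file.json` or to any HTTP client, together with the same Authorization header used above.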

Verify deployment configuration

To verify that your traffic allocations are active:

Using the console

  1. Open the Deploy page in the CX Agent Studio console.
  2. Locate your channel in the deployments list.
  3. Verify that the Version column displays the configured split (for example, Version A (90%), Version B (10%)).

Using the REST API

Send a GET request to retrieve the deployment resource details:

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://ces.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/deployments/DEPLOYMENT_ID"

The JSON response includes the active experimentConfig, which lists the traffic percentage assigned to each version and shows "state": "RUNNING" inside versionRelease.
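To check the response in a script rather than by eye, you can extract the state and the split from the JSON. This sketch assumes the response mirrors the PATCH body, with an added "state" field inside versionRelease as described above; the exact response shape is not shown in this guide, so treat the field paths as assumptions. `summarize_release` is a hypothetical helper.

```python
import json
from typing import Dict, Tuple


def summarize_release(response_text: str) -> Tuple[str, Dict[str, int]]:
    """Return (state, {version_id: percentage}) from a deployment GET response.

    Assumes the layout experimentConfig.versionRelease.{state, trafficAllocations}.
    "UNKNOWN" is a local placeholder for a missing state, not an API value.
    """
    deployment = json.loads(response_text)
    release = deployment.get("experimentConfig", {}).get("versionRelease", {})
    state = release.get("state", "UNKNOWN")
    split = {
        # Keep only the last path segment (the version UUID) as the key.
        allocation["appVersion"].rsplit("/", 1)[-1]: allocation["trafficPercentage"]
        for allocation in release.get("trafficAllocations", [])
    }
    return state, split
```

A result such as `("RUNNING", {"VERSION_A_UUID": 90, "VERSION_B_UUID": 10})` would indicate that the split from the earlier example is active.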

Send test traffic

When traffic splitting is active on a deployment channel, incoming user sessions are randomly routed to the configured versions in proportion to their assigned traffic percentages.
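Because routing is randomized, a small number of test sessions will rarely match the configured percentages exactly. The following local simulation illustrates that effect with weighted random choice. It is only an illustration of proportional random assignment and makes no claim about how the service implements routing; `simulate_routing` is a hypothetical helper.

```python
import random
from collections import Counter
from typing import Dict


def simulate_routing(split: Dict[str, int], sessions: int, seed: int = 0) -> Counter:
    """Assign each simulated session to a version, weighted by percentage."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    versions = list(split)
    weights = [split[version] for version in versions]
    return Counter(rng.choices(versions, weights=weights, k=sessions))


if __name__ == "__main__":
    for sessions in (20, 10_000):
        counts = simulate_routing({"Version A": 90, "Version B": 10}, sessions)
        shares = {v: round(100 * n / sessions, 1) for v, n in counts.items()}
        print(sessions, shares)
```

With only a few dozen sessions the observed share can drift noticeably from 90/10, so send enough test sessions before you conclude that a split is misconfigured.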

To simulate user traffic programmatically, call the runSession method. To ensure that randomized traffic routing takes effect across multiple interactions, generate a unique SESSION_ID for each individual test session.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://ces.googleapis.com/v1beta/projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/sessions/SESSION_ID:runSession" \
  -d '{
    "config": {
      "deployment": "projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID/deployments/DEPLOYMENT_ID"
    },
    "inputs": [
      {
        "text": "What is my account balance as of today?"
      }
    ]
  }'
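To send many test sessions, you need a fresh SESSION_ID for each one. The sketch below prepares the runSession URL and body for a batch of sessions without sending them. It assumes that a random UUID in hex form is an acceptable session ID, which this guide does not confirm; `build_run_session_requests` is a hypothetical helper, and the URL and body fields are copied from the curl example above.

```python
import json
import uuid
from typing import List, Tuple

# Base URL taken from the curl example in this guide.
API_ROOT = "https://ces.googleapis.com/v1beta"


def build_run_session_requests(
    project_id: str,
    location_id: str,
    app_id: str,
    deployment_id: str,
    text: str,
    count: int,
) -> List[Tuple[str, str]]:
    """Return `count` (url, json_body) pairs, each with its own session ID."""
    app = f"projects/{project_id}/locations/{location_id}/apps/{app_id}"
    body = json.dumps(
        {
            "config": {"deployment": f"{app}/deployments/{deployment_id}"},
            "inputs": [{"text": text}],
        }
    )
    requests = []
    for _ in range(count):
        # Assumption: a UUID hex string is a valid SESSION_ID for this API.
        session_id = uuid.uuid4().hex
        requests.append((f"{API_ROOT}/{app}/sessions/{session_id}:runSession", body))
    return requests
```

Each pair can then be sent with curl or an HTTP client of your choice, using the same Authorization and Content-Type headers as in the example above.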

Analyze version performance in BigQuery

If you have enabled BigQuery export for conversation logs, interaction records are written to your BigQuery dataset. Each record includes the app_version_id of the version that handled the turn.

You can run SQL queries to evaluate and compare the responses generated by each version during your traffic split window.

  1. In the Google Cloud console, go to BigQuery.
  2. Run the following query against your interaction logs table, substituting your project details and evaluation window (START_TIME and END_TIME):

SELECT
  app_version_id,
  tool_call.name AS tool_name,
  tool_call.output AS tool_result_message,
  COUNT(1) AS count
FROM
  `PROJECT_ID.conversational_agents_logs.v1beta_logs`,
  UNNEST(json_payload.query_result.generations) AS generation,
  UNNEST(generation.tool_calls) AS tool_call
WHERE
  json_payload.resource = 'projects/PROJECT_ID/locations/LOCATION_ID/apps/APP_ID'
  AND timestamp >= TIMESTAMP('START_TIME_YYYY-MM-DD HH:MM:SS', 'TIMEZONE')
  AND timestamp <= TIMESTAMP('END_TIME_YYYY-MM-DD HH:MM:SS', 'TIMEZONE')
GROUP BY
  app_version_id, tool_name, tool_result_message
ORDER BY
  app_version_id, count DESC

  3. Evaluate metrics:
    • Identify versions: Match the app_version_id in each output row to your version UUIDs so that you can compare tool executions and responses from Version A with those from Version B.
    • Calculate success rate: For each version, compare the number of successful outcomes with the number of errors or outdated responses, then decide whether to promote the candidate version to 100% of traffic.
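The success-rate comparison can be sketched as a small aggregation over the query results. The example assumes each row is a mapping with the column names from the query above (app_version_id, tool_result_message, count). What counts as a success depends on your tools, so the caller supplies that rule; `success_rate_by_version` and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Mapping

Row = Mapping[str, Any]


def success_rate_by_version(
    rows: Iterable[Row],
    is_success: Callable[[Row], bool],
) -> Dict[str, float]:
    """Return the share of successful tool calls for each app_version_id."""
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for row in rows:
        version = str(row["app_version_id"])
        calls = int(row["count"])  # each row is already a grouped count
        totals[version] += calls
        if is_success(row):
            successes[version] += calls
    return {v: successes[v] / totals[v] for v in totals if totals[v] > 0}


# Hypothetical rows; real values come from your BigQuery results.
sample_rows = [
    {"app_version_id": "version-a", "tool_result_message": "ok", "count": 9},
    {"app_version_id": "version-a", "tool_result_message": "error", "count": 1},
    {"app_version_id": "version-b", "tool_result_message": "ok", "count": 3},
    {"app_version_id": "version-b", "tool_result_message": "error", "count": 1},
]
rates = success_rate_by_version(
    sample_rows, lambda row: row["tool_result_message"] == "ok"
)
```

Keep the sample sizes in mind when you compare rates: a version that receives 10% of traffic accumulates data more slowly, so its rate is less certain over the same window.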