This page describes the format required to upload golden evaluations in a CSV file. For details about golden evaluations, see the golden evaluations documentation.
Download the template
- Navigate to the Evaluate tab and click + Add test case -> Golden.
- In the menu that appears, click Download template.
- After you have used the template to create a CSV file containing your golden evaluations, you can upload it by clicking Upload file in the same menu.
General structure
- A single CSV file can contain multiple evaluations. Each evaluation can span multiple rows.
- The first row of an evaluation is the evaluation row and defines its overall properties (name and metadata).
- Each subsequent row is a conversation row and defines a single conversation turn in the evaluation (for example, an end-user says something, the agent is expected to reply, or a tool call is expected).
- You can start a new test case by providing a new name in the `display_name` column. Each new `display_name` value defines the start of a new evaluation.
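To make the structure above concrete, here is a minimal sketch that builds a CSV in this layout with Python's standard `csv` module. The column names come from this page; the evaluation name, turn values, and text are invented for illustration.

```python
import csv
import io

# Column names from this page; only the columns this example needs.
header = ["display_name", "turn_index", "action_type",
          "response_agent", "text_content"]

rows = [
    # Evaluation row: only display_name (plus any metadata) is filled in.
    {"display_name": "Refund flow happy path"},
    # Conversation rows: display_name stays blank.
    {"turn_index": "1", "action_type": "INPUT_TEXT",
     "text_content": "I want a refund."},
    {"turn_index": "2", "action_type": "EXPECTATION_TEXT",
     "response_agent": "Support agent",
     "text_content": "I can help with that."},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=header, restval="")
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Writing the file this way guarantees every row has the same number of columns, with unused cells left blank.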
Header row
Your CSV file must have a header row as the first line. This header defines the data variable in each column. All variables other than the required variables are optional, unless required by an `action_type` value.
Optional variable columns can be in any order after the required variables.
- Required variables: `display_name`, `turn_index`, `action_type`.
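Before uploading, you can check that a CSV's header row includes the required columns. The sketch below is a hypothetical helper (the `check_header` name is ours); the required column names are the ones listed above.

```python
import csv
import io

# Required columns from this page.
REQUIRED = ["display_name", "turn_index", "action_type"]

def check_header(csv_text: str) -> list[str]:
    """Return any required columns missing from the CSV's header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [col for col in REQUIRED if col not in header]

print(check_header("display_name,turn_index,action_type,text_content\n"))
print(check_header("turn_index,text_content\n"))
```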
Define a conversation evaluation
Each new evaluation starts at an evaluation row. Each conversation row below the evaluation row corresponds to one conversation turn, until the next evaluation row.
Evaluation row
The first line after the header row must be an evaluation row. Each evaluation row defines a new evaluation.
- Required: Enter a unique, human-readable name for the evaluation in the `display_name` field.
- Optional: You can add any metadata variable data in this row.
Conversation row
Each row corresponds to data from one conversation turn.
- Required: Enter values in the `turn_index` and `action_type` fields. `display_name` must be left blank.
- Optional: Enter values for any header columns other than metadata variables or `display_name`.
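The rule that a non-empty `display_name` starts a new evaluation and blank ones continue it can be sketched as a small parser. This is an illustrative helper (the `split_evaluations` name and sample values are ours), not part of the product.

```python
import csv
import io

def split_evaluations(csv_text: str) -> list[dict]:
    """Group rows into evaluations: a non-empty display_name starts a new one."""
    reader = csv.DictReader(io.StringIO(csv_text))
    evaluations = []
    for row in reader:
        if row.get("display_name", "").strip():
            # Evaluation row: start a new evaluation.
            evaluations.append({"name": row["display_name"], "turns": []})
        elif evaluations:
            # Conversation row: belongs to the most recent evaluation.
            evaluations[-1]["turns"].append(row)
    return evaluations

sample = (
    "display_name,turn_index,action_type,text_content\n"
    "Greeting check,,,\n"
    ",1,INPUT_TEXT,Hello\n"
    ",2,EXPECTATION_TEXT,Hi there\n"
)
evals = split_evaluations(sample)
print(len(evals), evals[0]["name"], len(evals[0]["turns"]))
```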
Variables
The following tables describe the available data variables. All variables other than the required variables are optional, unless required by an `action_type` value. All variables must be defined in the header row, one per column. Optional variable columns can be in any order after the required columns.
Required header variables

| Column name | Description |
|---|---|
| `display_name` | The human-readable name of your evaluation. Filled in only on the first row of a new evaluation. Each new `display_name` value defines a new evaluation. |
| `turn_index` | A number (1, 2, 3...) indicating the sequential order of the conversation turn. All rows in one turn share the same index value. Values must start at 1 for each evaluation, and each subsequent row must have a value equal to or greater than the row preceding it. |
| `action_type` | Specifies what this row's data represents. Must be one of the values listed below. Each value has companion variables that must also be filled in for the conversation turn to be read correctly. |

Each `action_type` value and its companion variables:

- `INPUT_TEXT`: An end-user text input.
  - (Required) `text_content`.
- `INPUT_IMAGE`: An end-user image input.
  - (Required) `image_mime_type`, `image_content`.
- `INPUT_TOOL_RESPONSE`: A tool response input.
  - (Required) `tool_name`.
  - (Optional) `tool_response_json`.
- `INPUT_UPDATED_VARIABLES`: Update variables from an input.
  - (Required) `updated_variables_json`.
- `EXPECTATION_TEXT`: An expected agent text response.
  - (Required) `response_agent`, `text_content`.
  - (Optional) `expectation_note`.
- `EXPECTATION_TOOL_CALL`: An expected tool call.
  - (Required) `tool_name`.
  - (Optional) `tool_call_args_json`, `expectation_note`.
- `EXPECTATION_TOOL_RESPONSE`: An expected tool response.
  - (Required) `tool_name`.
  - (Optional) `expectation_note`.
- `EXPECTATION_AGENT_TRANSFER`: An expected agent transfer.
  - (Required) `agent_transfer_target`.
  - (Optional) `expectation_note`.
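The per-`action_type` requirements above can be captured in a lookup table for pre-upload validation. This is a sketch under the assumption that rows are plain dicts keyed by column name; the `REQUIRED_BY_ACTION` mapping is transcribed from this page and the `missing_fields` helper is hypothetical.

```python
# Required companion columns per action_type, transcribed from this page.
REQUIRED_BY_ACTION = {
    "INPUT_TEXT": ["text_content"],
    "INPUT_IMAGE": ["image_mime_type", "image_content"],
    "INPUT_TOOL_RESPONSE": ["tool_name"],
    "INPUT_UPDATED_VARIABLES": ["updated_variables_json"],
    "EXPECTATION_TEXT": ["response_agent", "text_content"],
    "EXPECTATION_TOOL_CALL": ["tool_name"],
    "EXPECTATION_TOOL_RESPONSE": ["tool_name"],
    "EXPECTATION_AGENT_TRANSFER": ["agent_transfer_target"],
}

def missing_fields(row: dict) -> list[str]:
    """Return required companion fields left blank for the row's action_type."""
    required = REQUIRED_BY_ACTION.get(row.get("action_type", ""), [])
    return [field for field in required if not row.get(field, "").strip()]

print(missing_fields({"action_type": "EXPECTATION_TEXT", "text_content": "Hi"}))
```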
Metadata variables

| Column name | Description |
|---|---|
| `evaluation_id` | A unique ID for the evaluation. Each `evaluation_id` value must be unique to your Customer Experience Agent Studio agent. If no value is entered manually in this column, a unique ID is generated automatically. |
| `description` | Free-text notes or a description of the evaluation's purpose. |
| `tags` | Semicolon-separated tags for organizing evaluations (for example, "tag1;tag2"). |
| `evaluation_groups` | Semicolon-separated names of any evaluation groups that the evaluation belongs to (for example, "group name 1;group name 2"). Any `evaluation_groups` values entered in this column but not defined in the header will be ignored. |
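Both `tags` and `evaluation_groups` use the same semicolon-separated encoding, so a single small helper covers them. The `parse_multi` name is ours; the separator and example values come from this page.

```python
def parse_multi(value: str) -> list[str]:
    """Split a semicolon-separated metadata field (tags, evaluation_groups)."""
    return [part.strip() for part in value.split(";") if part.strip()]

print(parse_multi("tag1;tag2"))
print(parse_multi("group name 1;group name 2"))
```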
Conversation turn variables

| Column name | Description |
|---|---|
| `response_agent` | Name of the agent that provided the response. Expected only for `EXPECTATION_TEXT`. |
| `text_content` | The text for `INPUT_TEXT` or `EXPECTATION_TEXT`. |
| `image_mime_type` | The IANA standard MIME type of the source image. Supported values: `image/png`, `image/jpeg`, `image/webp`, `image/heic`, `image/heif`. |
| `image_content` | Bytes string of the `INPUT_IMAGE`. |
| `tool_name` | The `display_name` for the tool being called or responding. Expected for `INPUT_TOOL_RESPONSE`, `EXPECTATION_TOOL_CALL`, or `EXPECTATION_TOOL_RESPONSE`. |
| `tool_call_args_json` | The JSON arguments for an `EXPECTATION_TOOL_CALL`. |
| `tool_response_json` | The JSON content of an `INPUT_TOOL_RESPONSE`. |
| `updated_variables_json` | The JSON content for `INPUT_UPDATED_VARIABLES`. |
| `agent_transfer_target` | Display name of the target agent for an `EXPECTATION_AGENT_TRANSFER`. |
| `expectation_note` | Note or description of the expectation. |
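Several of these columns (`tool_call_args_json`, `tool_response_json`, `updated_variables_json`) hold JSON strings, which are easy to break when editing a CSV by hand. A minimal pre-upload check, assuming rows are dicts keyed by column name (the `invalid_json_columns` helper is ours):

```python
import json

# JSON-valued columns from the table on this page.
JSON_COLUMNS = ["tool_call_args_json", "tool_response_json",
                "updated_variables_json"]

def invalid_json_columns(row: dict) -> list[str]:
    """Return JSON-valued columns in the row whose content fails to parse."""
    bad = []
    for col in JSON_COLUMNS:
        value = row.get(col, "").strip()
        if value:
            try:
                json.loads(value)
            except ValueError:
                bad.append(col)
    return bad

print(invalid_json_columns({"tool_call_args_json": '{"order_id": 42}',
                            "tool_response_json": "not json"}))
```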