- Resource: EvaluationRun
- EvaluationRun.EvaluationType
- EvaluationRun.EvaluationRunState
- EvaluationRun.Progress
- EvaluationRun.EvaluationRunSummary
- PersonaRunConfig
- OptimizationConfig
- OptimizationConfig.OptimizationStatus
- Methods
Resource: EvaluationRun
An evaluation run represents all the evaluation results from an evaluation execution.
| JSON representation |
|---|
{ "name": string, "displayName": string, "evaluationResults": [ string ], "createTime": string, "initiatedBy": string, "appVersion": string, "appVersionDisplayName": string, "changelog": string, "evaluations": [ string ], "evaluationDataset": string, "evaluationType": enum ( |
| Fields | |
|---|---|
name |
Identifier. The unique identifier of the evaluation run. Format: |
displayName |
Optional. User-defined display name of the evaluation run. default: " |
evaluationResults[] |
Output only. The evaluation results that are part of this run. Format: |
createTime |
Output only. Timestamp when the evaluation run was created. Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: |
initiatedBy |
Output only. The user who initiated the evaluation run. |
appVersion |
Output only. The app version to evaluate. Format: |
appVersionDisplayName |
Output only. The display name of the |
changelog |
Output only. The changelog of the app version that the evaluation ran against. This is populated if the user runs the evaluation on latest/draft. |
evaluations[] |
Output only. The evaluations that are part of this run. The list must contain evaluations of the same type, either all golden or all scenario. This field is mutually exclusive with |
evaluationDataset |
Output only. The evaluation dataset that this run is associated with. This field is mutually exclusive with |
evaluationType |
Output only. The type of the evaluations in this run. |
state |
Output only. The state of the evaluation run. |
progress |
Output only. The progress of the evaluation run. |
config |
Output only. The configuration used in the run. |
error |
Output only. Deprecated: Use errorInfo instead. Errors encountered during execution. |
errorInfo |
Output only. Error information for the evaluation run. |
evaluationRunSummaries |
Output only. Map of evaluation name to EvaluationRunSummary. An object containing a list of |
runCount |
Output only. The number of times the evaluations inside the run were run. |
personaRunConfigs[] |
Output only. The configuration to use for the run per persona. |
optimizationConfig |
Optional. Configuration for running the optimization step after the evaluation run. If not set, the optimization step will not be run. |
scheduledEvaluationRun |
Output only. The scheduled evaluation run resource name that created this evaluation run. This field is only set if the evaluation run was created by a scheduled evaluation run. Format: |
goldenRunMethod |
Output only. The method used to run the evaluation. |
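Two of the field notes above lend themselves to a client-side check: `createTime` is an RFC 3339 timestamp with up to nine fractional digits, and `evaluations` and `evaluationDataset` are each described as mutually exclusive with another field. The following is a minimal Python sketch, not an official client. It assumes the resource has already been decoded into a plain `dict` and that the two fields exclude each other. `parse_timestamp` and `run_source` are hypothetical helper names, and the sample timestamp is illustrative only.

```python
import re
from datetime import datetime
from typing import Optional

# Date-time, optional 1-9 fractional digits, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, truncating nanoseconds to microseconds."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # datetime stores microseconds, so pad or cut the fraction to 6 digits.
    micros = (fraction or "").ljust(6, "0")[:6]
    if offset == "Z":
        offset = "+00:00"
    return datetime.fromisoformat(f"{base}.{micros}{offset}")


def run_source(run: dict) -> Optional[str]:
    """Return which of the two fields is populated, or None if neither is."""
    has_evaluations = bool(run.get("evaluations"))
    has_dataset = bool(run.get("evaluationDataset"))
    if has_evaluations and has_dataset:
        # Assumption: the two fields are mutually exclusive with each other.
        raise ValueError("evaluations and evaluationDataset are both set")
    if has_evaluations:
        return "evaluations"
    if has_dataset:
        return "evaluationDataset"
    return None


created = parse_timestamp("2014-10-02T15:01:23.045123456Z")
print(created.isoformat())  # 2014-10-02T15:01:23.045123+00:00
```

Truncating to microseconds loses the last three digits of a nanosecond-precision value; a caller that needs the full precision would have to keep the original string alongside the parsed value.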
EvaluationRun.EvaluationType
The type of the evaluations in this run.
| Enums | |
|---|---|
EVALUATION_TYPE_UNSPECIFIED |
Evaluation type is not specified. |
GOLDEN |
Golden evaluation. |
SCENARIO |
Scenario evaluation. |
EvaluationRun.EvaluationRunState
The state of the evaluation run.
| Enums | |
|---|---|
EVALUATION_RUN_STATE_UNSPECIFIED |
Evaluation run state is not specified. |
RUNNING |
Evaluation run is running. |
COMPLETED |
Evaluation run has completed. |
ERROR |
The evaluation run has an error. |
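A client that waits for a run to finish needs to decide which states end the wait. The sketch below models the enum in Python and treats `COMPLETED` and `ERROR` as terminal. That grouping is an assumption drawn from the descriptions above, and `parse_state` and `is_finished` are hypothetical helper names.

```python
from enum import Enum


class EvaluationRunState(str, Enum):
    EVALUATION_RUN_STATE_UNSPECIFIED = "EVALUATION_RUN_STATE_UNSPECIFIED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"


# Assumption: a run does not leave COMPLETED or ERROR once it reaches them.
TERMINAL_STATES = frozenset(
    {EvaluationRunState.COMPLETED, EvaluationRunState.ERROR}
)


def parse_state(run: dict) -> EvaluationRunState:
    """Read the `state` field, mapping missing or unknown values to UNSPECIFIED."""
    raw = run.get("state", "EVALUATION_RUN_STATE_UNSPECIFIED")
    try:
        return EvaluationRunState(raw)
    except ValueError:
        return EvaluationRunState.EVALUATION_RUN_STATE_UNSPECIFIED


def is_finished(run: dict) -> bool:
    """True when the run is in a state treated here as terminal."""
    return parse_state(run) in TERMINAL_STATES
```

Mapping unknown strings to the unspecified value keeps the helper tolerant of enum values added later, at the cost of hiding them from the caller.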
EvaluationRun.Progress
The progress of the evaluation run.
| JSON representation |
|---|
{ "totalCount": integer, "failedCount": integer, "errorCount": integer, "completedCount": integer, "passedCount": integer } |
| Fields | |
|---|---|
totalCount |
Output only. Total number of evaluation results in this run. |
failedCount |
Output only. Number of completed evaluation results with an outcome of FAIL. (EvaluationResult.execution_state is COMPLETED and EvaluationResult.evaluation_status is FAIL). |
errorCount |
Output only. Number of evaluation results that failed to execute. (EvaluationResult.execution_state is ERROR). |
completedCount |
Output only. Number of evaluation results that finished successfully. (EvaluationResult.execution_state is COMPLETED). |
passedCount |
Output only. Number of completed evaluation results with an outcome of PASS. (EvaluationResult.execution_state is COMPLETED and EvaluationResult.evaluation_status is PASS). |
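The counters above can be combined into derived figures such as a pass rate. The following Python sketch is one way to do that, under two assumptions that the field descriptions do not state outright: counters missing from the JSON are treated as 0, and any result that is neither `COMPLETED` nor `ERROR` is counted as still pending. The `Progress` class and its property names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Progress:
    total_count: int = 0
    failed_count: int = 0
    error_count: int = 0
    completed_count: int = 0
    passed_count: int = 0

    @classmethod
    def from_json(cls, data: dict) -> "Progress":
        # Assumption: a counter absent from the JSON means zero.
        return cls(
            total_count=data.get("totalCount", 0),
            failed_count=data.get("failedCount", 0),
            error_count=data.get("errorCount", 0),
            completed_count=data.get("completedCount", 0),
            passed_count=data.get("passedCount", 0),
        )

    @property
    def pending_count(self) -> int:
        # Assumption: results that are neither COMPLETED nor ERROR are pending.
        return max(self.total_count - self.completed_count - self.error_count, 0)

    @property
    def pass_rate(self) -> Optional[float]:
        """Share of completed results that passed, or None if none completed."""
        if self.completed_count == 0:
            return None
        return self.passed_count / self.completed_count


progress = Progress.from_json(
    {"totalCount": 10, "completedCount": 6, "passedCount": 5,
     "failedCount": 1, "errorCount": 1}
)
print(progress.pending_count)  # 3
```

The pass rate is computed over completed results only, so results that failed to execute (`errorCount`) do not lower it; a stricter metric could divide by `totalCount` instead.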
EvaluationRun.EvaluationRunSummary
Contains the summary of passed and failed result counts for a specific evaluation in an evaluation run.
| JSON representation |
|---|
{ "passedCount": integer, "failedCount": integer, "errorCount": integer } |
| Fields | |
|---|---|
passedCount |
Output only. Number of passed results for the associated Evaluation in this run. |
failedCount |
Output only. Number of failed results for the associated Evaluation in this run. |
errorCount |
Output only. Number of error results for the associated Evaluation in this run. |
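Because `evaluationRunSummaries` on the run is a map from evaluation name to one of these summaries, per-evaluation counts can be rolled up into run-level totals. A minimal Python sketch follows, assuming the run is a decoded `dict` and that absent counters mean 0. The helper names and the map keys `eval-a` and `eval-b` are placeholders, not real resource names.

```python
from typing import Dict, List


def total_summary(run: dict) -> Dict[str, int]:
    """Sum passed, failed, and error counts across all evaluations in the run."""
    totals = {"passedCount": 0, "failedCount": 0, "errorCount": 0}
    for summary in run.get("evaluationRunSummaries", {}).values():
        for key in totals:
            # Assumption: a counter absent from the JSON means zero.
            totals[key] += summary.get(key, 0)
    return totals


def evaluations_with_problems(run: dict) -> List[str]:
    """Names of evaluations that have at least one failed or error result."""
    summaries = run.get("evaluationRunSummaries", {})
    return sorted(
        name
        for name, summary in summaries.items()
        if summary.get("failedCount", 0) or summary.get("errorCount", 0)
    )


run = {
    "evaluationRunSummaries": {
        "eval-a": {"passedCount": 3},
        "eval-b": {"passedCount": 1, "failedCount": 2, "errorCount": 1},
    }
}
print(total_summary(run))
print(evaluations_with_problems(run))
```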
PersonaRunConfig
Configuration for running an evaluation for a specific persona.
| JSON representation |
|---|
{ "persona": string, "taskCount": integer } |
| Fields | |
|---|---|
persona |
Optional. The persona to use for the evaluation. Format: |
taskCount |
Optional. The number of tasks to run for the persona. |
OptimizationConfig
Configuration for running the optimization step after the evaluation run.
| JSON representation |
|---|
{
"generateLossReport": boolean,
"assistantSessionName": string,
"reportSummary": string,
"shouldSuggestFix": boolean,
"status": enum ( |
| Fields | |
|---|---|
generateLossReport |
Optional. Whether to generate a loss report. |
assistantSessionName |
Output only. The assistant session to use for the optimization based on this evaluation run. Format: |
reportSummary |
Output only. The summary of the loss report. |
shouldSuggestFix |
Output only. Whether to suggest a fix for the losses. |
status |
Output only. The status of the optimization run. |
errorMessage |
Output only. The error message if the optimization run failed. |
OptimizationConfig.OptimizationStatus
The status of the optimization run.
| Enums | |
|---|---|
OPTIMIZATION_STATUS_UNSPECIFIED |
Optimization status is not specified. |
RUNNING |
Optimization is running. |
COMPLETED |
Optimization has completed. |
ERROR |
Optimization failed due to an internal error. |
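The status values above determine which of the other `OptimizationConfig` fields are worth reading: `reportSummary` after completion, `errorMessage` after a failure. The Python sketch below turns that into a one-line description. It assumes the run is a decoded `dict`; `describe_optimization` is a hypothetical helper, and the returned strings are illustrative wording rather than API output.

```python
def describe_optimization(run: dict) -> str:
    """Summarize the optimization step of an evaluation run in one line."""
    config = run.get("optimizationConfig")
    if config is None:
        # Per the field description, no config means the step is not run.
        return "optimization not requested"
    status = config.get("status", "OPTIMIZATION_STATUS_UNSPECIFIED")
    if status == "RUNNING":
        return "optimization in progress"
    if status == "COMPLETED":
        return config.get("reportSummary") or "optimization completed (no summary)"
    if status == "ERROR":
        return "optimization failed: " + config.get("errorMessage", "unknown error")
    return "optimization status unknown"


print(describe_optimization({"optimizationConfig": {"status": "RUNNING"}}))
```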
Methods
| Methods | |
|---|---|
|
Deletes an evaluation run. |
|
Gets details of the specified evaluation run. |
|
Lists all evaluation runs in the given app. |