MCP Tools Reference: ces.googleapis.com

Tool: list_evaluation_datasets

Lists evaluation datasets.

The following sample demonstrates how to use curl to invoke the list_evaluation_datasets MCP tool.

Curl Request
                  
curl --location 'https://ces.[REGION].rep.googleapis.com/mcp' \
--header 'content-type: application/json' \
--header 'accept: application/json, text/event-stream' \
--data '{
  "method": "tools/call",
  "params": {
    "name": "list_evaluation_datasets",
    "arguments": {
      "parent": "projects/[PROJECT]/locations/[LOCATION]/apps/[APP]"
    }
  },
  "jsonrpc": "2.0",
  "id": 1
}'
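
The same request can be assembled programmatically. The following is a minimal Python sketch using only the standard library; it builds the JSON-RPC envelope and the HTTP request object but does not send anything. The helper names, the region "us", and the parent value are hypothetical placeholders. The curl sample shows no credentials, so any authentication header a real call needs is left out here as well.

```python
import json
import urllib.request


def build_tool_call(name, arguments, request_id=1):
    """Return the JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def build_http_request(region, payload):
    """Wrap the payload in a POST request. Nothing is sent until urlopen() is called."""
    return urllib.request.Request(
        url=f"https://ces.{region}.rep.googleapis.com/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "accept": "application/json, text/event-stream",
        },
        method="POST",
    )


payload = build_tool_call(
    "list_evaluation_datasets",
    # Hypothetical resource name; substitute your own project, location, and app.
    {"parent": "projects/my-project/locations/us/apps/my-app"},
)
request = build_http_request("us", payload)
```

Passing `request` to `urllib.request.urlopen` would perform the call; the response may arrive as JSON or as an event stream, as the accept header indicates.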
                

Input Schema

Request message for EvaluationService.ListEvaluationDatasets.

ListEvaluationDatasetsRequest

JSON representation
{
  "parent": string,
  "pageSize": integer,
  "pageToken": string,
  "filter": string,
  "orderBy": string
}
Fields
parent

string

Required. The resource name of the app to list evaluation datasets from.

pageSize

integer

Optional. Requested page size. Server may return fewer items than requested. If unspecified, server will pick an appropriate default.

pageToken

string

Optional. The next_page_token value returned from a previous EvaluationService.ListEvaluationDatasets call.

filter

string

Optional. Filter to be applied when listing the evaluation datasets. See https://google.aip.dev/160 for more details.

orderBy

string

Optional. Field to sort by. Only "name", "create_time", and "update_time" are supported. Time fields are ordered in descending order, and the name field is ordered in ascending order. If not included, "update_time" will be the default. See https://google.aip.dev/132#ordering for more details.
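
As an illustration of the fields above, here is a small Python sketch that assembles the arguments object and rejects unsupported orderBy values before a request is made. The function name and the client-side check are assumptions, not part of the API; the server remains the authority on what it accepts, and the filter string is passed through untouched.

```python
SUPPORTED_ORDER_BY = {"name", "create_time", "update_time"}


def build_list_arguments(parent, page_size=None, page_token=None,
                         filter_expr=None, order_by=None):
    """Build the arguments object for list_evaluation_datasets.

    Optional fields are omitted when not provided so the server applies
    its own defaults.
    """
    if not parent:
        raise ValueError("parent is required")
    if order_by is not None and order_by not in SUPPORTED_ORDER_BY:
        raise ValueError(f"unsupported orderBy value: {order_by!r}")

    arguments = {"parent": parent}
    optional = {
        "pageSize": page_size,
        "pageToken": page_token,
        "filter": filter_expr,
        "orderBy": order_by,
    }
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments


# Hypothetical parent; only the fields that were set appear in the result.
example_arguments = build_list_arguments(
    "projects/my-project/locations/us/apps/my-app",
    page_size=10,
    order_by="name",
)
```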

Output Schema

Response message for EvaluationService.ListEvaluationDatasets.

ListEvaluationDatasetsResponse

JSON representation
{
  "evaluationDatasets": [
    {
      object (EvaluationDataset)
    }
  ],
  "nextPageToken": string
}
Fields
evaluationDatasets[]

object (EvaluationDataset)

The list of evaluation datasets.

nextPageToken

string

A token that can be sent as ListEvaluationDatasetsRequest.page_token to retrieve the next page. Absence of this field indicates there are no subsequent pages.
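
The pageToken and nextPageToken fields together form a pagination loop. The sketch below shows one way to walk all pages in Python. It assumes a caller-supplied `call_tool` function (hypothetical) that takes the arguments object and returns the decoded ListEvaluationDatasetsResponse as a dict; how that dict is extracted from the MCP result envelope is not covered here.

```python
def iter_evaluation_datasets(call_tool, parent, page_size=None):
    """Yield every evaluation dataset under parent, following nextPageToken."""
    page_token = None
    while True:
        arguments = {"parent": parent}
        if page_size is not None:
            arguments["pageSize"] = page_size
        if page_token:
            arguments["pageToken"] = page_token

        response = call_tool(arguments)
        yield from response.get("evaluationDatasets", [])

        # An absent (or empty) nextPageToken means there are no more pages.
        page_token = response.get("nextPageToken")
        if not page_token:
            return


# Stand-in for a real transport: two canned pages keyed by the incoming token.
_FAKE_PAGES = {
    None: {"evaluationDatasets": [{"displayName": "a"}], "nextPageToken": "t1"},
    "t1": {"evaluationDatasets": [{"displayName": "b"}]},
}


def fake_call_tool(arguments):
    return _FAKE_PAGES[arguments.get("pageToken")]


names = [
    dataset["displayName"]
    for dataset in iter_evaluation_datasets(fake_call_tool, "projects/p/locations/l/apps/a")
]
```

Injecting the transport as a function keeps the paging logic independent of how the HTTP call and authentication are handled.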

EvaluationDataset

JSON representation
{
  "name": string,
  "displayName": string,
  "evaluations": [
    string
  ],
  "createTime": string,
  "updateTime": string,
  "etag": string,
  "createdBy": string,
  "lastUpdatedBy": string,
  "aggregatedMetrics": {
    object (AggregatedMetrics)
  }
}
Fields
name

string

Identifier. The unique identifier of this evaluation dataset. Format: projects/{project}/locations/{location}/apps/{app}/evaluationDatasets/{evaluationDataset}

displayName

string

Required. User-defined display name of the evaluation dataset. Unique within an App.

evaluations[]

string

Optional. Evaluations that are included in this dataset.

createTime

string (Timestamp format)

Output only. Timestamp when the evaluation dataset was created.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

updateTime

string (Timestamp format)

Output only. Timestamp when the evaluation dataset was last updated.

Uses RFC 3339, where generated output will always be Z-normalized and use 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".

etag

string

Output only. Etag used to ensure the object hasn't changed during a read-modify-write operation. If the etag is empty, the update will overwrite any concurrent changes.

createdBy

string

Output only. The user who created the evaluation dataset.

lastUpdatedBy

string

Output only. The user who last updated the evaluation dataset.

aggregatedMetrics

object (AggregatedMetrics)

Output only. The aggregated metrics for this evaluation dataset across all runs.
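
The createTime and updateTime strings above can be converted to timezone-aware values on the client. This is a minimal Python sketch, assuming microsecond precision is enough: Python's datetime cannot hold nanoseconds, so fractional digits beyond the sixth are truncated. The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(value):
    """Parse an RFC 3339 timestamp string into a UTC datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # Pad or truncate the fraction to exactly six digits (microseconds).
    micros = (fraction or "").ljust(6, "0")[:6]
    if offset == "Z":
        offset = "+00:00"
    parsed = datetime.fromisoformat(f"{base}.{micros}{offset}")
    return parsed.astimezone(timezone.utc)
```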

Timestamp

JSON representation
{
  "seconds": string,
  "nanos": integer
}
Fields
seconds

string (int64 format)

Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. Must be between -62135596800 and 253402300799 inclusive (which corresponds to 0001-01-01T00:00:00Z to 9999-12-31T23:59:59Z).

nanos

integer

Non-negative fractions of a second at nanosecond resolution. This field is the nanosecond portion of the timestamp, not an alternative to seconds. Negative second values with fractions must still have non-negative nanos values that count forward in time. Must be between 0 and 999,999,999 inclusive.
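
To illustrate how seconds and nanos combine, the following Python sketch converts the structured form into a UTC datetime and enforces the documented ranges. The function name is hypothetical, and nanos are truncated to microseconds because that is the finest resolution datetime supports.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_from_parts(parts):
    """Convert {"seconds": "<int64 as string>", "nanos": <int>} to a UTC datetime."""
    seconds = int(parts.get("seconds", "0"))
    nanos = int(parts.get("nanos", 0))
    if not -62135596800 <= seconds <= 253402300799:
        raise ValueError("seconds out of range")
    if not 0 <= nanos <= 999_999_999:
        raise ValueError("nanos must be between 0 and 999,999,999")
    # nanos always count forward in time, even when seconds is negative.
    return EPOCH + timedelta(seconds=seconds, microseconds=nanos // 1000)
```

For example, `{"seconds": "-1", "nanos": 500000000}` denotes half a second before the epoch, not one and a half seconds before it.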

AggregatedMetrics

JSON representation
{
  "metricsByAppVersion": [
    {
      object (MetricsByAppVersion)
    }
  ]
}
Fields
metricsByAppVersion[]

object (MetricsByAppVersion)

Output only. Aggregated metrics, grouped by app version ID.

MetricsByAppVersion

JSON representation
{
  "appVersionId": string,
  "toolMetrics": [
    {
      object (ToolMetrics)
    }
  ],
  "semanticSimilarityMetrics": [
    {
      object (SemanticSimilarityMetrics)
    }
  ],
  "hallucinationMetrics": [
    {
      object (HallucinationMetrics)
    }
  ],
  "toolCallLatencyMetrics": [
    {
      object (ToolCallLatencyMetrics)
    }
  ],
  "turnLatencyMetrics": [
    {
      object (TurnLatencyMetrics)
    }
  ],
  "passCount": integer,
  "failCount": integer,
  "metricsByTurn": [
    {
      object (MetricsByTurn)
    }
  ]
}
Fields
appVersionId

string

Output only. The app version ID.

toolMetrics[]

object (ToolMetrics)

Output only. Metrics for each tool within this app version.

semanticSimilarityMetrics[]

object (SemanticSimilarityMetrics)

Output only. Metrics for semantic similarity within this app version.

hallucinationMetrics[]

object (HallucinationMetrics)

Output only. Metrics for hallucination within this app version.

toolCallLatencyMetrics[]

object (ToolCallLatencyMetrics)

Output only. Metrics for tool call latency within this app version.

turnLatencyMetrics[]

object (TurnLatencyMetrics)

Output only. Metrics for turn latency within this app version.

passCount

integer

Output only. The number of times the evaluation passed.

failCount

integer

Output only. The number of times the evaluation failed.

metricsByTurn[]

object (MetricsByTurn)

Output only. Metrics aggregated per turn within this app version.
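
One common use of passCount and failCount is comparing app versions. The Python sketch below derives a pass rate per appVersionId. The pass rate is a client-side calculation and an assumption of this example, not a field the API returns; missing counts are treated as 0 on the assumption that zero-valued fields may be omitted from the JSON.

```python
def pass_rates_by_app_version(aggregated_metrics):
    """Map each appVersionId to passCount / (passCount + failCount).

    Versions with no recorded runs map to None instead of dividing by zero.
    """
    rates = {}
    for entry in aggregated_metrics.get("metricsByAppVersion", []):
        passed = entry.get("passCount", 0)
        failed = entry.get("failCount", 0)
        total = passed + failed
        rates[entry.get("appVersionId", "")] = passed / total if total else None
    return rates


# Illustrative data only; the version IDs are made up.
sample_metrics = {
    "metricsByAppVersion": [
        {"appVersionId": "v1", "passCount": 3, "failCount": 1},
        {"appVersionId": "v2"},
    ]
}
rates = pass_rates_by_app_version(sample_metrics)
```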

ToolMetrics

JSON representation
{
  "tool": string,
  "passCount": integer,
  "failCount": integer
}
Fields
tool

string

Output only. The name of the tool.

passCount

integer

Output only. The number of times the tool passed.

failCount

integer

Output only. The number of times the tool failed.
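
As a sketch of how a toolMetrics list might be consumed, the following Python function ranks tools by failure count so the least reliable ones surface first. The function name and the sample tool names are hypothetical.

```python
def tools_by_failures(tool_metrics):
    """Return (tool, failCount) pairs for tools with failures, worst first.

    Ties are broken alphabetically by tool name for a stable ordering.
    """
    failing = [m for m in tool_metrics if m.get("failCount", 0) > 0]
    ranked = sorted(failing, key=lambda m: (-m["failCount"], m.get("tool", "")))
    return [(m.get("tool", ""), m["failCount"]) for m in ranked]


# Illustrative data only.
ranking = tools_by_failures([
    {"tool": "search", "passCount": 8, "failCount": 2},
    {"tool": "lookup", "passCount": 5},
    {"tool": "book", "passCount": 1, "failCount": 4},
])
```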

SemanticSimilarityMetrics

JSON representation
{
  "score": number
}
Fields
score

number

Output only. The average semantic similarity score (0 to 4).

HallucinationMetrics

JSON representation
{
  "score": number
}
Fields
score

number

Output only. The average hallucination score (0 to 1).

ToolCallLatencyMetrics

JSON representation
{
  "tool": string,
  "averageLatency": string
}
Fields
tool

string

Output only. The name of the tool.

averageLatency

string (Duration format)

Output only. The average latency of the tool calls.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
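
The averageLatency string can be turned into a numeric value for comparison or charting. This is a minimal Python sketch with a hypothetical function name; it uses Decimal to avoid floating-point rounding and truncates to microseconds, the finest resolution timedelta supports.

```python
import re
from datetime import timedelta
from decimal import Decimal

_DURATION = re.compile(r"^-?\d+(?:\.\d{1,9})?s$")


def parse_duration(value):
    """Parse a Duration string such as "3.5s" into a timedelta."""
    if _DURATION.match(value) is None:
        raise ValueError(f"not a Duration string: {value!r}")
    seconds = Decimal(value[:-1])  # strip the trailing "s"
    return timedelta(microseconds=int(seconds * 1_000_000))
```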

Duration

JSON representation
{
  "seconds": string,
  "nanos": integer
}
Fields
seconds

string (int64 format)

Signed seconds of the span of time. Must be from -315,576,000,000 to +315,576,000,000 inclusive. Note: these bounds are computed from 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years.

nanos

integer

Signed fractions of a second at nanosecond resolution of the span of time. Durations less than one second are represented with a 0 seconds field and a positive or negative nanos field. For durations of one second or more, a non-zero value for the nanos field must be of the same sign as the seconds field. Must be from -999,999,999 to +999,999,999 inclusive.
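
The sign and range rules above can be checked mechanically. The Python sketch below converts the structured form to a timedelta and rejects values that break those rules; the function name is hypothetical, and nanos are truncated toward zero to microsecond resolution.

```python
from datetime import timedelta

MAX_DURATION_SECONDS = 315_576_000_000


def duration_from_parts(parts):
    """Convert {"seconds": "<int64 as string>", "nanos": <int>} to a timedelta."""
    seconds = int(parts.get("seconds", "0"))
    nanos = int(parts.get("nanos", 0))
    if abs(seconds) > MAX_DURATION_SECONDS:
        raise ValueError("seconds out of range")
    if abs(nanos) > 999_999_999:
        raise ValueError("nanos out of range")
    # When both are non-zero they must carry the same sign.
    if seconds and nanos and (seconds < 0) != (nanos < 0):
        raise ValueError("seconds and nanos must have the same sign")
    micros = abs(nanos) // 1000
    if nanos < 0:
        micros = -micros
    return timedelta(seconds=seconds, microseconds=micros)
```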

TurnLatencyMetrics

JSON representation
{
  "averageLatency": string
}
Fields
averageLatency

string (Duration format)

Output only. The average latency of the turns.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".

MetricsByTurn

JSON representation
{
  "turnIndex": integer,
  "toolMetrics": [
    {
      object (ToolMetrics)
    }
  ],
  "semanticSimilarityMetrics": [
    {
      object (SemanticSimilarityMetrics)
    }
  ],
  "hallucinationMetrics": [
    {
      object (HallucinationMetrics)
    }
  ],
  "toolCallLatencyMetrics": [
    {
      object (ToolCallLatencyMetrics)
    }
  ],
  "turnLatencyMetrics": [
    {
      object (TurnLatencyMetrics)
    }
  ]
}
Fields
turnIndex

integer

Output only. The turn index (0-based).

toolMetrics[]

object (ToolMetrics)

Output only. Metrics for each tool within this turn.

semanticSimilarityMetrics[]

object (SemanticSimilarityMetrics)

Output only. Metrics for semantic similarity within this turn.

hallucinationMetrics[]

object (HallucinationMetrics)

Output only. Metrics for hallucination within this turn.

toolCallLatencyMetrics[]

object (ToolCallLatencyMetrics)

Output only. Metrics for tool call latency within this turn.

turnLatencyMetrics[]

object (TurnLatencyMetrics)

Output only. Metrics for turn latency within this turn.
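
For a per-turn view, the sketch below collects the turn latency values keyed by turnIndex, in seconds. It is an illustrative Python helper with a hypothetical name. A missing turnIndex is read as 0 on the assumption that zero-valued fields may be omitted from the JSON, and float conversion is used for simplicity, which is adequate for display but not for exact arithmetic.

```python
def turn_latencies_seconds(metrics_by_turn):
    """Map each turnIndex to the list of its average turn latencies in seconds."""
    latencies = {}
    for turn in metrics_by_turn:
        index = turn.get("turnIndex", 0)
        latencies[index] = [
            float(metric["averageLatency"].rstrip("s"))
            for metric in turn.get("turnLatencyMetrics", [])
            if "averageLatency" in metric
        ]
    return latencies


# Illustrative data only.
per_turn = turn_latencies_seconds([
    {"turnLatencyMetrics": [{"averageLatency": "1.25s"}]},
    {"turnIndex": 1, "turnLatencyMetrics": [{"averageLatency": "3.5s"}]},
    {"turnIndex": 2},
])
```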

Tool Annotations

Destructive Hint: ❌ | Idempotent Hint: ✅ | Read Only Hint: ✅ | Open World Hint: ❌
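
A client might use these hints to decide how cautiously to treat a tool. The Python sketch below encodes the values above using the annotation field names from the MCP specification (destructiveHint, idempotentHint, readOnlyHint, openWorldHint) and applies one possible policy. The policy functions are assumptions of this example; the hints are advisory and do not guarantee server behavior.

```python
# The hints listed above for list_evaluation_datasets.
LIST_EVALUATION_DATASETS_ANNOTATIONS = {
    "destructiveHint": False,
    "idempotentHint": True,
    "readOnlyHint": True,
    "openWorldHint": False,
}


def needs_confirmation(annotations):
    """Ask the user first unless the tool is read-only or marked non-destructive."""
    if annotations.get("readOnlyHint", False):
        return False
    return annotations.get("destructiveHint", True)


def safe_to_retry(annotations):
    """Retry automatically only for read-only or idempotent tools."""
    return annotations.get("readOnlyHint", False) or annotations.get("idempotentHint", False)
```

Under this policy, list_evaluation_datasets needs no confirmation and can be retried after a transient failure.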