GKE Recommender v1 API - Class PerformanceStats (1.0.0-beta01)

public sealed class PerformanceStats : IMessage<PerformanceStats>, IEquatable<PerformanceStats>, IDeepCloneable<PerformanceStats>, IBufferMessage, IMessage

Reference documentation and code samples for the GKE Recommender v1 API class PerformanceStats.

Performance statistics for a model deployment.

Inheritance

object > PerformanceStats

Namespace

Google.Cloud.GkeRecommender.V1

Assembly

Google.Cloud.GkeRecommender.V1.dll

Constructors

PerformanceStats()

public PerformanceStats()

PerformanceStats(PerformanceStats)

public PerformanceStats(PerformanceStats other)

Parameter
Name Description
other PerformanceStats
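
The copy constructor can be sketched as follows, assuming the Google.Cloud.GkeRecommender.V1 package is referenced. The class and method names (`PerformanceStatsCopyExample`, `CopyAndModify`) and the numeric values are hypothetical. The fields are output only, so they are set here purely to make the copy observable.

```csharp
using Google.Cloud.GkeRecommender.V1;

public static class PerformanceStatsCopyExample
{
    // Hypothetical helper: builds a message, copies it with the copy
    // constructor, then changes the copy so the two can be compared.
    public static (PerformanceStats Original, PerformanceStats Copy) CopyAndModify()
    {
        // Arbitrary illustrative values, not real measurements.
        var original = new PerformanceStats
        {
            TtftMilliseconds = 120,
            NtpotMilliseconds = 35,
        };

        // The copy starts with the same field values as the original.
        var copy = new PerformanceStats(original);

        // Changing the copy does not affect the original instance.
        copy.TtftMilliseconds = 200;

        return (original, copy);
    }
}
```

After the call, the two instances compare as unequal through `IEquatable<PerformanceStats>` because their `TtftMilliseconds` values differ.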

Properties

Cost

public RepeatedField<Cost> Cost { get; }

Output only. The cost of running the model deployment.

Property Value
Type Description
RepeatedField<Cost>

NtpotMilliseconds

public int NtpotMilliseconds { get; set; }

Output only. The Normalized Time Per Output Token (NTPOT) in milliseconds. This is the request latency normalized by the number of output tokens, measured as request_latency / total_output_tokens.

Property Value
Type Description
int
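
Because NTPOT is defined as request_latency / total_output_tokens, multiplying it by an expected output length gives a rough request latency. The sketch below illustrates that arithmetic only. The helper name `EstimateRequestLatencyMs` is hypothetical and not part of the library, and the result is an estimate because the reported value is an integer number of milliseconds.

```csharp
using System;

public static class NtpotExample
{
    // Hypothetical helper: inverts NTPOT = request_latency / total_output_tokens
    // to approximate the latency of a request that produces outputTokens tokens.
    public static long EstimateRequestLatencyMs(int ntpotMilliseconds, int outputTokens)
    {
        if (outputTokens < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(outputTokens));
        }

        // Widen to long before multiplying to avoid int overflow.
        return (long)ntpotMilliseconds * outputTokens;
    }
}
```

For example, an NTPOT of 35 ms and a 200-token response give an estimate of 35 × 200 = 7000 ms.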

OutputTokensPerSecond

public int OutputTokensPerSecond { get; set; }

Output only. The number of output tokens per second. This is the throughput measured as total_output_tokens_generated_by_server / elapsed_time_in_seconds.

Property Value
Type Description
int

QueriesPerSecond

public float QueriesPerSecond { get; set; }

Output only. The number of queries per second. Note: This metric can vary widely based on context length and may not be a reliable measure of LLM throughput.

Property Value
Type Description
float

TtftMilliseconds

public int TtftMilliseconds { get; set; }

Output only. The Time To First Token (TTFT) in milliseconds. This is the time it takes to generate the first token for a request.

Property Value
Type Description
int
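
The properties above can be read together to produce a one-line summary of a deployment's performance. This is a minimal sketch, assuming a `PerformanceStats` instance has already been obtained. The `PerformanceStatsFormatter` class and its `Describe` method are hypothetical and not part of the library.

```csharp
using System.Globalization;
using Google.Cloud.GkeRecommender.V1;

public static class PerformanceStatsFormatter
{
    // Hypothetical helper: formats the documented properties into a single line.
    public static string Describe(PerformanceStats stats)
    {
        // The invariant culture keeps the decimal separator stable across locales.
        return string.Format(
            CultureInfo.InvariantCulture,
            "TTFT={0} ms, NTPOT={1} ms, output tokens/s={2}, QPS={3:F2}, cost entries={4}",
            stats.TtftMilliseconds,
            stats.NtpotMilliseconds,
            stats.OutputTokensPerSecond,
            stats.QueriesPerSecond,
            stats.Cost.Count);
    }
}
```

Only the number of `Cost` entries is shown, because the members of the `Cost` type are not described on this page.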