GKE Recommender v1 API - Class GenerateOptimizedManifestRequest (1.0.0-beta01)

public sealed class GenerateOptimizedManifestRequest : IMessage<GenerateOptimizedManifestRequest>, IEquatable<GenerateOptimizedManifestRequest>, IDeepCloneable<GenerateOptimizedManifestRequest>, IBufferMessage, IMessage

Reference documentation and code samples for the GKE Recommender v1 API class GenerateOptimizedManifestRequest.

Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.

Inheritance

object > GenerateOptimizedManifestRequest

Namespace

Google.Cloud.GkeRecommender.V1

Assembly

Google.Cloud.GkeRecommender.V1.dll

Constructors

GenerateOptimizedManifestRequest()

public GenerateOptimizedManifestRequest()

GenerateOptimizedManifestRequest(GenerateOptimizedManifestRequest)

public GenerateOptimizedManifestRequest(GenerateOptimizedManifestRequest other)

Parameter
Name Description
other GenerateOptimizedManifestRequest
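
The following is a minimal sketch of how the two constructors might be used together. It assumes the usual behavior of protobuf-generated C# messages, where the copy constructor produces an independent copy and Equals compares field values. The accelerator and namespace strings are hypothetical placeholders, not values confirmed by this reference.

```csharp
using System;
using Google.Cloud.GkeRecommender.V1;

// Hypothetical values, for illustration only.
var original = new GenerateOptimizedManifestRequest
{
    AcceleratorType = "nvidia-l4",
    KubernetesNamespace = "inference",
};

// The copy constructor creates a separate message with the same field values.
var copy = new GenerateOptimizedManifestRequest(original);

// Changing the copy is expected to leave the original untouched.
copy.KubernetesNamespace = "staging";

Console.WriteLine(original.KubernetesNamespace);
Console.WriteLine(copy.KubernetesNamespace);

// Equals (from IEquatable<GenerateOptimizedManifestRequest>) compares by value,
// so the two messages no longer compare equal after the edit above.
Console.WriteLine(copy.Equals(original));
```

Copying a request before modifying it can be useful when the same base configuration is reused across several namespaces or accelerator types.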

Properties

AcceleratorType

public string AcceleratorType { get; set; }

Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given ModelServerInfo.

Property Value
Type Description
string

KubernetesNamespace

public string KubernetesNamespace { get; set; }

Optional. The Kubernetes namespace in which to deploy the manifests.

Property Value
Type Description
string

ModelServerInfo

public ModelServerInfo ModelServerInfo { get; set; }

Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

Property Value
Type Description
ModelServerInfo

PerformanceRequirements

public PerformanceRequirements PerformanceRequirements { get; set; }

Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets, such as normalized time per output token (NTPOT) or time to first token (TTFT), at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.

Property Value
Type Description
PerformanceRequirements

StorageConfig

public StorageConfig StorageConfig { get; set; }

Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

Property Value
Type Description
StorageConfig
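
As a rough illustration of how the properties above fit together, the sketch below builds a request using only the members documented on this page. The string values are hypothetical, and the ModelServerInfo instance is left empty here; its fields would need to be populated according to that class's own reference before the request is sent. The null checks assume the standard protobuf C# convention that unset message-typed properties are null.

```csharp
using System;
using Google.Cloud.GkeRecommender.V1;

var request = new GenerateOptimizedManifestRequest
{
    // Required: the model server configuration.
    // Left empty in this sketch; populate it per the ModelServerInfo reference.
    ModelServerInfo = new ModelServerInfo(),

    // Required: hypothetical accelerator name. Valid values are expected to
    // come from GkeInferenceQuickstart.FetchProfiles.
    AcceleratorType = "nvidia-l4",

    // Optional: hypothetical namespace for the generated manifests.
    KubernetesNamespace = "inference",
};

// PerformanceRequirements is optional. When it is left unset, no HPA targets
// are supplied with the request.
Console.WriteLine(request.PerformanceRequirements is null);

// StorageConfig is optional. When it is left unset, the model is loaded
// from Hugging Face, as described above.
Console.WriteLine(request.StorageConfig is null);
```

Leaving the two optional message properties unset keeps the request minimal; they can be assigned later once HPA targets or a storage location are known.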