public sealed class GenerateOptimizedManifestRequest : IMessage<GenerateOptimizedManifestRequest>, IEquatable<GenerateOptimizedManifestRequest>, IDeepCloneable<GenerateOptimizedManifestRequest>, IBufferMessage, IMessage
Reference documentation and code samples for the GKE Recommender v1 API class GenerateOptimizedManifestRequest.
Request message for [GkeInferenceQuickstart.GenerateOptimizedManifest][google.cloud.gkerecommender.v1.GkeInferenceQuickstart.GenerateOptimizedManifest].
Implements
IMessage<GenerateOptimizedManifestRequest>, IEquatable<GenerateOptimizedManifestRequest>, IDeepCloneable<GenerateOptimizedManifestRequest>, IBufferMessage, IMessage
Namespace
Google.Cloud.GkeRecommender.V1
Assembly
Google.Cloud.GkeRecommender.V1.dll
Constructors
GenerateOptimizedManifestRequest()
public GenerateOptimizedManifestRequest()
GenerateOptimizedManifestRequest(GenerateOptimizedManifestRequest)
public GenerateOptimizedManifestRequest(GenerateOptimizedManifestRequest other)
Parameter
| Name | Description |
|---|---|
| other | GenerateOptimizedManifestRequest |
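The two constructors can be illustrated with a minimal sketch. It assumes the usual behavior of protobuf-generated C# messages, where the copy constructor produces an independent copy and `Equals` compares field values. The class name `CopyConstructorSketch` and the string values are hypothetical placeholders, not values taken from the API.

```csharp
using System;
using Google.Cloud.GkeRecommender.V1;

public static class CopyConstructorSketch
{
    // Returns true when the copy starts out equal to the original and
    // later changes to the copy leave the original untouched.
    public static bool CopyIsIndependent()
    {
        // Parameterless constructor plus object initializer.
        var original = new GenerateOptimizedManifestRequest
        {
            AcceleratorType = "nvidia-l4",      // hypothetical placeholder value
            KubernetesNamespace = "inference",  // hypothetical placeholder value
        };

        // Copy constructor: builds a second message from the first.
        var copy = new GenerateOptimizedManifestRequest(original);
        bool equalAfterCopy = copy.Equals(original);

        // Mutating the copy should not affect the original.
        copy.KubernetesNamespace = "staging";
        bool originalUnchanged = original.KubernetesNamespace == "inference";

        return equalAfterCopy && originalUnchanged && !copy.Equals(original);
    }
}
```

Calling `CopyConstructorSketch.CopyIsIndependent()` is one way to check this behavior against the installed package version.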
Properties
AcceleratorType
public string AcceleratorType { get; set; }
Required. The accelerator type. Use [GkeInferenceQuickstart.FetchProfiles][google.cloud.gkerecommender.v1.GkeInferenceQuickstart.FetchProfiles] to find valid accelerators for a given model_server_info.
Property Value
| Type | Description |
|---|---|
| string | |
KubernetesNamespace
public string KubernetesNamespace { get; set; }
Optional. The Kubernetes namespace to deploy the manifests in.
Property Value
| Type | Description |
|---|---|
| string | |
ModelServerInfo
public ModelServerInfo ModelServerInfo { get; set; }
Required. The model server configuration to generate the manifest for. Use [GkeInferenceQuickstart.FetchProfiles][google.cloud.gkerecommender.v1.GkeInferenceQuickstart.FetchProfiles] to find valid configurations.
Property Value
| Type | Description |
|---|---|
| ModelServerInfo | |
PerformanceRequirements
public PerformanceRequirements PerformanceRequirements { get; set; }
Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
Property Value
| Type | Description |
|---|---|
| PerformanceRequirements | |
StorageConfig
public StorageConfig StorageConfig { get; set; }
Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
Property Value
| Type | Description |
|---|---|
| StorageConfig | |
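As a rough sketch of how the properties above fit together, the following builds a request with the two required properties and the optional namespace. The `Model` and `ModelServer` properties on `ModelServerInfo` are assumptions not documented on this page, so check them against the `ModelServerInfo` reference. The class name `RequestSketch` and all string values are hypothetical placeholders; in practice, valid values come from FetchProfiles.

```csharp
using System;
using Google.Cloud.GkeRecommender.V1;

public static class RequestSketch
{
    public static GenerateOptimizedManifestRequest BuildRequest()
    {
        return new GenerateOptimizedManifestRequest
        {
            // Required: the model server configuration.
            // Property names on ModelServerInfo are assumed, not taken from this page.
            ModelServerInfo = new ModelServerInfo
            {
                Model = "example-org/example-model",  // hypothetical placeholder
                ModelServer = "vllm",                 // hypothetical placeholder
            },
            // Required: an accelerator valid for the model server info above.
            AcceleratorType = "nvidia-l4",            // hypothetical placeholder
            // Optional: the namespace the generated manifests target.
            KubernetesNamespace = "inference",        // hypothetical placeholder
            // PerformanceRequirements and StorageConfig are left unset here.
            // Per the descriptions above, no HPA resources are generated and
            // the model is loaded from Hugging Face in that case.
        };
    }
}
```

Leaving the optional message-typed properties unset keeps the sketch limited to what this page documents.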