Class GenerateOptimizedManifestRequest (0.2.0)

GenerateOptimizedManifestRequest(
    mapping=None, *, ignore_unknown_fields=False, **kwargs
)

Request message for GkeInferenceQuickstart.GenerateOptimizedManifest.

Attributes
Name	Description
`model_server_info`	`google.cloud.gkerecommender_v1.types.ModelServerInfo` Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
`accelerator_type`	`str` Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given `model_server_info`.
`kubernetes_namespace`	`str` Optional. The Kubernetes namespace to deploy the manifests in.
`performance_requirements`	`google.cloud.gkerecommender_v1.types.PerformanceRequirements` Optional. The performance requirements used to generate Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources that adjust the model server replica count to maintain the specified targets (for example, NTPOT or TTFT) at P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest is not generated.
`storage_config`	`google.cloud.gkerecommender_v1.types.StorageConfig` Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
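
The following is a minimal sketch of how a request might be assembled from the attributes above, not an official sample. The top-level field names come from the attribute table. Everything else is an assumption made for illustration: the helper `build_request_mapping` is hypothetical, the sub-field names inside `model_server_info` (`model`, `model_server`) are assumed, and the model, server, accelerator, and namespace values are placeholders that should be replaced with values returned by GkeInferenceQuickstart.FetchProfiles. The client library import is guarded so the sketch still runs where the package is not installed.

```python
# Optional field names, taken from the attribute table above.
OPTIONAL_FIELDS = (
    "kubernetes_namespace",
    "performance_requirements",
    "storage_config",
)


def build_request_mapping(model_server_info, accelerator_type, **optional):
    """Hypothetical helper: build a mapping for GenerateOptimizedManifestRequest."""
    # Both of these are marked Required in the attribute table.
    if not model_server_info:
        raise ValueError("model_server_info is required")
    if not accelerator_type:
        raise ValueError("accelerator_type is required")
    # Reject names that are not documented optional attributes.
    unknown = sorted(set(optional) - set(OPTIONAL_FIELDS))
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    mapping = {
        "model_server_info": model_server_info,
        "accelerator_type": accelerator_type,
    }
    # Omit optional fields passed as None so they stay unset in the message.
    mapping.update({k: v for k, v in optional.items() if v is not None})
    return mapping


mapping = build_request_mapping(
    # Assumption: sub-field names and values are illustrative placeholders.
    model_server_info={"model": "google/gemma-2-2b-it", "model_server": "vllm"},
    accelerator_type="nvidia-l4",  # placeholder accelerator name
    kubernetes_namespace="inference",  # placeholder namespace
    storage_config=None,  # left unset
)

try:
    # Import path follows the types module named in the attribute table.
    from google.cloud.gkerecommender_v1.types import (
        GenerateOptimizedManifestRequest,
    )
except ImportError:
    GenerateOptimizedManifestRequest = None

# The constructor accepts a mapping as its first positional argument.
request = (
    GenerateOptimizedManifestRequest(mapping)
    if GenerateOptimizedManifestRequest is not None
    else None
)
```

Because `ignore_unknown_fields` defaults to `False` in the constructor signature, validating field names before constructing the message surfaces typos as an explicit error at the call site. The same fields can also be passed as keyword arguments instead of a mapping.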
