Model Garden lets you self-deploy select partner models (preview). Self-deployed models aren't serverless: you must deploy them on Vertex AI before you can use them. These models deploy securely within your Google Cloud project and VPC network. For more information, see the self-deployed models documentation.
Purchase self-deployable partner models
To deploy a self-deployable partner model on Vertex AI, you must first purchase it through Google Cloud Marketplace. To purchase a model, do the following:
1. Go to Model Garden.
2. In Model collections, click Self-deployable partner models to filter the list of models.
3. Click the model card of the partner model that you want to purchase.
4. Click Contact sales.
5. Complete the form and submit your request.
After you submit the form, a Google Cloud sales representative contacts you to finalize the purchase.
Deploy models
After you purchase a self-deployable partner model, you can deploy it to a Vertex AI endpoint by using one-click deployment, which simplifies the process by pre-configuring the necessary settings.
You can perform one-click deployment using either the Google Cloud console or the Vertex AI SDK for Python.
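Before you deploy, you can also use the SDK to see which models are self-deployable and which machine configurations Model Garden has verified for a given model. The following is a minimal sketch that assumes a recent Vertex AI SDK for Python with the model_garden module (the same module used in the deployment sample later in this section); PROJECT_ID, LOCATION, and PARTNER_MODEL_ID are placeholders.

import vertexai
from vertexai import model_garden

vertexai.init(project="PROJECT_ID", location="LOCATION")

# List the model IDs that can be self-deployed from Model Garden.
for model_name in model_garden.list_deployable_models():
    print(model_name)

# Inspect the verified machine and container configurations for one model.
model = model_garden.OpenModel("PARTNER_MODEL_ID")
for option in model.list_deploy_options():
    print(option)

The configurations returned by list_deploy_options correspond to the settings that one-click deployment pre-fills for you.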
Console
To deploy a partner model in the Google Cloud console, do the following:
1. Go to Model Garden.
2. Locate and click the model card of the partner model that you want to use.
3. Click Deploy model.
4. Configure your deployment settings as prompted.
5. Click Deploy.
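Deployment can take some time to complete. To confirm that the endpoint was created, whether you deployed from the console or the SDK, you can list the endpoints in your project. The following is a minimal sketch, assuming the google-cloud-aiplatform package is installed; PROJECT_ID and LOCATION are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="LOCATION")

# Print every Vertex AI endpoint in the project and region,
# including the one created by the deployment above.
for endpoint in aiplatform.Endpoint.list():
    print(endpoint.display_name, endpoint.resource_name)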
Python
The following sample shows how to deploy a partner model using the Vertex AI SDK for Python. Replace the placeholder values with your specific information.
import vertexai
from vertexai import model_garden

vertexai.init(project="PROJECT_ID", location="LOCATION")

# Replace with the actual partner model ID from Model Garden.
model = model_garden.OpenModel("PARTNER_MODEL_ID")

endpoint = model.deploy(
    accept_eula=True,
    machine_type="MACHINE_TYPE",  # e.g., "a3-ultragpu-8g"
    accelerator_type="ACCELERATOR_TYPE",  # e.g., "NVIDIA_H200_141GB"
    accelerator_count=ACCELERATOR_COUNT,  # e.g., 8
    serving_container_image_uri="SERVING_CONTAINER_IMAGE_URI",
    endpoint_display_name="ENDPOINT_DISPLAY_NAME",
    model_display_name="MODEL_DISPLAY_NAME",
    use_dedicated_endpoint=True,
)

print(f"Model deployed to endpoint: {endpoint.resource_name}")
What's next
- Choose an open model serving option
- Use open models with Model as a Service (MaaS)
- Deploy open models with prebuilt containers
- Deploy open models with a custom vLLM container