Deploy open models from Model Garden

Model Garden lets you self-deploy open models. Self-deployed models aren't serverless: you must deploy them to Vertex AI before you can use them. They run securely within your Google Cloud project and VPC network. For more information about self-deployed models, see the self-deployed models documentation.

For information on deploying partner models, see Deploy partner models from Model Garden.

Self-deployable open models

Open models in Model Garden might be available both as a managed API (MaaS) and as a self-deployable model. When both offerings are available for a given model, the model card for the managed API includes API Service in its name, and the model card for the self-deployable model doesn't.

List models

To get a list of self-deployable open models, do the following:

  1. Go to Model Garden.

  2. In the Features filter, select Open models and One-click deployment.
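The listing can likely also be done from code. The following is a minimal sketch, assuming that the Vertex AI SDK for Python exposes model_garden.list_deployable_models with a model_filter keyword (verify this against your installed SDK version). The helper names group_by_publisher and list_self_deployable, and the sample model names, are introduced here for illustration only.

```python
from collections import defaultdict


def group_by_publisher(model_names):
    """Group 'publisher/model@version' strings by publisher (pure helper)."""
    groups = defaultdict(list)
    for name in model_names:
        publisher, _, rest = name.partition("/")
        groups[publisher].append(rest)
    return dict(groups)


def list_self_deployable(project, location, model_filter=None):
    """Return deployable model names; requires the SDK and credentials.

    Assumption: list_deployable_models accepts model_filter as a keyword.
    """
    # Imported lazily so the pure helper above works without the SDK.
    import vertexai
    from vertexai import model_garden

    vertexai.init(project=project, location=location)
    return model_garden.list_deployable_models(model_filter=model_filter)


# Offline demonstration with a hypothetical result list.
sample = [
    "meta/llama3-3@llama-3.3-70b-instruct-fp8",
    "google/gemma3@gemma-3-1b-it",
]
grouped = group_by_publisher(sample)
print(grouped)
```

In a real project, you would pass the result of list_self_deployable("PROJECT_ID", "REGION") to group_by_publisher instead of the hard-coded sample.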

Deploy models

After you identify the open model that you want to deploy, you can deploy it to a Vertex AI endpoint by using one-click deployment. One-click deployment is available in the Google Cloud console and in the Vertex AI SDK for Python.

Console

To deploy a model in the Google Cloud console, do the following:

  1. Go to Model Garden.

  2. Locate and click the model card of the model that you want to use.

  3. Click Deploy model.

  4. Configure your deployment based on the provided instructions.

  5. Click Deploy.

Python

The following sample shows you how to deploy a model by using the Vertex AI SDK for Python.

import vertexai
from vertexai import model_garden

# Replace PROJECT_ID with your Google Cloud project ID.
vertexai.init(project="PROJECT_ID", location="asia-south2")

# The model ID has the form publisher/model@version.
model = model_garden.OpenModel("meta/llama3-3@llama-3.3-70b-instruct-fp8")

# deploy() returns the endpoint that serves the model.
endpoint = model.deploy(
    accept_eula=True,  # Accept the model's end-user license agreement.
    machine_type="a3-ultragpu-8g",
    accelerator_type="NVIDIA_H200_141GB",
    accelerator_count=8,  # Matches the 8 GPUs on the a3-ultragpu-8g machine type.
    serving_container_image_uri="us-docker.pkg.dev/deeplearning-platform-release/vertex-model-garden/tensorrt-llm.cu128.0-18.ubuntu2404.py312:20250605-1800-rc0",
    endpoint_display_name="llama-3-3-70b-instruct-fp8-mg-one-click-deploy",
    model_display_name="llama-3-3-70b-instruct-fp8-1752269273562",
    use_dedicated_endpoint=True,
)
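The model ID passed to OpenModel in the sample appears to follow a publisher/model@version pattern. The following is a minimal sketch of checking an ID against that pattern before starting a long-running deployment. The pattern is inferred from the sample string, and ModelId and parse_model_id are hypothetical helpers, not part of the Vertex AI SDK.

```python
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    publisher: str
    model: str
    version: Optional[str]


def parse_model_id(model_id: str) -> ModelId:
    """Split 'publisher/model@version' into parts; the version is optional."""
    name, at, version = model_id.partition("@")
    publisher, slash, model = name.partition("/")
    if not slash or not publisher or not model:
        raise ValueError(f"expected 'publisher/model[@version]', got {model_id!r}")
    if at and not version:
        raise ValueError(f"empty version in {model_id!r}")
    return ModelId(publisher, model, version or None)


parsed = parse_model_id("meta/llama3-3@llama-3.3-70b-instruct-fp8")
print(parsed)
```

Failing fast on a malformed ID is cheaper than discovering the mistake after the deployment request has been submitted.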

Deploy models with custom weights

Model Garden lets you deploy supported models with custom weights from a Cloud Storage bucket. You can deploy custom weights by using the Google Cloud console, the Google Cloud CLI, the Vertex AI API, or the Vertex AI SDK for Python. For more information, see Deploy models with custom weights.
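Because custom weights are read from a Cloud Storage bucket, it can help to validate the location before you start a deployment. The following is a minimal sketch that splits a gs:// URI into its bucket and object prefix. split_gcs_uri and the example bucket name are hypothetical, and the sketch doesn't check that the bucket exists or that the weights are in a supported format.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri: str):
    """Return (bucket, object_prefix) for a gs:// URI, or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    # Drop the leading slash so the prefix matches Cloud Storage object names.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical location of custom weights.
bucket, prefix = split_gcs_uri("gs://example-bucket/weights/my-model/")
print(bucket, prefix)
```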

What's next