Optimize prompts

This document describes how to use the Vertex AI prompt optimizer to automatically optimize prompt performance by improving the system instructions for a set of prompts.

The Vertex AI prompt optimizer can help you improve your prompts quickly and at scale, without manually rewriting system instructions or individual prompts. This is especially useful when you switch between models and want to reuse your existing system instructions and prompts.

The following approaches are available for optimizing prompts:

  • The zero-shot optimizer is a real-time, low-latency optimizer that improves a single prompt or system instruction template. It is fast and requires no setup beyond providing your original prompt or system instruction. The zero-shot optimizer is model-independent and can improve prompts for any Google model. It also provides a gemini_nano mode to optimize prompts specifically for smaller models, such as Gemini Nano and Gemma 3n E4B.
  • The few-shot optimizer is a real-time low-latency optimizer that refines system instructions by analyzing examples where a model's response did not meet expectations. By providing specific examples of prompts, model responses, and feedback on those responses, you can systematically improve prompt performance.
  • The data-driven optimizer is a batch, task-level, iterative optimizer that improves prompts by evaluating the target model's responses to labeled sample prompts against the evaluation metrics you specify. It's intended for more advanced optimization: you configure the optimization parameters and provide a small set of labeled samples. The data-driven optimizer supports generally available Gemini models, as well as custom models deployed locally or from the Vertex AI Model Garden.
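
To make the data-driven approach concrete, the sketch below shows the core idea of a task-level iterative optimizer: score each candidate system instruction against a set of labeled samples with an evaluation metric, and keep the best-scoring candidate. This is a conceptual illustration only, not the Vertex AI API; the `fake_model`, `exact_match` metric, and candidate instructions are all hypothetical stand-ins.

```python
def exact_match(response: str, label: str) -> float:
    """Toy evaluation metric: 1.0 if the response matches the label."""
    return 1.0 if response.strip().lower() == label.strip().lower() else 0.0

def fake_model(system_instruction: str, prompt: str) -> str:
    """Stand-in for the target model; a real optimizer calls the model itself."""
    if "concise" in system_instruction:
        return {"2+2?": "4", "capital of France?": "Paris"}.get(prompt, "")
    return "I think the answer might be..."

def optimize_system_instruction(candidates, labeled_samples):
    """Return the candidate instruction with the highest mean metric score."""
    def mean_score(instruction: str) -> float:
        scores = [
            exact_match(fake_model(instruction, prompt), label)
            for prompt, label in labeled_samples
        ]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_score)

samples = [("2+2?", "4"), ("capital of France?", "Paris")]
candidates = ["Answer verbosely.", "Be concise; answer with one word."]
best = optimize_system_instruction(candidates, samples)
```

The real data-driven optimizer iterates over many generated candidates and supports configurable metrics, but the selection principle, evaluating responses to labeled prompts and keeping the best instruction, is the same.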

All of these methods are available through the Google Cloud console and the Vertex AI SDK.