Structured outputs enable a model to generate output that always adheres to a specific schema. For example, a model may be provided with a response schema to ensure that the response produces valid JSON. All open models available on the Vertex AI Model as a Service (MaaS) support structured outputs.
For more conceptual information about the structured output capability, see Introduction to structured output.
Use structured outputs
The following use case sets a response schema that ensures that the model output is a JSON object with the following properties: name, date, and participants. The Python code uses the OpenAI SDK and Pydantic objects to generate the JSON schema.
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]
completion = client.beta.chat.completions.parse(
    model="MODEL_NAME",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)
print(completion.choices[0].message.parsed)
The model output will adhere to the following JSON schema:
{ "name": STRING, "date": STRING, "participants": [STRING] }
When provided the prompt, "Alice and Bob are going to a science fair on Friday", the model could produce the following response:
{
  "name": "science fair",
  "date": "Friday",
  "participants": [
    "Alice",
    "Bob"
  ]
}
Detailed example
The following code is an example of a recursive schema. The UI class contains
a list of children, which can also be of the UI class.
from pydantic import BaseModel
from openai import OpenAI
from enum import Enum
from typing import List
client = OpenAI()
class UIType(str, Enum):
  div = "div"
  button = "button"
  header = "header"
  section = "section"
  field = "field"
  form = "form"
class Attribute(BaseModel):
  name: str
  value: str
class UI(BaseModel):
  type: UIType
  label: str
  children: List["UI"]
  attributes: List[Attribute]
UI.model_rebuild() # This is required to enable recursive types
class Response(BaseModel):
  ui: UI
completion = client.beta.chat.completions.parse(
  model="MODEL_NAME",
  messages=[
    {"role": "system", "content": "You are a UI generator AI. Convert the user input into a UI."},
    {"role": "user", "content": "Make a User Profile Form"}
  ],
  response_format=Response,
)
print(completion.choices[0].message.parsed)
The model output will adhere to the schema of the Pydantic object specified in the previous snippet. In this example, the model could generate the following UI form:
Form
  Input
    Name
    Email
    Age
A response could look like the following:
ui = UI(
    type=UIType.div,
    label='Form',
    children=[
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Name')
            ]
        ),
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Email')
            ]
        ),
        UI(
            type=UIType.div,
            label='Input',
            children=[],
            attributes=[
                Attribute(name='label', value='Age')
            ]
        )
    ],
    attributes=[
        Attribute(name='name', value='John Doe'),
        Attribute(name='email', value='john.doe@example.com'),
        Attribute(name='age', value='30')
    ]
)
Get JSON object responses
You can constrain the model to output only syntactically valid JSON objects by
setting the response_format field to { "type": "json_object" }. This is
often called JSON mode. JSON mode is is useful when generating JSON for use
with function calling or other downstream tasks that require JSON input.
When JSON mode is enabled, the model is constrained to only generate strings that parse into valid JSON objects. While this mode ensures the output is syntactically correct JSON, it does not enforce any specific schema. To ensure the model outputs JSON that follows a specific schema, you must include instructions in the prompt, as shown in the following example.
The following samples show how to enable JSON mode and instruct the model to return a JSON object with a specific structure:
Python
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Before running this sample, make sure to set the OPENAI_BASE_URL environment variable.
For more information, see Authentication and credentials.
from openai import OpenAI client = OpenAI() response = client.chat.completions.create( model="MODEL", response_format={ "type": "json_object" }, messages=[ {"role": "user", "content": "List 5 rivers in South America. Your response must be a JSON object with a single key \"rivers\", which has a list of strings as its value."}, ] ) print(response.choices[0].message.content)
Replace MODEL with the model name you want to use,
    for example meta/llama3-405b-instruct-maas.
REST
After you set up your environment, you can use REST to test a text prompt. The following sample sends a request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: A region that supports open models.
- MODEL: The model name you want to use,
    for example meta/llama3-405b-instruct-maas.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions
Request JSON body:
{
  "model": "MODEL",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "user",
      "content": "List 5 rivers in South America. Your response must be a JSON object with a single key \"rivers\", which has a list of strings as its value."
    }
  ]
}
To send your request, choose one of these options:
curl
      Save the request body in a file named request.json,
      and execute the following command:
    
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"
PowerShell
      Save the request body in a file named request.json,
      and execute the following command:
    
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
What's next
- Learn about Function calling.
- Learn about Thinking.
- Learn about Batch predictions.