MiniMax models

MiniMax models are available for use as managed APIs and self-deployed models on Vertex AI. You can stream your responses to reduce the end-user latency perception. A streamed response uses server-sent events (SSE) to incrementally stream the response.

Managed MiniMax models

MiniMax models offer fully managed and serverless models as APIs. To use a MiniMax model on Vertex AI, send a request directly to the Vertex AI API endpoint. When using MiniMax models as a managed API, there's no need to provision or manage infrastructure.

The following models are available from MiniMax to use in Vertex AI. To access a MiniMax model, go to its Model Garden model card.

MiniMax M2

MiniMax M2 is a model from MiniMax that's designed for agentic and code-related tasks. It is built for end-to-end development workflows and has strong capabilities in planning and executing complex tool-calling tasks. The model is optimized to provide a balance of performance, cost, and inference speed.

Go to the MiniMax M2 model card

Use MiniMax models

For managed models, you can use curl commands to send requests to the Vertex AI endpoint using the following model names:

For MiniMax M2, use minimax-m2-maas

To learn how to make streaming and non-streaming calls to MiniMax models, see Call open model APIs.

To use a self-deployed Vertex AI model:

Navigate to the Model Garden console.
Find the relevant Vertex AI model.
Click Enable and complete the provided form to get the necessary commercial use licenses.

For more information about deploying and using partner models, see Deploy a partner model and make prediction requests .

MiniMax model region availability

MiniMax models are available in the following regions:

Model	Regions
MiniMax M2	`global` Max output: 196,608 Context length: 196,608

What's next

Learn how to Call open model APIs.

MiniMax models Stay organized with collections Save and categorize content based on your preferences.