This page describes model endpoint management. Model endpoint management lets you experiment with registering an AI model endpoint and invoking predictions. To use AI models in production environments, see Invoke online predictions from Cloud SQL instances.
After the model endpoints are added and registered in model endpoint management, you can reference them using the model ID to invoke predictions.
Before you begin
Make sure that you complete the following actions:
- Register your model endpoint with model endpoint management. For more information, see Register and call remote AI models using model endpoint management.
- Create or update your Cloud SQL instance so that the instance can integrate with Vertex AI. For more information, see Enable database integration with Vertex AI.
Invoke predictions for generic models
Use the mysql.ml_predict_row() SQL function to call a registered generic model endpoint and invoke predictions. You can use the mysql.ml_predict_row() function with any model type.
SELECT
  mysql.ml_predict_row(
    'MODEL_ID',
    'REQUEST_BODY');
Replace the following:
- MODEL_ID: the model ID you defined when registering the model endpoint
- REQUEST_BODY: the parameters to the prediction function, in JSON format
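Because REQUEST_BODY is passed as a JSON string inside a SQL statement, client code usually has to handle both JSON escaping and SQL quoting. The following is a minimal sketch of one way to do that from Python. The helper name build_predict_call is hypothetical, and the body layout assumes the Gemini-style "contents" structure shown in the example on this page; other model types expect different bodies. The %s placeholders follow the parameter style used by common Python MySQL drivers, and no database connection is made here.

```python
import json
from typing import Tuple


def build_predict_call(model_id: str, prompt: str) -> Tuple[str, Tuple[str, str]]:
    """Return a parameterized statement and its parameters.

    Hypothetical helper: the request body layout is an assumption that
    mirrors a Gemini-style request and must match what your model expects.
    """
    request_body = {
        "contents": [
            {"role": "user", "parts": [{"text": prompt}]}
        ]
    }
    # json.dumps handles JSON escaping (quotes, newlines) in the prompt.
    # The driver's parameter binding handles SQL quoting of the result.
    sql = "SELECT mysql.ml_predict_row(%s, %s)"
    return sql, (model_id, json.dumps(request_body))


sql, params = build_predict_call("MODEL_ID", 'Summarize the word "index".')
print(sql)
print(params[1])
```

Passing the statement and parameters to a driver's execute call, rather than concatenating strings, avoids broken statements when the prompt contains quotes.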
Examples
To generate predictions for a Gemini model endpoint registered with the model ID gemini-2.5-flash, run the following statement:
SELECT JSON_EXTRACT(
  mysql.ml_predict_row(
    'gemini-2.5-flash',
    '{
      "contents": [
        {
          "role": "user",
          "parts": [
            {
              "text": "For TPCH database schema as mentioned here https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf , generate a SQL query to find all supplier names which are located in the India nation."
            }
          ]
        }
      ]
    }'
  ),
  '$.candidates[0].content.parts[0].text'
);
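The JSON path '$.candidates[0].content.parts[0].text' in the statement above selects the text of the first part of the first candidate in the response. If you fetch the full response instead and process it in application code, the same lookup can be sketched in Python as follows. The function name extract_text is hypothetical, and the sample response is an illustrative stand-in shaped only to match that path, not real model output.

```python
import json


def extract_text(response_json: str) -> str:
    """Follow the path candidates[0].content.parts[0].text in a response."""
    response = json.loads(response_json)
    return response["candidates"][0]["content"]["parts"][0]["text"]


# Illustrative stand-in for a response; real responses carry more fields.
sample_response = json.dumps(
    {
        "candidates": [
            {"content": {"role": "model", "parts": [{"text": "SELECT 1;"}]}}
        ]
    }
)

print(extract_text(sample_response))
```

Note that in MySQL, JSON_EXTRACT returns a JSON value, so a string result keeps its surrounding double quotes; wrapping the call in JSON_UNQUOTE returns the plain text.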