Monitor and troubleshoot with AI assistance

This document describes how you can use AI assistance to monitor and troubleshoot your Cloud SQL resources. You can use the AI-assisted troubleshooting tools of Cloud SQL and Gemini Cloud Assist to troubleshoot slow queries and high database load.

Limitations

The following limitations apply to AI-assisted troubleshooting in Cloud SQL:

  • AI-assisted troubleshooting isn't supported for the following Cloud SQL configurations:
  • Query anomaly detection is available only for Cloud SQL Enterprise Plus edition instances.

Before you begin

  1. Ensure that Gemini Cloud Assist is set up for your Google Cloud user account and project.

    After you set up Gemini Cloud Assist, you might need to wait five minutes to let the service propagate before you can enable AI-assisted troubleshooting in Cloud SQL.

  2. Ensure that your instance is a Cloud SQL Enterprise Plus edition instance.
  3. Ensure that your Cloud SQL instance is using the new network architecture.
  4. Enable query insights for Cloud SQL Enterprise Plus edition and Cloud SQL Enterprise edition.
  5. Ensure that maintenance version MYSQL_VERSION.R20250304.00_01 or later is installed on the Cloud SQL for MySQL instance. For more information about applying maintenance versions to an instance, see About maintenance on Cloud SQL instances.
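
The maintenance version prerequisite can be checked with a small script. The following is a minimal sketch, not an official tool. It assumes that a maintenance version string has the layout `<ENGINE_VERSION>.R<YYYYMMDD>.<NN>_<NN>` and that comparing the release suffix is enough to decide whether a version is "later". The function names and the example version strings are hypothetical.

```python
import re

# Assumed layout: <ENGINE_VERSION>.R<YYYYMMDD>.<NN>_<NN>
_MAINTENANCE_RE = re.compile(
    r"^(?P<engine>.+)\.R(?P<date>\d{8})\.(?P<build>\d+)_(?P<patch>\d+)$"
)

# Taken from the R20250304.00_01 suffix named in the prerequisite.
MINIMUM_RELEASE = (20250304, 0, 1)


def release_key(maintenance_version: str) -> tuple:
    """Parse the release suffix into a comparable (date, build, patch) tuple."""
    match = _MAINTENANCE_RE.match(maintenance_version)
    if match is None:
        raise ValueError(f"unrecognized maintenance version: {maintenance_version!r}")
    return (int(match["date"]), int(match["build"]), int(match["patch"]))


def meets_minimum(maintenance_version: str) -> bool:
    """Return True if the release suffix is at or after the required release."""
    return release_key(maintenance_version) >= MINIMUM_RELEASE


if __name__ == "__main__":
    # Illustrative version strings only.
    for version in ("MYSQL_8_0_40.R20250201.01_03", "MYSQL_8_0_40.R20250304.00_01"):
        print(version, meets_minimum(version))
```

Comparing tuples of integers avoids the pitfalls of comparing the raw strings, which would break if the numeric fields ever differed in width.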

Required roles and permissions

To get the permissions that you need to use AI-assisted troubleshooting, ask your administrator to grant you the following IAM roles on the project that hosts the Cloud SQL instance:

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to use AI-assisted troubleshooting. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to use AI-assisted troubleshooting:

  • databaseinsights.performanceIssues.detect
  • databaseinsights.performanceIssues.investigate

You might also be able to get these permissions with custom roles or other predefined roles.
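
If you already have a list of the permissions granted to an account, you can compare it against the two permissions listed above. The following is a minimal sketch. The `missing_permissions` helper is hypothetical, and how you obtain the granted permissions is left as an assumption.

```python
# The two permissions named in the Required permissions section.
REQUIRED_PERMISSIONS = frozenset({
    "databaseinsights.performanceIssues.detect",
    "databaseinsights.performanceIssues.investigate",
})


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Hypothetical input: the caller holds only the detect permission.
    granted = ["databaseinsights.performanceIssues.detect"]
    print(missing_permissions(granted))
```

An empty result means that both permissions are present, regardless of which role provides them.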

For more information about required roles and permissions for using Gemini Cloud Assist investigations, see Troubleshoot issues with Gemini Cloud Assist Investigations.

Enable AI-assisted troubleshooting

When you enable AI-assisted troubleshooting for your Cloud SQL instance, Cloud SQL can analyze the performance of your databases and detect anomalies in the execution of your queries. When Cloud SQL detects anomalies in query performance or identifies high system load, AI-assisted troubleshooting helps you analyze the situation with evidence and provides recommendations.

To enable AI-assisted troubleshooting for your Cloud SQL instance, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. In the Configuration tile, click Edit configuration.
  4. In the Customize your instance section, expand Query insights.
    1. If not already selected, select Enable Query insights.
    2. For Cloud SQL Enterprise Plus edition only, if not already selected, select Enable Enterprise Plus features.
  5. For Cloud SQL Enterprise Plus edition only, select Enable AI-assisted troubleshooting. For Cloud SQL Enterprise edition instances, troubleshooting with AI assistance is only available if you enable Gemini Cloud Assist.
  6. Click Save.
  7. For the best results, wait 24 hours after you enable AI-assisted troubleshooting in the Google Cloud console to let Cloud SQL build a baseline of the average performance of your instance, database, and queries.
  8. Your instance requires a restart. For more information about enabling query insights for Cloud SQL Enterprise Plus edition, see Use query insights to improve query performance.

Open Gemini Cloud Assist

To use Gemini Cloud Assist with Cloud SQL, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. In the navigation pane, select Query insights.
  4. To open the Cloud Assist panel, click Open or close Gemini Cloud Assist chat.
  5. In the Cloud Assist panel, enter a prompt that describes the information that you're interested in.
  6. After you enter the prompt, click Send prompt. Gemini returns a response to your prompt based on information from the last hour.

Troubleshoot slow queries

To use AI assistance with troubleshooting your slow queries, go to the Query insights dashboard for your Cloud SQL instance in Google Cloud console.

Top queries table

You can start troubleshooting slow queries with AI assistance in the Top queries table section of the Query insights dashboard.

Cloud SQL can help you identify which queries are performing slower than average during a specific detection time period. After you select a time range in the Query insights dashboard, Cloud SQL checks whether any queries are performing slower than average by using a detection time period of 24 hours before the end of your selected time range.

When you adjust the time range filter of the Database load chart, or any other filter such as database or user, Cloud SQL refreshes the Top queries table and reruns anomaly detection based on the new list of queries and an updated detection time period.
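
The relationship between the selected time range and the detection time period can be sketched as follows. This is an illustration of the rule described above, not Cloud SQL code. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The detection time period described above: 24 hours.
DETECTION_PERIOD = timedelta(hours=24)


def detection_window(range_end: datetime) -> tuple:
    """Return (start, end) of the 24 hours before the end of the selected range."""
    return (range_end - DETECTION_PERIOD, range_end)


if __name__ == "__main__":
    selected_end = datetime(2025, 3, 10, 12, 0, tzinfo=timezone.utc)
    start, end = detection_window(selected_end)
    print(start.isoformat(), "to", end.isoformat())
```

Because the window depends only on the end of the selected range, a 1-hour range and a 7-day range that end at the same moment yield the same detection time period in this sketch.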

For Cloud SQL Enterprise Plus edition instances, the following occurs when Cloud SQL detects an anomaly:

If a query is running slower than expected, then a Warning icon is displayed. When you click the icon, Gemini Cloud Assist helps analyze the query execution and offers observations about what might have caused the issue. Based on these observations, Gemini Cloud Assist generates a hypothesis that can help you address the issue.

To troubleshoot slow queries in the Top queries table in the Query insights dashboard, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. In the SQL navigation menu, click Query insights.
  4. In the Executed queries chart, use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.
  5. In the Top queries table, under the Queries tab, review the list of queries for your database.
  6. If a Warning icon appears next to a query's Avg execution time (ms) value, then Cloud SQL has detected an anomaly in that query's performance. Cloud SQL checks for anomalies within the 24-hour time period that occurs before the end of your selected time range.
  7. Click the Warning icon.
  8. In the Query is slower than usual dialog, click New Investigation to start troubleshooting with AI assistance from Gemini Cloud Assist. After about two minutes, the Investigation details pane opens with the following sections:
    • Issue. A description of the issue being investigated, including the investigation’s start and stop time.
    • Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
    • Hypotheses. A list of AI-recommended actions to take to help address the slow-running query.
  9. If you want to see all investigations associated with the query, in the Query is slower than usual dialog, click View all investigations. The Gemini Cloud Assist page opens where you can view all currently running and previously completed investigations. You can filter the page by project or label, for example, to find the specific investigation you need.

    Alternatively, to see all previous investigations, click the Notifications icon, then select a notification associated with any investigation to open the Gemini Cloud Assist page.

  10. Alternatively, if you want to investigate the latency of any query, complete the following steps:
    1. Identify the specific query you want to investigate.
    2. In the Actions column, click the Actions icon associated with that query.
    3. Select Investigate latency in the menu to run a Gemini Cloud Assist investigation.

Query details

You can also troubleshoot a slow query with AI assistance from the Query details page.

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. Click Query insights to open the Query insights dashboard.
  4. In the Top queries table of the Query insights dashboard, click the query that you want to view. The Query details page appears.
  5. For Cloud SQL Enterprise Plus edition, if Cloud SQL detects an anomaly for the query, then one or more of the following indicators appears in the Query details page:
    • A message on the details screen that says This query is slower than usual and an Investigate option.
    • A message in the Query latency chart that says Query slower than usual. If this message appears, then click the Investigate button to start troubleshooting with AI assistance from Gemini Cloud Assist.

      After about two minutes, the Investigation details pane opens with the following sections:

      • Issue. A description of the issue being investigated, including the investigation’s start and stop time.
      • Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
      • Hypotheses. A list of AI-recommended actions to take to help address the slow-running query.
  6. Optional: Use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days, or a custom range. When you adjust the Time range filter of the Query details page, or any other filter such as Database or User, Cloud SQL reruns anomaly detection.
  7. If Cloud SQL doesn't detect an anomaly for the query, then you can still run an analysis on the query by clicking the Investigate button in the Query latency card.

Analyze query latency

Using AI assistance, you can analyze and troubleshoot the details of your query latency.

Analysis time period

The analysis time period consists of the 24 hours that occur before the end of the time range that you select in the Database load chart of the Query insights dashboard or the Query details page. Cloud SQL uses this time period to compare baseline metrics with the metrics retrieved during the time period of the anomaly.

On the Query details page, for Cloud SQL Enterprise Plus edition instances, if Cloud SQL has detected an anomaly with the query that you selected in the Query insights dashboard, then Cloud SQL performs a baseline performance analysis for the query using the 24 hours before the end of the anomaly. If Cloud SQL hasn't detected an anomaly with the query and reruns anomaly detection on the query, then Cloud SQL uses the 48 hours before the end of the selected time range as the performance baseline for the analysis time period.

Detected anomaly period

The detected anomaly period is applicable to Cloud SQL Enterprise Plus edition instances only.

The detected anomaly period represents a time period when Cloud SQL finds an anomalous change in query performance. Cloud SQL uses the baseline performance measured for the query during the analysis time period.

If Cloud SQL detects multiple anomalies for a query within a selected time period, then Cloud SQL uses the last detected anomaly.
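
The rule for choosing among several anomalies can be illustrated as follows. This is a minimal sketch under two assumptions: that "last detected" means the anomaly that ends latest, and that an anomaly counts as within the selected time period if it overlaps it. The `Anomaly` type and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Anomaly:
    """Hypothetical record of one detected anomaly period."""
    start: datetime
    end: datetime


def anomaly_to_analyze(
    anomalies: Iterable[Anomaly], range_start: datetime, range_end: datetime
) -> Optional[Anomaly]:
    """Pick the latest anomaly that overlaps the selected time period, if any."""
    in_range = [a for a in anomalies if a.end > range_start and a.start < range_end]
    if not in_range:
        return None
    # Assumption: the "last detected" anomaly is the one that ends latest.
    return max(in_range, key=lambda a: a.end)


if __name__ == "__main__":
    found = [
        Anomaly(datetime(2025, 3, 10, 1), datetime(2025, 3, 10, 2)),
        Anomaly(datetime(2025, 3, 10, 9), datetime(2025, 3, 10, 10)),
    ]
    print(anomaly_to_analyze(found, datetime(2025, 3, 10, 0), datetime(2025, 3, 10, 12)))
```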

Examples of query performance prompts

You can also use Gemini Cloud Assist to enter prompts to help you improve the performance of your queries. Gemini Cloud Assist answers questions for the selected Cloud SQL instance and database.

Prompt: What are the top queries by latency in my database?
Type of response:
  • Summaries of queries sorted by latency. Gemini scopes the response by the time range filter selected in the query insights database load chart.
  • Guidance on how to identify and sort queries by latency.

Prompt: What is the slowest query in this database instance?
Type of response: Guidance on how to identify the slowest query by latency.

Troubleshoot high database load

By accessing the Query insights dashboard in the Google Cloud console, you can analyze your database and troubleshoot events when your system experiences a higher database load than average. Cloud SQL uses the 24 hours of data that occurs prior to your selected time range to calculate the expected load of your database. You can look into the reasons for the higher load events and analyze the evidence behind reduced performance. Cloud SQL also provides recommendations for optimizing your database to improve performance.

To use AI assistance with troubleshooting high database load, go to the Instance Overview page or the Query insights dashboard in the Google Cloud console.

Instance overview page

Troubleshoot high database load with AI assistance in the Instance overview page by using the following steps:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. In the Overview page, from the Chart menu, select a metric for the database. You can select any metric, for example, CPU utilization.
  4. Optional: To select a specific analysis time period, use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.

    You can zoom in to specific sections of the chart where you notice areas of high load that you want to analyze. For example, an area of high load might display CPU utilization levels closer to 100%. To zoom in, you can click and select a portion of the chart.

    Click the Investigate performance button to start troubleshooting high database load with AI assistance from Gemini Cloud Assist.

    After about two minutes, the Investigation details pane opens with the following sections:

    • Issue. A description of the issue being investigated, including the investigation’s start and stop time.
    • Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
    • Hypotheses. A list of AI-recommended actions to take to help address the slow-running query.

Query insights dashboard

Troubleshoot high database load with AI assistance in the Query insights dashboard using the following steps:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. Click Query insights to open the Query insights dashboard.
  4. Optional: Use the Time range filter to select either 1 hour, 6 hours, 1 day, 7 days, 30 days or a custom range.
  5. You can zoom in to specific sections of the chart where you notice areas of higher database load by query execution time. To zoom in, you can click and select a portion of the chart.

    In the Database load chart, click the Investigate performance button to start troubleshooting high database load with AI assistance from Gemini Cloud Assist.

    After about two minutes, the Investigation details pane opens with the following sections:

    • Issue. A description of the issue being investigated, including the investigation’s start and stop time.
    • Observations. A list of observations about the issue. For example, these can include lock contention details, such as a longer than expected lock wait ratio for the query.
    • Hypotheses. A list of AI-recommended actions to take to help address the slow-running query.

Analyze high database load

Using AI assistance, you can analyze and troubleshoot the details of your database load.

Analysis time period

Cloud SQL analyzes your database for the time period that you select in your database load chart from the Query insights dashboard or the Instance overview page. If you select a time period of less than 24 hours, then Cloud SQL analyzes the entire time period. If you select a time period greater than 24 hours, then Cloud SQL selects only the last 24 hours of the time period for analysis.

To calculate the baseline performance analysis of your database, Cloud SQL includes 24 hours of a baseline time period in its analysis time period. If your selected time period occurs on a day other than Monday, then Cloud SQL uses a baseline time period of the 24 hours previous to your selected time period. If your selected time period occurs on a Monday, then Cloud SQL uses a baseline time period of the 7th day previous to your selected time period.
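
The two rules above can be sketched in code. This is one possible reading, not Cloud SQL's implementation. It assumes that the weekday is taken from the start of the analysis window, and that "the 7th day previous" means a 24-hour window that starts exactly seven days before the analysis window starts. The function names are hypothetical.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)


def analysis_window(start: datetime, end: datetime) -> tuple:
    """Use the whole selection, or only its last 24 hours if it is longer."""
    if end - start > DAY:
        start = end - DAY
    return (start, end)


def baseline_window(start: datetime, end: datetime) -> tuple:
    """Return the 24-hour baseline window for the selected time period."""
    a_start, _ = analysis_window(start, end)
    if a_start.weekday() == 0:  # Monday, judged by the analysis window start
        # Assumed reading of "the 7th day previous": same clock time, 7 days back.
        b_start = a_start - timedelta(days=7)
        return (b_start, b_start + DAY)
    # Any other day: the 24 hours immediately before the analysis window.
    return (a_start - DAY, a_start)


if __name__ == "__main__":
    # 2025-03-10 is a Monday; 2025-03-11 is a Tuesday.
    print(baseline_window(datetime(2025, 3, 10, 8), datetime(2025, 3, 10, 12)))
    print(baseline_window(datetime(2025, 3, 11, 8), datetime(2025, 3, 11, 12)))
```

In this reading, a Monday selection is compared with the previous Monday rather than with Sunday.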

Metrics analysis

When Cloud SQL starts the analysis, Cloud SQL checks for significant changes in the various metrics, including but not limited to the following:

  • Queries per second (QPS)
  • CPU
  • Memory
  • Disk I/O

Cloud SQL compares the baseline aggregated data for your database with the performance data of your analysis time window. If Cloud SQL detects a significant change in a key metric, then Cloud SQL indicates a possible situation with your database. The identified situation might explain a root cause for the high load on your database over the selected time period.
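
The idea of comparing baseline metrics with metrics from the analysis window can be sketched as follows. This is an illustration only. Cloud SQL doesn't document the threshold it uses, so the 50% relative change here is an arbitrary assumption, and the metric names and function are hypothetical.

```python
def significant_changes(baseline: dict, current: dict, threshold: float = 0.5) -> dict:
    """Return metrics whose relative change from baseline meets the threshold.

    `threshold` is an assumed value (0.5 means a 50% change), not a documented one.
    """
    flagged = {}
    for metric, base in baseline.items():
        if metric not in current or base == 0:
            continue  # no data to compare, or relative change is undefined
        change = (current[metric] - base) / base
        if abs(change) >= threshold:
            flagged[metric] = change
    return flagged


if __name__ == "__main__":
    baseline = {"qps": 100.0, "cpu_percent": 40.0, "memory_percent": 60.0}
    current = {"qps": 110.0, "cpu_percent": 80.0, "memory_percent": 60.0}
    # CPU doubled relative to baseline; QPS rose only 10%.
    print(significant_changes(baseline, current))
```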

Recommendations

When Gemini Cloud Assist completes the analysis, the Hypotheses section of the Investigation details pane lists actionable insights to help remediate the issue.

For some situations, based on the analysis, there might not be a recommendation.

Examples of system performance prompts

You can also use Gemini Cloud Assist to enter prompts to gather information about your system performance. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.

Prompt: How many error log entries are there for this database instance in the last 7 days?
Type of response: Summary of log entries grouped by their severity type. Gemini scopes the response by the time range filter selected in the instance performance chart.

Prompt: What was the CPU utilization for this database instance around 2 PM today?
Type of response: Metrics results in percentage range for CPU utilization within the time interval.

Troubleshoot connectivity issues

You can start troubleshooting connectivity issues by using Gemini Cloud Assist or by initiating an investigation when connection errors occur. AI assistance evaluates several sources to identify why a client might encounter issues when trying to connect to a Cloud SQL database.

Investigate connectivity issues

To use AI assistance with troubleshooting connectivity issues, do the following:

  1. In the Google Cloud console, go to the Cloud SQL Instances page.

    Go to Cloud SQL Instances

  2. To open the Overview page of an instance, click the instance name.
  3. In the Resolve database issues with AI-assisted troubleshooting pane, click Explore investigations.
  4. In the Investigation options window, look for the Connection usage section.
  5. Optional: Select a specific analysis time period using the Time range filter, either 1 hour, 6 hours, 1 day, 7 days, or a custom range.
  6. Click Investigate.

    Gemini initiates an automated analysis of your instance metadata, logs, and networking configuration. After the analysis is complete, the Investigation details pane displays the following sections:

    • Issue: A summary of the connectivity failure, including affected resources and timestamps.
    • Observations: Evidence gathered from signals, such as a database reaching its max_connections limit or the number of active concurrent connections, cross-referenced with instance metadata. You can use this evidence to determine whether a traffic spike or unclosed sessions might be the cause of instance downtime.
    • Hypotheses: AI-generated root causes and remediation steps.
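
To make the max_connections observation concrete, the following minimal sketch computes how much connection headroom remains. It only illustrates the kind of signal described above. The function names are hypothetical, and the inputs are assumed to come from your own monitoring.

```python
def connection_headroom(active_connections: int, max_connections: int) -> float:
    """Return the fraction of max_connections still available (0.0 to 1.0)."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return max(0.0, 1.0 - active_connections / max_connections)


def at_connection_limit(active_connections: int, max_connections: int) -> bool:
    """Return True when no further connections can be accepted."""
    return active_connections >= max_connections


if __name__ == "__main__":
    # Hypothetical readings: 75 active connections against a limit of 100.
    print(connection_headroom(75, 100), at_connection_limit(75, 100))
```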

Examples of connectivity issue prompts

You can also use Gemini Cloud Assist to troubleshoot connectivity issues between a client and your Cloud SQL instance.

Prompt: Why am I seeing connection errors?
Type of response: Gemini evaluates connections to your database and recommends improvements such as enabling managed connection pooling.

Get index recommendations

You can obtain index recommendations from Cloud SQL in query insights. For more information about obtaining index recommendations, see Use index advisor.

Examples of index recommendation prompts

Use Gemini Cloud Assist to get more information about how to use indexes in your databases. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.

Prompt: Show index recommendations for queries run in the last 7 days.
Type of response: Guidance on the types of queries that can benefit from an index.

Monitor active queries

Use the Query insights dashboard to monitor active queries, and if necessary, terminate long-running processes. For more information, see Monitor active queries.

Examples of active query prompts

Use Gemini Cloud Assist to find out more information about queries that cause high latency or CPU load. Gemini Cloud Assist answers questions for the selected Cloud SQL instance.

Prompt: What are the top queries currently running in my database?
Type of response: Guidance on how to find the longest running and most resource-intensive queries.
