Where Gemini in BigQuery processes your data

This document helps you understand where Gemini in BigQuery processes your data. This behavior applies to the following Gemini in BigQuery features:

For these features, Gemini processing occurs in the jurisdictional boundaries of the query location, or where the BigQuery dataset is stored. For example, if your BigQuery query location or dataset is in the europe-west1 region, Gemini processing occurs in a location within the EU jurisdictional boundary. This design minimizes data movement and follows data governance best practices. For more information about restrictions on available jurisdictions, see Limitations.

For most Gemini in BigQuery features, the Gemini processing location can be controlled by an administrator by using the Global Default Location setting at the project or organization level. BigQuery users can override this global default location by using the Query Location setting in BigQuery Studio. In cases where a query location setting isn't specified in configuration settings by an administrator or explicitly by the user in the query, Gemini in BigQuery uses the location derived from the query being edited. To learn more about how BigQuery determines query location see Run a query.

Gemini in BigQuery determines the jurisdiction of US or EU based upon these controls. If a jurisdiction cannot be determined, then the global processing location is used based upon the Gemini serving locations.

The following sections explain how you can manage where each Gemini in BigQuery feature processes your data.

SQL editor and data canvas

When you generate code using the SQL editor, or use data canvas to create data analysis workflows, Gemini in BigQuery uses the following logic to determine the processing location:

  • A BigQuery administrator can specify a default organization-level or project-level location. To learn how to specify a default location, see Specify the default organization-level or project-level location.

  • A BigQuery user can specify a query location in BigQuery Studio that overrides the administrator setting. To learn how to specify a default query location setting in BigQuery, see Specify locations.

  • If a dataset's location cannot be determined, or if the user's default query location is unspecified,BigQuery attempts to determine the location of the dataset or query based on the dry run. For example:

    • SQL editor example: If your Gemini request for Transform SQL with Gemini references a dataset in europe-west1, then Gemini processes the data in the EU jurisdictional boundary.
    • Data canvas example: If your data canvas visualizes data from a dataset located in us-east4, any Gemini in BigQuery analyses or suggestions are processed in the US jurisdictional boundaries.

Specify the default organization-level or project-level location

A BigQuery administrator can specify an organization-level or project-level default location where Gemini requests are processed. The default location is cached for the duration of the user's session while they are editing within the current SQL editor tab.

Prerequisite

To the specify the organization-level or project-level default location where data is processed, a BigQuery administrator must first opt in to the BigQuery feature by completing this form and then receive an email confirming that the feature was enabled.

Required roles

To specify a default organization or project location, you must be granted the BigQuery Admin role (roles/bigquery.admin), which includes the bigquery.config.update permission that is required to specify a configuration setting. For more information about granting roles, see Manage access to projects, folders, and organizations.

Set the default location

To set an organization-level or project-level default location, complete the following steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the navigation pane, click Explorer.

  3. Select the organization or project for which you want to specify a default location.

  4. In the BigQuery SQL editor, enter the following statement:

    • Organization-level settings:
      ALTER ORGANIZATION SET OPTIONS(default_location='my-default-region');
    • Project-level settings:
      ALTER PROJECT SET OPTIONS(default_location='my-default-region');

This command sets the value of default_location to my-default-region.

Verify default location for data processing

To verify the default location for data processing of a Gemini in BigQuery-assisted SQL query, follow these steps:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the BigQuery Studio SQL editor, run the following query:

    SELECT
        COALESCE(
            (
                SELECT
                    option_value
                FROM INFORMATION_SCHEMA.PROJECT_OPTIONS
                WHERE option_name = 'default_location'
            ),
            (
                SELECT
                    option_value
                FROM INFORMATION_SCHEMA.ORGANIZATION_OPTIONS
                WHERE option_name = 'default_location'
            ));

The result shows the default_location value set to the value you defined as my-default-region. This query returns the default location of the project if defined. Otherwise, the query returns the default location for the organization. The location where Gemini in BigQuery operations run is not explicitly specified by the user.

BigQuery data insights

To generate insights using BigQuery data insights, you can run data scan operations on selected tables and dataset resources. These scans are created in the same location as the BigQuery dataset resource. Within the US or EU jurisdictions, Gemini in BigQuery processing is restricted to the jurisdiction where the scan runs. Outside of the US and EU jurisdictions, processing runs globally. To learn about where global Gemini global data processing takes place, see Gemini serving locations.

BigQuery data preparation

The location where BigQuery data preparation processes data depends upon which data preparation feature you are using.

  • For standalone data preparation, the Gemini in BigQuery processing location is the location where the BigQuery dataset is located.
  • If you run data preparation as part of Dataform or BigQuery pipelines, then the Gemini in BigQuery data processing location is determined by the Dataform defaultLocation setting, if it's set. The defaultLocation setting also determines the BigQuery job location. This ensures that Gemini in BigQuery processing is done in the same jurisdictional boundaries.
  • If defaultLocation for Dataform or the BigQuery pipeline that contains your data preparation is not set, then the Gemini in BigQuery processing region is determined by using the repository's region setting. A pipeline without a defaultLocation setting specified can run different BigQuery jobs in different locations based on the location of the tables used in pipeline nodes. As a best practice, you should set defaultLocation to ensure a consistent processing location.

Limitations

The following limitations apply when you identify where Gemini in BigQuery processes data:

  • Gemini in BigQuery doesn't provide data residency for individual locations. Data processing can be specified for US and EU supported jurisdictions. Data outside these jurisdictions is processed globally.
  • Gemini in BigQuery jurisdiction processing is only available for Gemini in BigQuery features that are generally available (GA). For a list of Gemini in BigQuery features, see Overview of Gemini in BigQuery.
  • BigQuery Python notebook code assist and the Data Science Agent for Colab Enterprise in BigQuery only support global Gemini processing.

  • Gemini in Cloud Assist chat (GCA) only supports global Gemini processing. You can deny access to the GCA chat panel by removing the cloudaicompanion.instances.completeTask Identity and Access Management (IAM) permission for your users. To learn more about how to create custom roles, see Create and manage custom roles.

What's next