Where Gemini in BigQuery processes your data
This document helps you understand where Gemini in BigQuery processes your data. This behavior applies to the following Gemini in BigQuery features:
For these features, Gemini processing occurs in the
jurisdictional boundaries of the query location, or where the
BigQuery dataset is stored. For example, if your
BigQuery query location or dataset is in the europe-west1
region, Gemini processing occurs in a location within the EU
jurisdictional boundary. This design minimizes data movement and follows data
governance best practices. For more information about restrictions on available
jurisdictions, see Limitations.
For most Gemini in BigQuery features, an administrator can control the Gemini processing location by using the Global Default Location setting at the project or organization level. BigQuery users can override this global default location by using the Query Location setting in BigQuery Studio. If neither an administrator (in configuration settings) nor the user (explicitly in the query) specifies a query location, Gemini in BigQuery uses the location derived from the query being edited. To learn more about how BigQuery determines query location, see Run a query.
Gemini in BigQuery determines the jurisdiction of
US or EU based upon these controls. If a jurisdiction cannot be determined,
then the global processing location is used based upon the
Gemini serving locations.
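The mapping from a location to a jurisdiction can be pictured with a small sketch. This is only an illustrative model, not a BigQuery API: the function name `processing_jurisdiction` and the lookup table are hypothetical, and the table lists only the US and EU multi-regions plus the two regions this document uses as examples. A real mapping would have to cover every supported location.

```python
from typing import Optional

# Hypothetical, deliberately incomplete lookup table (an assumption for
# illustration). It contains only the US and EU multi-regions and the two
# regions named as examples in this document.
JURISDICTION_BY_LOCATION = {
    "us": "US",
    "us-east4": "US",
    "eu": "EU",
    "europe-west1": "EU",
}


def processing_jurisdiction(location: Optional[str]) -> str:
    """Return "US", "EU", or "global" for a BigQuery location name.

    Any location that is missing or not in the table falls back to
    "global", mirroring the rule that global processing is used when a
    jurisdiction cannot be determined.
    """
    if location is None:
        return "global"
    return JURISDICTION_BY_LOCATION.get(location.strip().lower(), "global")


if __name__ == "__main__":
    for name in ("europe-west1", "us-east4", "asia-northeast1", None):
        print(name, "->", processing_jurisdiction(name))
```

The explicit table is used instead of a name-prefix rule because a region's name alone is not assumed here to determine its jurisdiction.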
The following sections explain how you can manage where each Gemini in BigQuery feature processes your data.
SQL editor and data canvas
When you generate code using the SQL editor, or use data canvas to create data analysis workflows, Gemini in BigQuery uses the following logic to determine the processing location:
A BigQuery administrator can specify a default organization-level or project-level location. To learn how to specify a default location, see Specify the default organization-level or project-level location.
A BigQuery user can specify a query location in BigQuery Studio that overrides the administrator setting. To learn how to specify a default query location setting in BigQuery, see Specify locations.
If a dataset's location cannot be determined, or if the user's default query location is unspecified, BigQuery attempts to determine the location of the dataset or query based on the dry run. For example:
- SQL editor example: If your Gemini request for Transform SQL with Gemini references a dataset in europe-west1, then Gemini processes the data in the EU jurisdictional boundary.
- Data canvas example: If your data canvas visualizes data from a dataset located in us-east4, any Gemini in BigQuery analyses or suggestions are processed in the US jurisdictional boundary.
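The order of precedence described in this section can be sketched as a simple fallback chain. This is a minimal model under stated assumptions, not how BigQuery is implemented: the function `resolve_query_location` and its parameter names are hypothetical, and each argument stands for a value that may or may not have been set.

```python
from typing import Optional


def resolve_query_location(
    user_query_location: Optional[str],
    admin_default_location: Optional[str],
    dry_run_location: Optional[str],
) -> Optional[str]:
    """Pick a location using the precedence described in this section.

    1. A query location the user set in BigQuery Studio.
    2. The administrator's organization-level or project-level default.
    3. A location derived from a dry run of the query being edited.

    Returns None when no location can be determined.
    """
    for candidate in (
        user_query_location,
        admin_default_location,
        dry_run_location,
    ):
        if candidate:  # skip both None and empty strings
            return candidate
    return None


if __name__ == "__main__":
    # The user setting overrides the administrator default.
    print(resolve_query_location("us-east4", "europe-west1", None))
    # With nothing configured, the dry-run location is used.
    print(resolve_query_location(None, None, "europe-west1"))
```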
Specify the default organization-level or project-level location
A BigQuery administrator can specify an organization-level or project-level default location where Gemini requests are processed. The default location is cached for the duration of the user's session while they are editing within the current SQL editor tab.
Prerequisite
To specify the organization-level or project-level default location where data is processed, a BigQuery administrator must first opt in to the BigQuery feature by completing this form and then receive an email confirming that the feature was enabled.
Required roles
To specify a default organization or project location, you must be granted the
BigQuery Admin role
(roles/bigquery.admin), which includes the bigquery.config.update permission
that is required to specify a configuration setting. For more information about
granting roles, see Manage access to projects, folders, and
organizations.
Set the default location
To set an organization-level or project-level default location, complete the following steps:
In the Google Cloud console, go to the BigQuery page.
In the navigation pane, click Explorer.
Select the organization or project for which you want to specify a default location.
In the BigQuery SQL editor, enter the following statement:
- Organization-level settings:
ALTER ORGANIZATION SET OPTIONS(default_location='my-default-region');
- Project-level settings:
ALTER PROJECT SET OPTIONS(default_location='my-default-region');
This command sets the value of default_location to my-default-region.
Verify default location for data processing
To verify the default location for data processing of a Gemini in BigQuery-assisted SQL query, follow these steps:
In the Google Cloud console, go to the BigQuery page.
In the BigQuery Studio SQL editor, run the following query:
SELECT
  COALESCE(
    (
      SELECT option_value
      FROM INFORMATION_SCHEMA.PROJECT_OPTIONS
      WHERE option_name = 'default_location'
    ),
    (
      SELECT option_value
      FROM INFORMATION_SCHEMA.ORGANIZATION_OPTIONS
      WHERE option_name = 'default_location'
    )
  );
The result shows the default_location value set to the value you defined as
my-default-region. This query returns the default location of the project if
defined. Otherwise, the query returns the default location for the organization.
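The fallback that the verification query performs can be mirrored in a few lines. This is a sketch only: the two dictionaries are hypothetical stand-ins for the project-level and organization-level option values, and `effective_default_location` is an illustrative name rather than part of any BigQuery client library.

```python
from typing import Mapping, Optional


def effective_default_location(
    project_options: Mapping[str, str],
    organization_options: Mapping[str, str],
) -> Optional[str]:
    """Return the project default if defined, else the organization default.

    This follows COALESCE semantics: the first value that is not missing
    wins, and None is returned when neither level defines the option.
    """
    project_value = project_options.get("default_location")
    if project_value is not None:
        return project_value
    return organization_options.get("default_location")


if __name__ == "__main__":
    # Hypothetical option values, used only to exercise the fallback.
    org = {"default_location": "EU"}
    print(effective_default_location({"default_location": "US"}, org))
    print(effective_default_location({}, org))
```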
The location where Gemini in BigQuery operations
run is not explicitly specified by the user.
BigQuery data insights
To generate insights using BigQuery data
insights, you can run data scan operations on
selected tables and dataset resources. These scans are created in the same
location as the BigQuery dataset resource. Within the US or
EU jurisdictions, Gemini in BigQuery processing
is restricted to the jurisdiction where the scan runs. Outside of the US and EU
jurisdictions, processing runs globally. To learn where global
Gemini data processing takes place, see
Gemini serving locations.
BigQuery data preparation
The location where BigQuery data preparation processes data depends upon which data preparation feature you are using.
- For standalone data preparation, the Gemini in BigQuery processing location is the location where the BigQuery dataset is located.
- If you run data preparation as part of Dataform or BigQuery pipelines, then the Gemini in BigQuery data processing location is determined by the Dataform defaultLocation setting, if it's set. The defaultLocation setting also determines the BigQuery job location. This ensures that Gemini in BigQuery processing is done in the same jurisdictional boundaries.
- If defaultLocation for Dataform or the BigQuery pipeline that contains your data preparation is not set, then the Gemini in BigQuery processing region is determined by using the repository's region setting. A pipeline without a defaultLocation setting specified can run different BigQuery jobs in different locations based on the location of the tables used in pipeline nodes. As a best practice, you should set defaultLocation to ensure a consistent processing location.
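The rules in this list can be summarized as a short decision function. This is a hedged sketch, not product code: `data_preparation_location` and its parameters are hypothetical names that stand for the dataset location, the Dataform defaultLocation setting, and the repository's region setting described in the list.

```python
from typing import Optional


def data_preparation_location(
    *,
    in_pipeline: bool,
    dataset_location: str,
    default_location: Optional[str] = None,
    repository_region: Optional[str] = None,
) -> Optional[str]:
    """Model where Gemini processing happens for data preparation.

    - Standalone data preparation: the dataset's location.
    - Inside Dataform or a BigQuery pipeline: defaultLocation if set,
      otherwise the repository's region setting.
    """
    if not in_pipeline:
        return dataset_location
    if default_location:
        return default_location
    return repository_region


if __name__ == "__main__":
    # Standalone preparation follows the dataset.
    print(data_preparation_location(
        in_pipeline=False, dataset_location="europe-west1"))
    # In a pipeline without defaultLocation, the repository region is used.
    print(data_preparation_location(
        in_pipeline=True,
        dataset_location="europe-west1",
        repository_region="us-east4",
    ))
```

The sketch also makes the best practice visible: when `default_location` is set, the result no longer depends on the repository's region.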
Limitations
The following limitations apply when you identify where Gemini in BigQuery processes data:
- Gemini in BigQuery doesn't provide data residency for individual locations. Data processing can be specified for US and EU supported jurisdictions. Data outside these jurisdictions is processed globally.
- Gemini in BigQuery jurisdiction processing is only available for Gemini in BigQuery features that are generally available (GA). For a list of Gemini in BigQuery features, see Overview of Gemini in BigQuery.
BigQuery Python notebook code assist and the Data Science Agent for Colab Enterprise in BigQuery only support global Gemini processing.
Gemini in Cloud Assist chat (GCA) only supports global Gemini processing. You can deny access to the GCA chat panel by removing the cloudaicompanion.instances.completeTask Identity and Access Management (IAM) permission for your users. To learn more about how to create custom roles, see Create and manage custom roles.
What's next
- Read the Gemini in BigQuery overview.
- Learn how to set up Gemini in BigQuery.
- Learn how to write queries with Gemini assistance.
- Learn more about Google Cloud compliance.
- Learn about security, privacy, and compliance for Gemini in BigQuery.
- Learn more about how Gemini for Google Cloud uses your data.