Generate text with the AI.GENERATE function
This tutorial shows you how to generate text from text or multimodal data
by using the
AI.GENERATE function
with a Gemini model.
This tutorial shows you how to complete the following tasks:
- Summarize text content and output results in the function's default format.
- Summarize text content and output structured results.
- Transcribe and translate video content.
- Analyze audio file content.
Costs
In this document, you use the following billable components of Google Cloud:
- BigQuery ML: You incur costs for the data that you process in BigQuery.
- Vertex AI: You incur costs for calls to the Vertex AI model.
To generate a cost estimate based on your projected usage,
use the pricing calculator.
For more information about BigQuery pricing, see BigQuery pricing in the BigQuery documentation.
For more information about Vertex AI generative AI pricing, see the Vertex AI pricing page.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- BigQuery is automatically enabled in new projects.
To activate BigQuery in a pre-existing project, go to
Enable the BigQuery API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
Required roles
To use the AI.GENERATE function, you need the
following Identity and Access Management (IAM) roles:
- Create and use BigQuery datasets and tables: BigQuery Data Editor (roles/bigquery.dataEditor) on your project.
- Create, delegate, and use BigQuery connections: BigQuery Connections Admin (roles/bigquery.connectionsAdmin) on your project.
- Grant permissions to the connection's service account: Project IAM Admin (roles/resourcemanager.projectIamAdmin) on the project that contains the Vertex AI endpoint.
- Create BigQuery jobs: BigQuery Job User (roles/bigquery.jobUser) on your project.
These predefined roles contain the permissions required to perform the tasks in this document. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
- Create a dataset: bigquery.datasets.create
- Create, delegate, and use a connection: bigquery.connections.*
- Set service account permissions: resourcemanager.projects.getIamPolicy and resourcemanager.projects.setIamPolicy
- Query table data: bigquery.tables.getData
You might also be able to get these permissions with custom roles or other predefined roles.
Create a dataset
Create a BigQuery dataset to store your ML model.
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer pane, click your project name.
Click View actions > Create dataset.
On the Create dataset page, do the following:
For Dataset ID, enter bqml_tutorial.
For Location type, select Multi-region, and then select US (multiple regions in United States).
Leave the remaining default settings as they are, and click Create dataset.
bq
To create a new dataset, use the
bq mk command
with the --location flag. For a full list of possible parameters, see the
bq mk --dataset command
reference.
Create a dataset named bqml_tutorial with the data location set to US and a description of BigQuery ML tutorial dataset:

bq --location=US mk -d \
  --description "BigQuery ML tutorial dataset." \
  bqml_tutorial

Instead of using the --dataset flag, the command uses the -d shortcut. If you omit -d and --dataset, the command defaults to creating a dataset.

Confirm that the dataset was created:
bq ls
API
Call the datasets.insert
method with a defined dataset resource.
{ "datasetReference": { "datasetId": "bqml_tutorial" } }
BigQuery DataFrames
Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.
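You can also create the dataset directly in the query editor with a SQL DDL statement. The following is a minimal sketch using BigQuery's CREATE SCHEMA statement, mirroring the US location and description used above; adjust the dataset name and options as needed for your project.

-- Create the bqml_tutorial dataset in the US multi-region.
CREATE SCHEMA IF NOT EXISTS bqml_tutorial
OPTIONS (
  location = 'US',
  description = 'BigQuery ML tutorial dataset.');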
Create a connection
Create a Cloud resource connection and get the connection's service account. Create the connection in the same location as the dataset that you created in the previous step.
Follow these steps to create a connection:
Go to the BigQuery page.
In the Explorer pane, click Add data.
The Add data dialog opens.
In the Filter By pane, in the Data Source Type section, select Business Applications.
Alternatively, in the Search for data sources field, you can enter Vertex AI.
In the Featured data sources section, click Vertex AI.
Click the Vertex AI Models: BigQuery Federation solution card.
In the Connection type list, select Vertex AI remote models, remote functions, BigLake and Spanner (Cloud Resource).
In the Connection ID field, type test_connection.
Click Create connection.
Click Go to connection.
In the Connection info pane, copy the service account ID for use in the next step.
Give the service account access
Grant the connection's service account the Vertex AI User role so that it can call Vertex AI, and the Storage Object Viewer role so that it can read the Cloud Storage objects used later in this tutorial.
To grant the role, follow these steps:
Go to the IAM & Admin page.
Click Add.
The Add principals dialog opens.
In the New principals field, enter the service account ID that you copied earlier.
In the Select a role field, select Vertex AI, and then select Vertex AI User.
Click Add another role.
In the Select a role field, choose Cloud Storage, and then select Storage Object Viewer.
Click Save.
Summarize text and use the default output format
Follow these steps to generate text using the AI.GENERATE function,
and output the results in the AI.GENERATE function's default format:
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following query:
WITH bbc_news AS (
  SELECT body
  FROM `bigquery-public-data.bbc_news.fulltext`
  LIMIT 5
)
SELECT
  AI.GENERATE(body, connection_id => 'us.test_connection') AS news
FROM bbc_news;
The output is similar to the following:
+---------------------------------------------+------------------------------------+---------------+
| news.result                                 | news.full_response                 | news.status   |
+---------------------------------------------+------------------------------------+---------------+
| This article presents a debate about the    | {"candidates":[{"avg_logprobs":    |               |
| "digital divide" between rich and poor      | -0.31465074559841777,"content":    |               |
| nations. Here's a breakdown of the key..    | {"parts":[{"text":"This article..  |               |
+---------------------------------------------+------------------------------------+---------------+
| This article discusses how advanced         | {"candidates":[{"avg_logprobs":    |               |
| mapping technology is aiding humanitarian   | -0.21313422900091983,"content":    |               |
| efforts in Darfur, Sudan. Here's a...       | {"parts":[{"text":"This article..  |               |
+---------------------------------------------+------------------------------------+---------------+
| ...                                         | ...                                | ...           |
+---------------------------------------------+------------------------------------+---------------+
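The prompt doesn't have to be a bare column reference. The following sketch is a hedged variation of the previous query that prepends an instruction to each article by passing a multi-part prompt, the same parenthesized-prompt pattern that the video and audio examples later in this tutorial use; it assumes that form also accepts text-only parts, and the instruction wording is only illustrative.

WITH bbc_news AS (
  SELECT body
  FROM `bigquery-public-data.bbc_news.fulltext`
  LIMIT 5
)
SELECT
  -- Combine a fixed instruction with the article text in a multi-part prompt.
  AI.GENERATE(
    ('Summarize the following news article in two sentences: ', body),
    connection_id => 'us.test_connection') AS news
FROM bbc_news;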
Summarize text and output structured results
Follow these steps to generate text using the AI.GENERATE function, and use
the AI.GENERATE function's output_schema argument to format the output:
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following query:
WITH bbc_news AS (
  SELECT body
  FROM `bigquery-public-data.bbc_news.fulltext`
  LIMIT 5
)
SELECT
  news.good_sentiment,
  news.summary
FROM
  bbc_news,
  UNNEST(ARRAY[AI.GENERATE(
    body,
    connection_id => 'us.test_connection',
    output_schema => 'summary STRING, good_sentiment BOOL')]) AS news;
The output is similar to the following:
+----------------+--------------------------------------------+
| good_sentiment | summary                                    |
+----------------+--------------------------------------------+
| true           | A World Bank report suggests the digital   |
|                | divide is rapidly closing due to increased |
|                | access to technology in developing..       |
+----------------+--------------------------------------------+
| true           | A review of sports games, focusing on the  |
|                | rivalry between EA Sports and ESPN, and    |
|                | the recent deal where EA acquired the..    |
+----------------+--------------------------------------------+
| ...            | ...                                        |
+----------------+--------------------------------------------+
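The AI.GENERATE output also carries a status field, shown empty on success in the default-format example earlier. The following sketch adds it to the structured-output query so you can spot per-row failures; it assumes the structured output keeps the same status field that the later examples drop with EXCEPT (full_response, status).

WITH bbc_news AS (
  SELECT body
  FROM `bigquery-public-data.bbc_news.fulltext`
  LIMIT 5
)
SELECT
  news.good_sentiment,
  news.summary,
  news.status  -- Non-empty when the model call for this row failed.
FROM
  bbc_news,
  UNNEST(ARRAY[AI.GENERATE(
    body,
    connection_id => 'us.test_connection',
    output_schema => 'summary STRING, good_sentiment BOOL')]) AS news;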
Transcribe and translate video content
Follow these steps to create an object table over public video content, and then transcribe and translate a video:
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following query to create the object table:
CREATE OR REPLACE EXTERNAL TABLE `bqml_tutorial.video`
WITH CONNECTION `us.test_connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://cloud-samples-data/generative-ai/video/*']);
In the query editor, run the following query to transcribe and translate the pixel8.mp4 file:

SELECT
  AI.GENERATE(
    (OBJ.GET_ACCESS_URL(ref, 'r'),
     'Transcribe the video in Japanese and then translate to English.'),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.5-flash',
    output_schema => 'japanese_transcript STRING, english_translation STRING'
  ).* EXCEPT (full_response, status)
FROM `bqml_tutorial.video`
WHERE REGEXP_CONTAINS(uri, 'pixel8.mp4');
The output is similar to the following:
+--------------------------------------------+---------------------------------+
| english_translation                        | japanese_transcript             |
+--------------------------------------------+---------------------------------+
| My name is Saeka Shimada. I'm a            | 島田 さえか です 。 東京 で フ  |
| photographer in Tokyo. Tokyo has many      | ォトグラファー を し て い ま   |
| faces. The city at night is totally...     | す 。 東京 に は いろんな 顔 が |
+--------------------------------------------+---------------------------------+
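To process every video in the object table rather than a single file, a minimal variation of the previous query drops the WHERE filter and returns the uri column alongside the generated fields, similar to the audio example in the next section. Expect one model call per object, with corresponding Vertex AI costs.

SELECT
  uri,  -- Identify which video each transcript belongs to.
  AI.GENERATE(
    (OBJ.GET_ACCESS_URL(ref, 'r'),
     'Transcribe the video in Japanese and then translate to English.'),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.5-flash',
    output_schema => 'japanese_transcript STRING, english_translation STRING'
  ).* EXCEPT (full_response, status)
FROM `bqml_tutorial.video`;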
Analyze audio file content
Follow these steps to create an object table over public audio content, and then analyze the content of the audio files.
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following query to create the object table:
CREATE OR REPLACE EXTERNAL TABLE `bqml_tutorial.audio`
WITH CONNECTION `us.test_connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://cloud-samples-data/generative-ai/audio/*']);
In the query editor, run the following query to analyze the audio files:
SELECT
  AI.GENERATE(
    (OBJ.GET_ACCESS_URL(ref, 'r'),
     'Summarize the content of this audio file.'),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.5-flash',
    output_schema => 'topic ARRAY<STRING>, summary STRING'
  ).* EXCEPT (full_response, status),
  uri
FROM `bqml_tutorial.audio`;
The results look similar to the following:
+--------------------------------------------+--------------------+--------------------------------------+
| summary                                    | topic              | uri                                  |
+--------------------------------------------+--------------------+--------------------------------------+
| The audio contains a distinctive 'beep'    | beep sound         | gs://cloud-samples-data/generativ... |
| sound, followed by the characteristic      +--------------------+                                      |
| sound of a large vehicle or bus backing..  | vehicle backing up |                                      |
|                                            +--------------------+                                      |
|                                            | bus                |                                      |
|                                            +--------------------+                                      |
|                                            | alarm              |                                      |
+--------------------------------------------+--------------------+--------------------------------------+
| The speaker introduces themselves          | Introduction       | gs://cloud-samples-data/generativ... |
| as Gemini and expresses their excitement   +--------------------+                                      |
| and readiness to dive into something..     | Readiness          |                                      |
|                                            +--------------------+                                      |
|                                            | Excitement         |                                      |
|                                            +--------------------+                                      |
|                                            | Collaboration      |                                      |
+--------------------------------------------+--------------------+--------------------------------------+
| ...                                        | ...                | ...                                  |
+--------------------------------------------+--------------------+--------------------------------------+
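Because each query that calls AI.GENERATE re-invokes the model, you may want to persist the results instead of regenerating them. The following sketch materializes the audio analysis into a standard table with a CREATE TABLE AS SELECT statement; the audio_summaries table name is only illustrative.

-- Store the generated summaries so later queries read the table
-- instead of calling the model again.
CREATE OR REPLACE TABLE `bqml_tutorial.audio_summaries` AS
SELECT
  uri,
  AI.GENERATE(
    (OBJ.GET_ACCESS_URL(ref, 'r'),
     'Summarize the content of this audio file.'),
    connection_id => 'us.test_connection',
    endpoint => 'gemini-2.5-flash',
    output_schema => 'topic ARRAY<STRING>, summary STRING'
  ).* EXCEPT (full_response, status)
FROM `bqml_tutorial.audio`;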
Clean up
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.