The following sections describe how to import data into and export data from managed tables using BigLake tables for Apache Iceberg in BigQuery (hereafter BigLake Iceberg tables in BigQuery).
Before you begin
Understand the different types of BigLake tables and the implications of using them in the BigLake table overview.
Before importing and exporting BigLake Iceberg tables in BigQuery data, ensure that you have set up a Cloud resource connection to a storage bucket. Your connection needs write permissions on the storage bucket, as specified in the following Required roles section. For more information about required roles and permissions for connections, see Manage connections.
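As a sketch, a Cloud resource connection can be created and inspected with the bq CLI; the connection ID and location below are example values, not fixed names:

```shell
# Create a Cloud resource connection that BigQuery can use to access
# Cloud Storage (connection ID and location are example values).
bq mk --connection \
  --location=us \
  --connection_type=CLOUD_RESOURCE \
  my_cloud_resource_connection

# Show the connection details, including the service account that
# BigQuery created for it. Grant that account access to your bucket.
bq show --connection us.my_cloud_resource_connection
```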
Required roles
To get the permissions that you need to let BigQuery manage tables in your project, ask your administrator to grant you the following IAM roles:
- To create BigLake Iceberg tables in BigQuery:
  - BigQuery Data Owner (`roles/bigquery.dataOwner`) on your project
  - BigQuery Connection Admin (`roles/bigquery.connectionAdmin`) on your project
- Grant the connection service account the following roles so it can read and write data in Cloud Storage:
  - Storage Object User (`roles/storage.objectUser`) on the bucket
  - Storage Legacy Bucket Reader (`roles/storage.legacyBucketReader`) on the bucket
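As a sketch, the bucket-level roles for the connection service account could be granted with gcloud; the bucket name and service account address are placeholders for your own values:

```shell
# Grant the connection's service account read/write access to the bucket.
# Replace the bucket and service account with your own values.
gcloud storage buckets add-iam-policy-binding gs://example-bucket \
  --member="serviceAccount:connection-sa@example-project.iam.gserviceaccount.com" \
  --role="roles/storage.objectUser"

gcloud storage buckets add-iam-policy-binding gs://example-bucket \
  --member="serviceAccount:connection-sa@example-project.iam.gserviceaccount.com" \
  --role="roles/storage.legacyBucketReader"
```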
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to let BigQuery manage tables in your project. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to let BigQuery manage tables in your project:
- `bigquery.connections.delegate` on your project
- `bigquery.jobs.create` on your project
- `bigquery.readsessions.create` on your project
- `bigquery.tables.create` on your project
- `bigquery.tables.get` on your project
- `bigquery.tables.getData` on your project
- `storage.buckets.get` on your bucket
- `storage.objects.create` on your bucket
- `storage.objects.delete` on your bucket
- `storage.objects.get` on your bucket
- `storage.objects.list` on your bucket
You might also be able to get these permissions with custom roles or other predefined roles.
Import data into BigLake Iceberg tables in BigQuery
The following sections describe how to import data from various table formats into BigLake Iceberg tables in BigQuery.
Standard loads from flat files
BigLake Iceberg tables in BigQuery use BigQuery load jobs to load external files into BigLake Iceberg tables in BigQuery. If you have an existing BigLake Iceberg table in BigQuery, follow the bq load CLI guide or the LOAD DATA SQL guide to load external data. After loading the data, new Parquet files are written into the STORAGE_URI/data folder.
If you follow the preceding instructions without an existing BigLake Iceberg table in BigQuery, a standard BigQuery table is created instead.
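If you don't yet have a BigLake Iceberg table in BigQuery, one way to create it is a CREATE TABLE statement that references a Cloud resource connection and a storage URI. The dataset, connection, bucket, and column names below are illustrative:

```sql
CREATE TABLE mydataset.iceberg_table (
  id INT64,
  name STRING
)
WITH CONNECTION `myproject.us.my_connection`
OPTIONS (
  file_format = 'PARQUET',
  table_format = 'ICEBERG',
  storage_uri = 'gs://mybucket/iceberg_table');
```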
See the following for tool-specific examples of batch loads into managed tables:
SQL
```sql
LOAD DATA INTO MANAGED_TABLE_NAME
FROM FILES (
  uris = ['STORAGE_URI'],
  format = 'FILE_FORMAT');
```
Replace the following:
- MANAGED_TABLE_NAME: the name of an existing BigLake Iceberg table in BigQuery.
- STORAGE_URI: a fully qualified Cloud Storage URI or a comma-separated list of URIs. Wildcards are also supported. For example, gs://mybucket/table.
- FILE_FORMAT: the source table format. For supported formats, see the format row of load_option_list.
bq
```shell
bq load \
  --source_format=FILE_FORMAT \
  MANAGED_TABLE_NAME \
  STORAGE_URI
```
Replace the following:
- FILE_FORMAT: the source table format. For supported formats, see the format row of load_option_list.
- MANAGED_TABLE_NAME: the name of an existing BigLake Iceberg table in BigQuery.
- STORAGE_URI: a fully qualified Cloud Storage URI or a comma-separated list of URIs. Wildcards are also supported. For example, gs://mybucket/table.
Standard load from Hive-partitioned files
You can load Hive-partitioned files into BigLake Iceberg tables in BigQuery using standard BigQuery load jobs. For more information, see Loading externally partitioned data.
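For example, a LOAD DATA statement for Hive-partitioned Parquet files might look like the following sketch; the table, bucket, and path names are placeholders:

```sql
LOAD DATA INTO mydataset.iceberg_table
FROM FILES (
  uris = ['gs://mybucket/events/*'],
  format = 'PARQUET',
  hive_partition_uri_prefix = 'gs://mybucket/events');
```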
Load streaming data from Pub/Sub
You can load streaming data into BigLake Iceberg tables in BigQuery by using a Pub/Sub BigQuery subscription.
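As a sketch, a BigQuery subscription can be created with gcloud; the topic, subscription, and table names below are example values:

```shell
# Create a Pub/Sub subscription that writes messages directly into a
# BigQuery table (topic, subscription, and table names are examples).
gcloud pubsub subscriptions create my-bq-subscription \
  --topic=my-topic \
  --bigquery-table=myproject:mydataset.iceberg_table \
  --use-topic-schema
```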
Export data from BigLake Iceberg tables in BigQuery
The following sections describe how to export data from BigLake Iceberg tables in BigQuery into various table formats.
Export data into flat formats
To export a BigLake Iceberg table in BigQuery into a flat format, use the
EXPORT DATA statement
and select a destination format. For more information, see
Exporting data.
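For example, an export to Parquet files in Cloud Storage might look like the following sketch; the bucket path and table name are placeholders:

```sql
EXPORT DATA
OPTIONS (
  uri = 'gs://mybucket/exports/*.parquet',
  format = 'PARQUET')
AS
SELECT * FROM mydataset.iceberg_table;
```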