This document describes the different table formats available when building a lakehouse on Google Cloud and helps you choose the right one for your needs.
Google Cloud Lakehouse offers several table formats with different levels of management, performance, and interoperability. The right choice depends on where your data originates, which engines you use to write and transform it, and how much control you need over storage and metadata.
Table formats
The available table formats are grouped below by the catalog that manages them:
Lakehouse runtime catalog tables
Recommended
The Lakehouse runtime catalog manages Apache Iceberg tables and keeps them compatible with open source engines.
- Lakehouse Iceberg REST catalog tables: These are Apache Iceberg tables that you create from open source engines and store in Cloud Storage. They offer open compatibility and read/write interoperability between BigQuery and Iceberg-compatible engines. This option is best if you want open source engines to manage your ETL workflow.
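As a rough illustration of how an open source engine could be pointed at an Iceberg REST catalog, the following sketch builds the Spark properties that Apache Iceberg's Spark integration uses to register a REST catalog. The catalog name, endpoint URI, and warehouse path are placeholders (assumptions, not values taken from this document), and any authentication settings that your environment requires are omitted.

```python
def iceberg_rest_catalog_conf(catalog_name: str, rest_uri: str, warehouse: str) -> dict:
    """Return Spark properties that register an Iceberg REST catalog.

    The property keys follow Apache Iceberg's Spark catalog configuration.
    All argument values are supplied by the caller.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",  # Iceberg's Spark catalog class
        f"{prefix}.type": "rest",                         # use the REST catalog protocol
        f"{prefix}.uri": rest_uri,                        # REST catalog endpoint
        f"{prefix}.warehouse": warehouse,                 # where table data is stored
    }


# Placeholder values (assumptions); replace them with your own settings.
conf = iceberg_rest_catalog_conf(
    catalog_name="lakehouse",
    rest_uri="https://REST_CATALOG_ENDPOINT",
    warehouse="gs://BUCKET_NAME/warehouse",
)
for key, value in conf.items():
    print(f"{key}={value}")
```

Each printed pair could be passed to Spark as a `--conf` option or set on a session builder before running your ETL jobs.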
BigQuery catalog tables
The BigQuery catalog manages native tables, Apache Iceberg tables, and external tables.
- Apache Iceberg tables: These are Apache Iceberg tables that you create and manage from BigQuery and store in Cloud Storage. Open source engines can read them, but BigQuery is the engine that manages the metadata and writes to them. This option is best if you want your workflow to be fully managed by BigQuery.
- Native tables: These are standard BigQuery tables, with data stored in BigQuery storage. They are fully managed and offer the most advanced analytics and management features. This option is best for non-Iceberg workloads.
- External tables: These tables are BigQuery-specific constructs for data stored in Cloud Storage, Amazon S3, or Azure Blob Storage. You manage the data and metadata yourself, and BigQuery has read-only access. Choose this option for data that you want to manage in a third-party catalog or directly in storage.
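To make the BigQuery-managed Iceberg option more concrete, here is a minimal sketch that assembles a `CREATE TABLE` statement for such a table. It assumes the `WITH CONNECTION ... OPTIONS (file_format, table_format, storage_uri)` form of BigQuery DDL; the helper name and every identifier passed to it are hypothetical, so check the BigQuery DDL reference before relying on it.

```python
def iceberg_table_ddl(table_id: str, columns: dict, connection_id: str, storage_uri: str) -> str:
    """Build a CREATE TABLE statement for an Iceberg table managed by BigQuery.

    `columns` maps column names to BigQuery SQL types. This helper only
    formats text; it does not validate identifiers or run the statement.
    """
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE TABLE `{table_id}` ({column_sql})\n"
        f"WITH CONNECTION `{connection_id}`\n"
        "OPTIONS (\n"
        "  file_format = 'PARQUET',\n"
        "  table_format = 'ICEBERG',\n"
        f"  storage_uri = '{storage_uri}'\n"
        ")"
    )


# Hypothetical identifiers (assumptions) used only for illustration.
ddl = iceberg_table_ddl(
    table_id="my_project.my_dataset.orders",
    columns={"order_id": "INT64", "created_at": "TIMESTAMP"},
    connection_id="my_project.us.my_connection",
    storage_uri="gs://BUCKET_NAME/orders",
)
print(ddl)
```

The resulting string could then be submitted with the `google-cloud-bigquery` client, for example `bigquery.Client().query(ddl).result()`, assuming the connection and bucket already exist.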
Use the following table to compare your table format options:
| | External tables | Lakehouse Iceberg REST catalog tables | Apache Iceberg tables | Standard BigQuery tables |
|---|---|---|---|---|
| Metastore | External or self-hosted metastore | Lakehouse runtime catalog | BigQuery catalog | BigQuery catalog |
| Storage | Cloud Storage / Amazon S3 / Azure | Cloud Storage | Cloud Storage | BigQuery |
| Storage optimization | Customer or third-party managed | Customer or third-party managed | Google managed | Google managed |
| Read / Write | Open source engines (read/write); BigQuery (read only) | Open source engines (read/write); BigQuery (read only) | Open source engines (read only with Iceberg libraries, read/write interoperability with BigQuery Storage API); BigQuery (read/write) | Open source engines (read/write interoperability with BigQuery Storage API); BigQuery (read/write) |
| Use cases | Staging tables for BigQuery loads, legacy query-only tables | Open lakehouse | Open lakehouse with high-performance, enterprise-grade storage for advanced analytics, streaming, and AI | Enterprise-grade storage for advanced analytics, streaming, and AI |
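The guidance in this document can be summarized as a small decision helper. The sketch below is one possible reading of that guidance, not an official selection algorithm: the function name, its three yes/no questions, and the order in which they are checked are all assumptions made for illustration.

```python
from enum import Enum


class TableFormat(Enum):
    LAKEHOUSE_ICEBERG_REST = "Lakehouse Iceberg REST catalog table"
    BIGQUERY_ICEBERG = "Apache Iceberg table (BigQuery catalog)"
    NATIVE = "Native BigQuery table"
    EXTERNAL = "External table"


def suggest_table_format(
    *,
    self_managed_metadata: bool,
    needs_iceberg: bool,
    writes_from_open_source_engines: bool,
) -> TableFormat:
    """Suggest a table format from three simplified yes/no questions."""
    # Data kept in a third-party catalog or managed directly in storage.
    if self_managed_metadata:
        return TableFormat.EXTERNAL
    # Non-Iceberg workloads are best served by native tables.
    if not needs_iceberg:
        return TableFormat.NATIVE
    # Iceberg tables written by open source engines.
    if writes_from_open_source_engines:
        return TableFormat.LAKEHOUSE_ICEBERG_REST
    # Iceberg tables whose metadata and writes are managed by BigQuery.
    return TableFormat.BIGQUERY_ICEBERG


choice = suggest_table_format(
    self_managed_metadata=False,
    needs_iceberg=True,
    writes_from_open_source_engines=True,
)
print(choice.value)
```

Real workloads often involve trade-offs that three booleans cannot capture, such as storage optimization or streaming requirements, so treat the result as a starting point for the comparison above.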