Lakehouse table overview

Lakehouse for Apache Iceberg supports multiple table types, offering different levels of management, performance, and interoperability for your lakehouse on Google Cloud. Based on where your data originates, which engines need to write to it, and how much control you need, you can choose a table type supported by either the Lakehouse runtime catalog or BigQuery.

Supported by the Lakehouse runtime catalog

Recommended

The Lakehouse runtime catalog supports Apache Iceberg tables.

  • Apache Iceberg tables: These are Apache Iceberg tables that you create from open source engines and store in Cloud Storage. The Lakehouse runtime catalog manages these tables through its Iceberg REST endpoint, and you can also access them from BigQuery or other Iceberg-compatible engines. This option is best if you want your ETL workflow to be managed by open source engines.

    The Lakehouse runtime catalog Iceberg REST endpoint provides a standard REST interface for wide compatibility with open-source engines like Apache Spark, Apache Flink, and Trino.
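As a rough illustration of how an open source engine points at an Iceberg REST endpoint, the following sketch builds the Spark properties that Apache Iceberg uses for a REST catalog. The property keys are the standard Iceberg Spark catalog settings. The catalog name, endpoint URI, and warehouse path are placeholders (assumptions, not values from this page), and authentication settings are omitted.

```python
def iceberg_rest_catalog_conf(catalog_name: str, rest_uri: str, warehouse: str) -> dict[str, str]:
    """Return Spark properties that register an Iceberg catalog backed by a REST endpoint."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Iceberg's Spark catalog implementation
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Use the REST catalog client instead of Hive, Hadoop, and so on
        f"{prefix}.type": "rest",
        # Placeholder: replace with the actual Iceberg REST endpoint URI
        f"{prefix}.uri": rest_uri,
        # Placeholder: Cloud Storage location that holds the table data
        f"{prefix}.warehouse": warehouse,
    }


conf = iceberg_rest_catalog_conf(
    catalog_name="lakehouse",                        # hypothetical catalog name
    rest_uri="https://REST_ENDPOINT_PLACEHOLDER",    # hypothetical endpoint
    warehouse="gs://BUCKET_PLACEHOLDER/warehouse",   # hypothetical bucket path
)

# Print the properties as spark-submit style flags.
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same key and value pairs can be passed to `SparkSession.builder.config(...)` when the Iceberg Spark runtime is on the classpath. Flink and Trino take equivalent REST catalog settings in their own configuration formats.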

Key features of these Apache Iceberg tables include:

  • Metastore: Lakehouse runtime catalog.
  • Storage: Cloud Storage.
  • Storage optimization: Managed by you or a third party.
  • Read and write access:
    • Open source engines: Read and write.
    • BigQuery: Read only.
  • Use cases: Open lakehouse with high-performance, enterprise-grade storage for advanced analytics, streaming, and AI.

Supported by BigQuery

BigQuery supports Apache Iceberg tables, native tables, and external tables.

  • Apache Iceberg tables: These are Apache Iceberg tables that you create and manage from BigQuery and store in Cloud Storage. While they can be read by open source engines, BigQuery is the engine that manages the metadata and writes to them. This option is best if you want your workflow to be fully managed by BigQuery.

  • Native tables: These are native BigQuery tables. They are fully managed and offer the most advanced analytics and management features. This option is best for non-Iceberg workloads.

  • External tables: These tables are BigQuery-specific constructs for data stored in Cloud Storage, Amazon S3, or Azure Blob Storage. You manage the data and metadata yourself, and BigQuery has read-only access. Choose this option for data that you want to manage in a third-party catalog or directly in storage.
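The "best if" guidance in the two lists above can be summarized as a small decision helper. This is only a sketch of that guidance: the enum, function name, and returned labels are hypothetical and are not part of any Google Cloud API.

```python
from enum import Enum


class ManagedBy(Enum):
    """Who manages the data and the workflow (illustrative categories)."""
    OPEN_SOURCE_ENGINES = "open source engines"
    BIGQUERY = "BigQuery"
    THIRD_PARTY = "third-party catalog or direct storage management"


def suggest_table_type(managed_by: ManagedBy, iceberg_workload: bool = True) -> str:
    """Map the guidance above to a table type label."""
    if managed_by is ManagedBy.OPEN_SOURCE_ENGINES:
        # ETL workflow managed by open source engines
        # (iceberg_workload is not consulted for this case)
        return "Apache Iceberg table (Lakehouse runtime catalog)"
    if managed_by is ManagedBy.THIRD_PARTY:
        # Data managed in a third-party catalog or directly in storage
        return "External table"
    # Workflow fully managed by BigQuery: Iceberg or non-Iceberg workload
    if iceberg_workload:
        return "Apache Iceberg table (BigQuery)"
    return "Native BigQuery table"


print(suggest_table_type(ManagedBy.OPEN_SOURCE_ENGINES))
print(suggest_table_type(ManagedBy.BIGQUERY, iceberg_workload=False))
```

Real decisions usually weigh more factors, such as storage optimization and cross-engine write access, so treat the helper as a starting point rather than a rule.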

Use the following chart to compare table types:

| | Apache Iceberg tables (Lakehouse runtime catalog) | Apache Iceberg tables (BigQuery) | External tables | Standard BigQuery tables |
| --- | --- | --- | --- | --- |
| Metastore | Lakehouse runtime catalog | BigQuery | External or self-hosted metastore | BigQuery |
| Storage | Cloud Storage | Cloud Storage | Cloud Storage / Amazon S3 / Azure Blob Storage | BigQuery |
| Storage optimization | Customer or third-party managed | Google managed | Customer or third-party managed | Google managed |
| Read / Write | Open source engines (read/write); BigQuery (read only) | Open source engines (read only with Iceberg libraries, read/write interoperability with BigQuery Storage API); BigQuery (read/write) | Open source engines (read/write); BigQuery (read only) | Open source engines (read/write interoperability with BigQuery Storage API); BigQuery (read/write) |
| Use cases | Open lakehouse | Open lakehouse with high-performance, enterprise-grade storage for advanced analytics, streaming, and AI | Staging tables for BigQuery loads, legacy query-only tables | Enterprise-grade storage for advanced analytics, streaming, and AI |
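The read and write access in the comparison above can also be encoded as a small lookup, which is handy when a pipeline needs to check whether a given engine may write to a table type. The dictionary layout, key names, and helper function below are illustrative assumptions. The access values are copied from the comparison.

```python
# Access per table type, taken from the comparison above.
ACCESS = {
    "Apache Iceberg tables (Lakehouse runtime catalog)": {
        "open source engines": "read/write",
        "BigQuery": "read only",
    },
    "Apache Iceberg tables (BigQuery)": {
        "open source engines": (
            "read only with Iceberg libraries; "
            "read/write interoperability with BigQuery Storage API"
        ),
        "BigQuery": "read/write",
    },
    "External tables": {
        "open source engines": "read/write",
        "BigQuery": "read only",
    },
    "Standard BigQuery tables": {
        "open source engines": "read/write interoperability with BigQuery Storage API",
        "BigQuery": "read/write",
    },
}


def bigquery_writable_types() -> list[str]:
    """Return the table types that BigQuery itself can write to."""
    return [
        table_type
        for table_type, access in ACCESS.items()
        if access["BigQuery"] == "read/write"
    ]


for table_type in bigquery_writable_types():
    print(table_type)
```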