What is Google Cloud Lakehouse?

Google Cloud Lakehouse is a high-performance storage engine designed for building open data lakehouses. By integrating the Apache Iceberg open table format with fully managed, enterprise-grade storage on Google Cloud, it provides a unified interface for advanced analytics and AI.

By decoupling storage from compute, Google Cloud Lakehouse ensures seamless interoperability across analytical and transactional systems. This architecture allows multiple engines—including Apache Spark, Apache Flink, Apache Hive, Trino, and BigQuery—to access a single source of truth, eliminating data duplication and ensuring consistent insights.

Key benefits

  • Serverless architecture: Google Cloud Lakehouse eliminates the need for server or cluster management, reducing operational overhead and automatically scaling based on demand.
  • Unified data management and governance: Integration with Knowledge Catalog ensures central definition and enforcement of governance policies across multiple engines, and enables semantic search, data lineage, and quality checks.
  • Storage extensions: Google Cloud Lakehouse extends Cloud Storage management capabilities to include features such as Autoclass tiering and Customer-managed encryption keys (CMEK).
  • Fully managed experience: When integrated with BigQuery, Google Cloud Lakehouse uses high-throughput streaming and real-time metadata management to provide a fully managed streaming, analytics, and AI experience.
  • High availability and disaster recovery: Google Cloud Lakehouse offers options for cross-region replication and disaster recovery (Preview) to support high availability of your data.

Use cases

  • Open lakehouse: Use Cloud Storage as the storage layer, with Google Cloud Lakehouse providing the management and governance interface for Apache Iceberg data.
  • Analytical and transactional integration: Access analytical Apache Iceberg tables directly within AlloyDB for PostgreSQL (Preview) to combine analytical data with transactional workloads.
  • Unified access: Let different engines (Apache Spark, Apache Flink, BigQuery) interact with the same Apache Iceberg tables with consistent metadata.

Catalog interfaces

The Lakehouse runtime catalog is a single metadata service that provides several interfaces (endpoints) to connect your data across Cloud Storage and BigQuery. For more information, see How Google Cloud Lakehouse works.

  • Apache Iceberg REST catalog endpoint: Provides a standard REST interface for wide compatibility with open-source engines like Apache Spark, Apache Flink, and Trino. This is the recommended interface for new workloads and offers full read/write interoperability.

  • Custom Apache Iceberg catalog for BigQuery endpoint: Enables engines to interoperate directly with the BigQuery catalog. This interface is used primarily for BigQuery managed Apache Iceberg tables and existing workloads transitioning to the Google Cloud Lakehouse architecture.
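As a sketch of how an open-source engine connects to a REST catalog endpoint, the following Spark configuration uses Apache Iceberg's standard REST catalog properties. The catalog name (`lakehouse`), endpoint URI, and warehouse value are placeholders, not actual service values; consult the service's connection documentation for the real endpoint and authentication settings.

```
# spark-defaults.conf (sketch) -- register an Iceberg REST catalog named "lakehouse"
spark.sql.catalog.lakehouse=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lakehouse.type=rest
spark.sql.catalog.lakehouse.uri=https://REST_CATALOG_ENDPOINT/iceberg/v1
spark.sql.catalog.lakehouse.warehouse=WAREHOUSE_IDENTIFIER
```

With this configuration in place, Spark SQL can address tables through the catalog prefix (for example, `SELECT * FROM lakehouse.my_namespace.my_table`), and the same metadata is visible to any other engine pointed at the same endpoint.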

Interfaces and tools

You can interact with Google Cloud Lakehouse resources using the following tools:

  • Google Cloud console: Use the console to create catalogs, view catalog properties, view audit logs, and configure permissions.
  • BigQuery SQL: Use standard SQL DDL (Data Definition Language) to create and manage Apache Iceberg tables and external tables integrated with the Lakehouse runtime catalog.
  • Open source engines: Use engines such as Apache Spark, Apache Flink, and Apache Hive with the Lakehouse runtime catalog to read and write data.
  • Lakehouse runtime catalog API: Use the Apache Iceberg REST catalog endpoint to interact with the service using tools that are compatible with the open Apache Iceberg REST specification.
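To illustrate the BigQuery SQL path, the following DDL shows the general shape of creating a managed Apache Iceberg table in BigQuery. The dataset, connection, and bucket names are placeholders; the exact options available depend on your configuration.

```
-- Sketch: create a BigQuery-managed Apache Iceberg table.
-- my_dataset, my_project.us.my_connection, and my-bucket are placeholders.
CREATE TABLE my_dataset.orders (
  order_id INT64,
  customer_id INT64,
  order_date DATE
)
WITH CONNECTION `my_project.us.my_connection`
OPTIONS (
  file_format = 'PARQUET',
  table_format = 'ICEBERG',
  storage_uri = 'gs://my-bucket/orders'
);
```

Once created, the table's data and metadata live in Cloud Storage in the open Iceberg format, so the same table can be read by the open-source engines listed above.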

What's next