cluster_keys

Usage

view: view_name {
  derived_table: {
    cluster_keys: ["customer_city", "customer_state"]
    ...
  }
}

Hierarchy

View File

view

derived_table

cluster_keys

- or -

Model File

explore

aggregate_table

materialization

cluster_keys

Default Value

None

Accepts

One or more clustered column names

Special Rules

cluster_keys is supported only on specific dialects

Definition

Clustering a partitioned table sorts the data in a partition based on the values in the clustered columns and organizes the clustered columns in optimally sized storage blocks. Clustering can improve the performance and reduce the cost of queries that filter on or aggregate by the clustered columns.

See the Dialect support for cluster_keys section for the list of dialects that support cluster_keys.

To add a clustered column to a persistent derived table (PDT) or an aggregate table, use the cluster_keys parameter and supply the names of the columns you want clustered in the database table.

Examples

Create a customer_order_facts native derived table on a BigQuery database, partitioned on the date column and clustered on the city, age_tier, and gender columns to optimize queries that are filtered or aggregated on those columns:

view: customer_order_facts {
  derived_table: {
    explore_source: order {
      column: customer_id { field: order.customer_id }
      column: date { field: order.order_time }
      column: city { field: users.city}
      column: age_tier { field: users.age_tier }
      column: gender { field: users.gender }
      derived_column: num_orders {
        sql: COUNT(order.customer_id) ;;
      }
    }
    partition_keys: [ "date" ]
    cluster_keys: [ "city", "age_tier", "gender" ]
    datagroup_trigger: daily_datagroup
  }
}

Dialect support for `cluster_keys`

The ability to use cluster_keys depends on the database dialect your Looker connection is using. In the latest release of Looker, the following dialects support cluster_keys:

Dialect	Supported?
Actian Avalanche
Amazon Athena
Amazon Aurora MySQL
Amazon Redshift
Amazon Redshift 2.1+
Amazon Redshift Serverless 2.1+
Apache Druid
Apache Druid 0.13.x - 0.17.x
Apache Druid 0.18+
Apache Hive 2.3+
Apache Hive 3.1.2+
Apache Spark 3+
ClickHouse
Cloudera Impala 3.1+
Cloudera Impala 3.1+ with Native Driver
Cloudera Impala with Native Driver
DataVirtuality
Databricks
Denodo 7
Denodo 8 & 9
Dremio
Dremio 11+
Exasol
Google BigQuery Legacy SQL
Google BigQuery Standard SQL
Google Cloud AlloyDB for PostgreSQL
Google Cloud PostgreSQL
Google Cloud SQL
Google Spanner
Greenplum
HyperSQL
IBM Netezza
MariaDB
Microsoft Azure PostgreSQL
Microsoft Azure SQL Database
Microsoft Azure Synapse Analytics
Microsoft SQL Server 2008+
Microsoft SQL Server 2012+
Microsoft SQL Server 2016
Microsoft SQL Server 2017+
MongoBI
MySQL
MySQL 8.0.12+
Oracle
Oracle ADWC
PostgreSQL 9.5+
PostgreSQL pre-9.5
PrestoDB
PrestoSQL
SAP HANA
SAP HANA 2+
SingleStore
SingleStore 7+
Snowflake
Teradata
Trino
Vector
Vertica