For a conceptual overview of the AlloyDB columnar engine, see AlloyDB Omni columnar engine overview.
Enable the columnar engine
To use columnar engine on an instance, set the instance's
google_columnar_engine.enabled
flag to on.
Kubernetes
To set the google_columnar_engine.enabled flag to on, modify your database cluster manifest to add the parameters attribute to the primarySpec section:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
parameters:
google_columnar_engine.enabled: "on"
Replace CLUSTER_NAME with the name of your database cluster. It is the same database cluster name you declared when you created it.
Configure the size of the column store
While the columnar engine is enabled on an instance, AlloyDB Omni allocates a portion of the instance's memory to store its columnar data. Dedicating high-speed RAM to your column store verifies that AlloyDB Omni can access the columnar data as rapidly as possible.
Memory and storage cache together represent the overall capacity of the columnar engine.
Configure memory
You can set the allocation to a fixed size using the
google_columnar_engine.memory_size_in_mb flag.
Kubernetes
To set the google_columnar_engine.memory_size_in_mb flag, modify your database cluster manifest to add the parameters attribute to the primarySpec section:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
parameters:
google_columnar_engine.memory_size_in_mb: "COLUMN_MEMORY_SIZE"
Replace the following:
CLUSTER_NAME: the name of your database cluster. It is the same database cluster name you declared when you created it.COLUMN_MEMORY_SIZE: the new size of the column storage, in megabytes—for example,256.
Configure storage cache
You can configure the columnar engine storage cache on either shared or dedicated devices.
Kubernetes
Shared devices
To enable storage cache for your database on shared devices, modify your database cluster manifest to add the columnarSpillToDisk attribute to the features section of the primarySpec section:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
features:
columnarSpillToDisk:
cacheSize: STORAGE_CACHE_SIZE
ultraFastCache:
cacheSize: ULTRAFAST_CACHE_SIZE
genericVolume:
storageClass: "STORAGE_CLASS_NAME"
...
Replace the following:
CLUSTER_NAME: the name of your database cluster. It is the same database cluster name you declared when you created it.STORAGE_CACHE_SIZE: the size of the columnar storage cache—for example,5Gi. If you don't specify a value for this field, 5% of the disk cache gets allocated to the columnar engine by default.ULTRAFAST_CACHE_SIZE: the size of the cache—for example,100Gi. It must be greater thanshared_buffers. This field is optional. If you don't specify the value of this field, AlloyDB Omni uses all space left on the disk, which applies to both AlloyDB Omni in a container and on a Kubernetes cluster. For more information about units of measure, see Memory resource units.STORAGE_CLASS_NAME: the name of the storage class of the ultra fast cache volume—for example,local-storage.
Dedicated devices
By default, the columnar engine's storage cache shares the same devices as the AlloyDB Omni disk cache. However, you can configure the columnar engine to use its own dedicated devices for its storage cache because of the following:
- You don't need a disk cache because your primary storage is already on high-performance SSDs. In this scenario, you can create a columnar engine storage cache without having to allocate any space for a disk cache.
- You want to use different storage media for your disk cache and your columnar engine cache. For example, you might want to use a standard SSD for your disk cache and a higher-performance NVMe SSD for your columnar engine cache.
Use a generic volume
To configure a dedicated device for the columnar engine storage cache, you modify your DBCluster manifest to add the columnarSpillToDisk attribute to the features section. Within columnarSpillToDisk, you can then specify a genericVolume with a storageClass that points to the storage you want to use for the dedicated columnar engine cache.
An example of how to configure a 50Gi dedicated device for the columnar engine storage cache using a storage class named local-ssd is as follows:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
features:
columnarSpillToDisk:
cacheSize: STORAGE_CACHE_SIZE
genericVolume:
storageClass: "STORAGE_CLASS_NAME"
Replace the following:
CLUSTER_NAME: the name of your database cluster. It is the same database cluster name you declared when you created it.STORAGE_CACHE_SIZE: the size of the columnar storage cache—for example,50Gi. If you don't specify a value for this field, 5% of the disk cache gets allocated to the columnar engine by default.STORAGE_CLASS_NAME: the name of the storage class for the dedicated columnar engine cache volume—for example,local-ssd.
Use an ephemeral volume
You can configure the columnar engine to use an ephemeral emptyDir volume for its storage cache. An emptyDir volume is created when a Pod is assigned to a node, and it exists as long as that Pod runs on that node. When a Pod is removed from a node, the data in the emptyDir is deleted permanently.
To configure an emptyDir volume for the columnar storage cache, modify your DBCluster manifest to add the emptyDir attribute to the columnarSpillToDisk section.
Here is an example of how to configure an ephemeral volume for the columnar storage cache:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
features:
columnarSpillToDisk:
cacheSize: 50Gi
emptyDir: {}
Replace the following:
CLUSTER_NAME: the name of your database cluster. It is the same database cluster name you declared when you created it.STORAGE_CACHE_SIZE: the size of the columnar storage cache—for example,50Gi.
Enable vectorized join
The columnar engine has a vectorized join feature that can improve the performance of joins by applying vectorized processing to qualifying queries.
After you enable vectorized join, the AlloyDB query planner has the option to apply the vectorized join operator instead of the standard PostgreSQL hash join operator. The planner makes this decision by comparing the cost of executing the query using each method.
To enable vectorized join on an instance, set the instance's
google_columnar_engine.enable_vectorized_join
flag to on.
To set this flag on an instance, run the ALTER SYSTEM PostgreSQL command:
ALTER SYSTEM SET google_columnar_engine.enable_vectorized_join = 'on';
AlloyDB Omni allocates one thread to the vectorized join
feature by default. You can increase the number of threads available to
this feature by setting the
google_columnar_engine.vectorized_join_threads
flag to a larger value. The maximum value is cpu_count * 2.
Manually refresh your columnar engine
By default, when the columnar engine is enabled it refreshes the column store in the background.
To manually refresh the column engine, run the following SQL query:
SELECT google_columnar_engine_refresh(relation =>'TABLE_NAME');
Replace TABLE_NAME with the name of the table or the materialized view you
want to manually refresh.
Disable the columnar engine
To disable the columbar engine on an instance, set the google_columnar_engine.enabled
flag to off.
Kubernetes
To set the google_columnar_engine.enabled flag to off, modify your database cluster manifest to add the parameters attribute to the primarySpec section:
apiVersion: alloydbomni.dbadmin.goog/v1
kind: DBCluster
metadata:
name: CLUSTER_NAME
spec:
databaseVersion: "17.5.0"
primarySpec:
parameters:
google_columnar_engine.enabled: "off"
Replace CLUSTER_NAME with the name of your database cluster. It is the same database cluster name you declared when you created it.
What's next
Work through the Accelerating analytical queries with columnar engine in AlloyDB Omni Google CodeLab tutorial.