Graph schema best practices

This document provides best practices for defining your graph schema to improve your graph query performance.

Scope your property definitions

Properties are key-value pairs that attach additional information to nodes or edges. We recommend that you include only the properties that your queries need. Avoid both the PROPERTIES ALL COLUMNS syntax and the default syntax, each of which attaches every column from the node or edge table to the property list. Exposing many properties on nodes or edges might cause unnecessary column scans in graph queries, which degrades performance.

To restrict the properties that you include in a node or edge definition, list them explicitly with the PROPERTIES keyword in the corresponding node or edge table definition of your CREATE PROPERTY GRAPH statement.

The following node table definition restricts the properties for the Person node table to id and name:

NODE TABLES (
  graph_db.Person PROPERTIES (id, name)
)
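As a fuller sketch of the same idea, the following statement scopes the properties of two node tables and one edge table. The graph name FinGraph, the Owns label, and the chosen property lists are illustrative assumptions. The sketch also assumes that the Person, Account, and PersonOwnAccount tables exist in the graph_db dataset with these columns, and that their keys are declared on the tables themselves.

```sql
-- Hypothetical graph name (FinGraph) and label (Owns), for illustration only.
CREATE OR REPLACE PROPERTY GRAPH graph_db.FinGraph
  NODE TABLES (
    -- Expose only the columns that queries are expected to read.
    graph_db.Person PROPERTIES (id, name),
    graph_db.Account PROPERTIES (id, is_blocked)
  )
  EDGE TABLES (
    graph_db.PersonOwnAccount
      SOURCE KEY (id) REFERENCES Person (id)
      DESTINATION KEY (account_id) REFERENCES Account (id)
      LABEL Owns
      -- Only create_time is exposed as an edge property.
      PROPERTIES (create_time)
  );
```

A query against this graph would then read only the exposed properties, for example:

```sql
GRAPH graph_db.FinGraph
MATCH (p:Person)-[o:Owns]->(a:Account)
RETURN p.name AS owner_name, a.id AS account_id, o.create_time AS owned_since;
```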

Define primary and foreign key constraints on graph nodes and edges

BigQuery can use primary and foreign key constraints on your node and edge tables to optimize your graph queries by reducing unnecessary table scans. However, BigQuery doesn't enforce primary or foreign key constraints on tables. If your application can't guarantee referential integrity or uniqueness on primary keys, then using primary or foreign keys for query optimization might lead to incorrect query results.

The following example defines primary and foreign key constraints on the node tables Person and Account, and the edge table PersonOwnAccount:

CREATE OR REPLACE TABLE graph_db.Person (
  id               INT64,
  name             STRING,
  birthday         TIMESTAMP,
  country          STRING,
  city             STRING,
  PRIMARY KEY (id) NOT ENFORCED
);

CREATE OR REPLACE TABLE graph_db.Account (
  id               INT64,
  create_time      TIMESTAMP,
  is_blocked       BOOL,
  nick_name        STRING,
  PRIMARY KEY (id) NOT ENFORCED
);

CREATE OR REPLACE TABLE graph_db.PersonOwnAccount (
  id               INT64 NOT NULL,
  account_id       INT64 NOT NULL,
  create_time      TIMESTAMP,
  PRIMARY KEY (id, account_id) NOT ENFORCED,
  FOREIGN KEY (id) REFERENCES graph_db.Person(id) NOT ENFORCED,
  FOREIGN KEY (account_id) REFERENCES graph_db.Account(id) NOT ENFORCED
);
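Because these constraints aren't enforced, it can help to check periodically that the data actually satisfies them before you rely on them for optimization. The following queries are a minimal sketch of such a check against the tables defined above. They aren't a built-in validation feature, and the output column names are illustrative. Each query returns zero rows when the corresponding constraint holds.

```sql
-- Person rows that violate the primary key: a missing id or a duplicated id.
SELECT id, COUNT(*) AS row_count
FROM graph_db.Person
GROUP BY id
HAVING id IS NULL OR COUNT(*) > 1;

-- Account rows that violate the primary key.
SELECT id, COUNT(*) AS row_count
FROM graph_db.Account
GROUP BY id
HAVING id IS NULL OR COUNT(*) > 1;

-- Edge rows that duplicate the composite primary key (id, account_id).
SELECT id, account_id, COUNT(*) AS row_count
FROM graph_db.PersonOwnAccount
GROUP BY id, account_id
HAVING COUNT(*) > 1;

-- Edge rows whose foreign keys don't match an existing Person or Account.
SELECT e.id, e.account_id
FROM graph_db.PersonOwnAccount AS e
LEFT JOIN graph_db.Person AS p ON e.id = p.id
LEFT JOIN graph_db.Account AS a ON e.account_id = a.id
WHERE p.id IS NULL OR a.id IS NULL;
```

If any of these queries returns rows, consider correcting the data or removing the affected constraint before you depend on it for graph query optimization.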

What's next