Pipe query syntax

Pipe query syntax is an extension to GoogleSQL that's simpler and more concise than standard query syntax. Pipe syntax supports the same operations as standard syntax, and improves some areas of SQL query functionality and usability.

For more background and details on pipe syntax design, see the research paper SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL.

Pipe syntax

Pipe syntax has the following key characteristics:

  • Each pipe operator in pipe syntax consists of the pipe symbol, |>, an operator name, and any arguments:
    |> operator_name argument_list
  • Pipe operators can be added to the end of any valid query.
  • Pipe syntax works anywhere standard syntax is supported: in queries, views, table-valued functions (TVFs), and other contexts.
  • Pipe syntax can be mixed with standard syntax in the same query. For example, subqueries can use different syntax from the parent query.
  • A pipe operator can see every alias that exists in the table preceding the pipe.
  • A query can start with a FROM clause, and pipe operators can optionally be added after the FROM clause.

Query comparison

Consider the following table called example_table:

SELECT * FROM example_table;

/*---------------------------------------------------+
 | _key | f1               | f2                      |
 +---------------------------------------------------+
 | a#01 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc",        |
 |      |   "c2": "nop"    |   "col2": 125           |
 |      | }                | }                       |
 +---------------------------------------------------+
 | a#02 | {                | {                       |
 |      |   "c1": "xyz"    |   "col1": "cba",        |
 |      | }                |   "col2": 123           |
 |      |                  | }                       |
 +---------------------------------------------------+
 | b#01 | {                | {                       |
 |      |   "c1": "jkl",   |   "col1": "qrs",        |
 |      |   "c2": "mno"    |   "col2": 654           |
 |      | }                | }                       |
 +---------------------------------------------------+
 | b#02 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc"         |
 |      |   "c2": "no"     | }                       |
 |      | }                |                         |
 +---------------------------------------------------*/

Compare the following equivalent queries that compute the number and total amount of sales for each item in the example_table table:

Standard syntax

SELECT f2['col1'] AS col1, COUNT(*) AS num_items
FROM example_table
WHERE f1['c1'] = 'xyz'
GROUP BY col1
ORDER BY num_items DESC;

/*--------+-----------+
 | col1   | num_items |
 +--------+-----------+
 | abc    | 2         |
 | cba    | 1         |
 +--------+-----------*/

Pipe syntax

FROM example_table
|> WHERE f1['c1'] = 'xyz'
|> AGGREGATE COUNT(*) AS num_items GROUP BY f2['col1'] AS col1
|> ORDER BY num_items DESC;

/*--------+-----------+
 | col1   | num_items |
 +--------+-----------+
 | abc    | 2         |
 | cba    | 1         |
 +--------+-----------*/

Pipe operator semantics

Pipe operators have the following semantic behavior:

  • Each pipe operator performs a self-contained operation.
  • A pipe operator consumes the input table passed to it through the pipe symbol, |>, and produces a new table as output.
  • A pipe operator can reference only columns from its immediate input table. Columns from earlier in the same query aren't visible. Inside subqueries, correlated references to outer columns are still allowed.

FROM queries

In pipe syntax, a query can start with a standard FROM clause and use any standard FROM syntax, including tables, joins, subqueries, UNNEST operations, and table-valued functions (TVFs). Table aliases can be assigned to each input item using the AS alias clause.

A query with only a FROM clause, like FROM table_name, is allowed in pipe syntax and returns all rows from the table. For tables with columns, FROM table_name in pipe syntax is similar to SELECT * FROM table_name in standard syntax.

Example

The following query uses the example_table:

FROM example_table;

/*---------------------------------------------------+
 | _key | f1               | f2                      |
 +---------------------------------------------------+
 | a#01 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc",        |
 |      |   "c2": "nop"    |   "col2": 125"          |
 |      | }                | }                       |
 +---------------------------------------------------+
 | a#02 | {                | {                       |
 |      |   "c1": "xyz"    |   "col1": "cba",        |
 |      | }                |   "col2": 123           |
 |      |                  | }                       |
 +---------------------------------------------------+
 | b#01 | {                | {                       |
 |      |   "c1": "jkl",   |   "col1": "qrs",        |
 |      |   "c2": "mno"    |   "col2": 654           |
 |      | }                | }                       |
 +---------------------------------------------------+
 | b#02 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc"         |
 |      |   "c2": "no"     | }                       |
 |      | }                |                         |
 +---------------------------------------------------*/

Temporal filters and time series unpacking follow the same format as the standard syntax.

Example with temporal filtering:

FROM example_table(as_of => TIMESTAMP('2022-01-10T13:14:00.234Z'));

Example with the UNPACK function:

FROM UNPACK((SELECT * FROM example_table(with_history=>true)));

Pipe operators

GoogleSQL supports the following pipe operators. For operators that correspond or relate to similar operations in standard syntax, the operator descriptions highlight similarities and differences and link to more detailed documentation on the corresponding syntax.

Pipe operator list

Name Summary
SELECT Produces a new table with the listed columns.
EXTEND Propagates the existing table and adds computed columns.
SET Replaces the values of columns in the input table.
DROP Removes listed columns from the input table.
RENAME Renames specified columns.
AS Introduces a table alias for the input table.
WHERE Filters the results of the input table.
AGGREGATE Performs aggregation on data across groups of rows or the full input table.
ORDER BY Sorts results by a list of expressions.
LIMIT Limits the number of rows to return in a query.

SELECT pipe operator

|> SELECT expression [[AS] alias] [, ...]

Description

Produces a new table with the listed columns, similar to the outermost SELECT clause in a table subquery in standard syntax. The SELECT operator supports standard output modifiers like SELECT AS STRUCT and SELECT DISTINCT. The SELECT operator doesn't support aggregations or anonymization.

In pipe syntax, the SELECT operator in a query is optional. The SELECT operator can be used near the end of a query to specify the list of output columns. The final query result contains the columns returned from the last pipe operator. If the SELECT operator isn't used to select specific columns, the output includes the full row, similar to what the SELECT * statement in standard syntax produces.

In pipe syntax, the SELECT clause doesn't perform aggregation. Use the AGGREGATE operator instead.

For cases where SELECT would be used in standard syntax to rearrange columns, pipe syntax supports other operators:

Examples

FROM (SELECT 'apples' AS item, 2 AS sales)
|> SELECT item AS fruit_name;

/*------------+
 | fruit_name |
 +------------+
 | apples     |
 +------------*/

EXTEND pipe operator

|> EXTEND expression [[AS] alias] [, ...]

Description

Propagates the existing table and adds computed columns, similar to SELECT *, new_column in standard syntax.

Example

(
  SELECT 50 AS Fahrenheit
)
|> EXTEND (Fahrenheit - 32) * 5 / 9 AS Celsius;

/*------------+---------+
 | Fahrenheit | Celsius |
 +------------+---------+
 | 50         | 10      |
 +------------+---------*/

SET pipe operator

|> SET column = expression [, ...]

Description

Replaces the value of a column in the input table, similar to SELECT * REPLACE (expression AS column) in standard syntax. Each referenced column must exist exactly once in the input table.

After a SET operation, the referenced top-level columns (like x) are updated, but table aliases (like t) still refer to the original row values. Therefore, t.x will still refer to the original value.

Example

(
  SELECT 2 AS x, 22 AS y
)
|> SET x = x * x, y = 3;

/*---+---+
 | x | y |
 +---+---+
 | 4 | 3 |
 +---+---*/
FROM (SELECT 2 AS x, 3 AS y) AS t
|> SET x = x * x, y = 8
|> SELECT t.x AS original_x, x, y;

/*------------+---+---+
 | original_x | x | y |
 +------------+---+---+
 | 2          | 4 | 8 |
 +------------+---+---*/

DROP pipe operator

|> DROP column [, ...]

Description

Removes listed columns from the input table, similar to SELECT * EXCEPT (column) in standard syntax. Each referenced column must exist at least once in the input table.

After a DROP operation, the referenced top-level columns (like x) are removed, but table aliases (like t) still refer to the original row values. Therefore, t.x will still refer to the original value.

Example

SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
|> DROP sales, category;

/*--------+
 | item   |
 +--------+
 | apples |
 +--------*/
FROM (SELECT 1 AS x, 2 AS y) AS t
|> DROP x
|> SELECT t.x AS original_x, y;

/*------------+---+
 | original_x | y |
 +------------+---+
 | 1          | 2 |
 +------------+---*/

RENAME pipe operator

|> RENAME old_column_name [AS] new_column_name [, ...]

Description

Renames specified columns. Each column to be renamed must exist exactly once in the input table. The RENAME operator can't rename value table fields, pseudo-columns, range variables, or objects that aren't columns in the input table.

After a RENAME operation, the referenced top-level columns (like x) are renamed, but table aliases (like t) still refer to the original row values. Therefore, t.x will still refer to the original value.

Example

SELECT 1 AS x, 2 AS y, 3 AS z
|> AS t
|> RENAME y AS renamed_y
|> SELECT *, t.y AS t_y;

/*---+-----------+---+-----+
 | x | renamed_y | z | t_y |
 +---+-----------+---+-----+
 | 1 | 2         | 3 | 2   |
 +---+-----------+---+-----*/

AS pipe operator

|> AS alias

Description

Introduces a table alias for the input table, similar to applying the AS alias clause on a table subquery in standard syntax. Any existing table aliases are removed and the new alias becomes the table alias for all columns in the row.

The AS operator can be useful after operators like SELECT, EXTEND, or AGGREGATE that add columns but can't give table aliases to them.

Example

FROM UNNEST([1000, 100, 10]) AS sales
|> AGGREGATE SUM(sales) AS total_sales
|> AS t;

/*-------------------------+
 | total_sales             |
 +-------------------------+
 | 1110                    |
 +-------------------------*/

WHERE pipe operator

|> WHERE boolean_expression

Description

Filters the results of the input table. The WHERE operator behaves the same as the WHERE clause in standard syntax.

In pipe syntax, the WHERE operator also replaces the HAVING clause in standard syntax. For example, after performing aggregation with the AGGREGATE operator, use the WHERE operator instead of the HAVING clause.

Example

FROM UNNEST([1000, 100, 10]) AS sales
|> WHERE sales > 90;

/*--------+
 | sales  |
 +--------+
 | 1000   |
 | 100    |
 +--------*/

AGGREGATE pipe operator

-- Full-table aggregation
|> AGGREGATE aggregate_expression [[AS] alias] [, ...]
-- Aggregation with grouping
|> AGGREGATE [aggregate_expression [[AS] alias] [, ...]]
   GROUP BY groupable_items [[AS] alias] [, ...]
-- Aggregation with grouping and shorthand ordering syntax
|> AGGREGATE [aggregate_expression [[AS] alias] [order_suffix] [, ...]]
   GROUP [AND ORDER] BY groupable_item [[AS] alias] [order_suffix] [, ...]

order_suffix: {ASC | DESC}

Description

Performs aggregation on data across grouped rows or an entire table. The AGGREGATE operator is similar to a query in standard syntax that contains a GROUP BY clause or a SELECT list with aggregate functions or both. In pipe syntax, the GROUP BY clause is part of the AGGREGATE operator. Pipe syntax doesn't support a standalone GROUP BY operator.

Without the GROUP BY clause, the AGGREGATE operator performs full-table aggregation and produces one output row.

With the GROUP BY clause, the AGGREGATE operator performs aggregation with grouping, producing one row for each set of distinct values for the grouping expressions.

The AGGREGATE expression list corresponds to the aggregated expressions in a SELECT list in standard syntax. Each expression in the AGGREGATE list must include an aggregate function. Aggregate expressions can also include scalar expressions (for example, sqrt(SUM(x*x))). Column aliases can be assigned using the AS operator.

The GROUP BY clause in the AGGREGATE operator corresponds to the GROUP BY clause in standard syntax. Unlike in standard syntax, aliases can be assigned to GROUP BY items.

The output columns from the AGGREGATE operator include all grouping columns first, followed by all aggregate columns, using their assigned aliases as the column names.

Unlike in standard syntax, grouping expressions aren't repeated across SELECT and GROUP BY clauses. In pipe syntax, the grouping expressions are listed once, in the GROUP BY clause, and are automatically included as output columns for the AGGREGATE operator.

Because output columns are fully specified by the AGGREGATE operator, the SELECT operator isn't needed after the AGGREGATE operator unless you want to produce a list of columns different from the default.

Standard syntax

-- Aggregation in standard syntax
SELECT SUM(col1) AS total, col2, col3, col4...
FROM table1
GROUP BY col2, col3, col4...

Pipe syntax

-- The same aggregation in pipe syntax
FROM table1
|> AGGREGATE SUM(col1) AS total
   GROUP BY col2, col3, col4...

Examples

-- Full-table aggregation
FROM
  UNNEST(
    [
      STRUCT('apples' AS item, 2 AS sales),
      STRUCT('bananas' AS item, 5 AS sales),
      STRUCT('apples' AS item, 7 AS sales)]) AS t
|> AGGREGATE COUNT(*) AS num_items, SUM(t.sales) AS total_sales;

/*-----------+-------------+
 | num_items | total_sales |
 +-----------+-------------+
 | 3         | 14          |
 +-----------+-------------*/
-- Aggregation with grouping
FROM
  UNNEST(
    [
      STRUCT('apples' AS item, 2 AS sales),
      STRUCT('bananas' AS item, 5 AS sales),
      STRUCT('apples' AS item, 7 AS sales)]) AS t
|> AGGREGATE COUNT(*) AS num_items, SUM(t.sales) AS total_sales GROUP BY t.item;

/*---------+-----------+-------------+
 | item    | num_items | total_sales |
 +---------+-----------+-------------+
 | apples  | 2         | 9           |
 | bananas | 1         | 5           |
 +---------+-----------+-------------*/

Shorthand ordering syntax with AGGREGATE

The AGGREGATE operator supports a shorthand ordering syntax, which is equivalent to applying the ORDER BY operator as part of the AGGREGATE operator without repeating the column list:

-- Aggregation with grouping and shorthand ordering syntax
|> AGGREGATE [aggregate_expression [[AS] alias] [order_suffix] [, ...]]
   GROUP [AND ORDER] BY groupable_item [[AS] alias] [order_suffix] [, ...]

order_suffix: {ASC | DESC}

The GROUP AND ORDER BY clause is equivalent to an ORDER BY clause on all groupable_items. By default, each groupable_item is sorted in ascending order with NULL values first. Other ordering suffixes like DESC can be used for other orders.

Without the GROUP AND ORDER BY clause, the ASC or DESC suffixes can be added on individual columns in the GROUP BY list or AGGREGATE list or both.

Adding these suffixes is equivalent to adding an ORDER BY clause that includes all of the suffixed columns with the suffixed grouping columns first, matching the left-to-right output column order.

Examples

Consider the following table called example_table:

/*---------------------------------------------------+
 | _key | f1               | f2                      |
 +---------------------------------------------------+
 | a#01 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc",        |
 |      |   "c2": "nop"    |   "col2": 125"          |
 |      | }                | }                       |
 +---------------------------------------------------+
 | a#02 | {                | {                       |
 |      |   "c1": "xyz"    |   "col1": "cba",        |
 |      | }                |   "col2": 123           |
 |      |                  | }                       |
 +---------------------------------------------------+
 | b#01 | {                | {                       |
 |      |   "c1": "jkl",   |   "col1": "qrs",        |
 |      |   "c2": "mno"    |   "col2": 654           |
 |      | }                | }                       |
 +---------------------------------------------------+
 | b#02 | {                | {                       |
 |      |   "c1": "xyz",   |   "col1": "abc"         |
 |      |   "c2": "no"     | }                       |
 |      | }                |                         |
 +---------------------------------------------------*/

The following two equivalent examples show you how to order by all grouping columns using the GROUP AND ORDER BY clause or a separate ORDER BY clause:

-- Order by all grouping columns using GROUP AND ORDER BY.
FROM example_table
|> AGGREGATE COUNT(*) AS num_items
   GROUP AND ORDER BY f1['c1'] AS c1, f2['col1'] AS col1 DESC;

/*------------+-----------+
 | c1  | col1 | num_items |
 +-----------+------------+
 | jkl | qrs  | 1         |
 | xyz | cba  | 1         |
 | xyz | abc  | 2         |
 +-----------+------------*/
-- Order by columns using ORDER BY after performing aggregation.
FROM example_table
|> AGGREGATE COUNT(*) AS num_items
   GROUP BY f1['c1'] AS c1, f2['col1'] AS col1
|> ORDER BY c1, col1 DESC;

You can add an ordering suffix to a column in the AGGREGATE list. Although the AGGREGATE list appears before the GROUP BY list in the query, ordering suffixes on columns in the GROUP BY list are applied first.

FROM example_table
|> AGGREGATE COUNT(*) AS num_items ASC
   GROUP BY f2['col1'] AS col1, f1['c1'] AS c1 DESC;

/*------------+-----------+
 | col1 | c1  | num_items |
 +------------+-----------+
 | abc  | xyz | 1         |
 | cba  | xyz | 2         |
 | qrs  | jkl | 1         |
 +-----------+------------*/

The previous query is equivalent to the following:

-- Order by specified grouping and aggregate columns.
FROM example_table
|> AGGREGATE COUNT(*) AS num_items
   GROUP BY f1['c1'] AS c1, f2['col1'] AS col1
|> ORDER BY c1 DESC, num_items;

ORDER BY pipe operator

|> ORDER BY expression [sort_options] [, ...]

Description

Sorts results by a list of expressions. The ORDER BY operator behaves the same as the ORDER BY clause in standard syntax. Suffixes like DESC are supported for customizing the ordering for each expression.

In pipe syntax, the AGGREGATE operator also supports shorthand ordering suffixes to apply ORDER BY behavior more concisely as part of aggregation.

Example

FROM UNNEST([1000, 100, 10]) AS sales
|> ORDER BY sales DESC;

/*-------+
 | sales |
 +-------+
 | 1000  |
 | 100   |
 | 10    |
 +-------*/

LIMIT pipe operator

|> LIMIT count

Description

Limits the number of rows to return in a query. The LIMIT operator behaves the same as the LIMIT clause in standard syntax.

Example

FROM UNNEST([1000, 100, 10]) AS sales
|> ORDER BY sales DESC
|> LIMIT 1;

/*---------+
 | sales   |
 +---------+
 | 1000    |
 +---------*/