管道查詢語法是 GoogleSQL 的擴充功能,可支援線性查詢結構,讓查詢更容易讀取、寫入及維護。您可以在任何 GoogleSQL 編寫處使用管道語法。
管道語法支援與現有 GoogleSQL 查詢語法或標準語法相同的作業,例如選取、匯總和分組、彙整和篩選,但這些作業可以任意順序和次數套用。管道語法的線性結構可讓您編寫查詢,讓查詢語法的順序與建立結果表所採取的邏輯步驟順序相符。
使用管道符號語法的查詢,其定價、執行和最佳化方式與對應的標準語法查詢相同。使用管道語法撰寫查詢時,請遵循指南估算費用,並最佳化查詢運算。
標準語法會發生問題,導致難以讀取、編寫及維護。下表說明管道語法如何解決這些問題:
標準語法 | 管道語法 |
---|---|
子句必須依特定順序顯示。 | 管道運算子可依任何順序套用。 |
較複雜的查詢 (例如含有多層匯總的查詢) 通常需要 CTE 或巢狀子查詢。 | 更複雜的查詢通常會在查詢結尾加上管道運算子來表示。 |
匯總期間,資料欄會重複出現在 SELECT 、GROUP BY 和 ORDER BY 子句中。 |
每個匯總只能列出一個資料欄。 |
如要逐步使用管道語法建立複雜的查詢,請參閱「使用管道語法分析資料」。如需完整語法詳細資訊,請參閱 Pipe 查詢語法參考文件。
基本語法
在管道語法中,查詢會以標準 SQL 查詢或 FROM
子句開頭。舉例來說,獨立的 FROM
子句 (例如 FROM MyTable
) 是有效的管道語法。標準 SQL 查詢的結果或 FROM
子句的資料表,可做為輸入內容傳遞至管道符號 |>
,後面接著管道運算子名稱和該運算子的任何引數。管道運算子會以某種方式轉換資料表,而轉換結果可傳遞至另一個管道運算子。
您可以在查詢中使用任意數量的管道運算子,執行選取、排序、篩選、彙整或彙整資料欄等作業。管道運算子的名稱與標準語法相對應,通常具有相同的行為。標準語法和管道符號語法的最大差異在於查詢結構。隨著查詢所表示的邏輯變得越來越複雜,查詢仍可表示為管道運算子的線性序列,而無需使用深層巢狀子查詢,讓查詢更容易閱讀及理解。
管道語法具有下列主要特性:
- 管道語法中的每個管道運算子都包含管道符號
|>
、運算子名稱和任何引數:
|> operator_name argument_list
- 管道運算子可新增至任何有效查詢的結尾。
- 管道運算子可按任意順序套用,且可套用多次。
- 管道語法適用於支援標準語法的任何位置,包括查詢、檢視、資料表值函式和其他情境。
- 在同一個查詢中,您可以混合使用標準語法和管道語法。舉例來說,子查詢可以使用與父項查詢不同的語法。
- 管道運算子可查看管道前方表格中的每個別名。
- 查詢可以以
FROM
子句開始,並視需要在FROM
子句後方新增管道運算子。
請參考下表:
CREATE OR REPLACE TABLE mydataset.Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
以下查詢各自包含有效的管道語法,說明如何依序建立查詢。
查詢可以以 FROM
子句開頭,且不必包含管道符號:
-- View the table.
FROM mydataset.Produce;
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| carrots | 0 | vegetable |
| bananas | 15 | fruit |
+---------+-------+-----------*/
您可以使用 WHERE
管道運算子進行篩選:
-- Filter items with no sales.
FROM mydataset.Produce
|> WHERE sales > 0;
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| bananas | 15 | fruit |
+---------+-------+-----------*/
如要執行匯總作業,請使用 AGGREGATE
管道運算子,後面接著任意數量的匯總函式,然後再接著 GROUP BY
子句。GROUP BY
子句是 AGGREGATE
管道運算子的一部分,且不會以管道符號 (|>
) 分隔。
-- Compute total sales by item.
FROM mydataset.Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
GROUP BY item;
/*---------+-------------+-----------+
| item | total_sales | num_sales |
+---------+-------------+-----------+
| apples | 9 | 2 |
| bananas | 15 | 1 |
+---------+-------------+-----------*/
假設您有下列包含每個項目 ID 的資料表:
CREATE OR REPLACE TABLE mydataset.ItemData AS (
SELECT 'apples' AS item, '123' AS id
UNION ALL
SELECT 'bananas' AS item, '456' AS id
UNION ALL
SELECT 'carrots' AS item, '789' AS id
);
您可以使用 JOIN
管道運算子,將先前查詢的結果與這個資料表彙整,以便納入每個項目的 ID:
FROM mydataset.Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
GROUP BY item
|> JOIN mydataset.ItemData USING(item);
/*---------+-------------+-----------+-----+
| item | total_sales | num_sales | id |
+---------+-------------+-----------+-----+
| apples | 9 | 2 | 123 |
| bananas | 15 | 1 | 456 |
+---------+-------------+-----------+-----*/
與標準語法的重大差異
管道語法與標準語法有以下差異:
- 查詢可以以
FROM
子句開頭。 SELECT
管道運算子不會執行匯總作業。您必須改用AGGREGATE
管道運算子。- 篩選功能一律使用
WHERE
管道運算子,可套用於任何位置。WHERE
管道運算子可取代HAVING
和QUALIFY
,用於篩選匯總或窗型函式的結果。
如需更多詳細資訊,請參閱管道運算子的完整清單。
用途
管道語法的常見用途包括:
- 專案分析和逐步查詢建構:操作的邏輯順序可讓您更輕鬆地編寫及偵錯查詢。任何查詢的前置字串 (上限為管道符號
|>
) 都是有效查詢,可協助您查看長查詢中的中間結果。提升的工作效率可加快整個機構的開發流程。 - 記錄分析:記錄分析使用者常用的管道式語法還有其他類型。管道語法提供熟悉的結構,可簡化這些使用者在 Log Analytics 和 BigQuery 的導入程序。
pipe 語法中的其他功能
除了少數例外,管道語法支援標準語法在相同語法中支援的所有運算子。此外,管道語法會引入其他管道運算子,並使用經過修改的語法進行匯總和彙整。以下各節將說明這些運算子。如需所有支援的運算子,請參閱管道運算子的完整清單。
EXTEND
管道運算子
EXTEND
管道運算子可讓您將計算資料欄附加至目前的資料表。EXTEND
管道運算子與 SELECT *, new_column
陳述式類似,但在參照資料欄別名時更具彈性。
請參考下表,其中包含每位考生的兩次測驗成績:
CREATE OR REPLACE TABLE mydataset.Scores AS (
SELECT 'Alex' AS student, 9 AS score1, 10 AS score2, 10 AS points_possible
UNION ALL
SELECT 'Dana' AS student, 5 AS score1, 7 AS score2, 10 AS points_possible);
/*---------+--------+--------+-----------------+
| student | score1 | score2 | points_possible |
+---------+--------+--------+-----------------+
| Alex | 9 | 10 | 10 |
| Dana | 5 | 7 | 10 |
+---------+--------+--------+-----------------*/
假設您想計算每位學生在測驗中獲得的平均原始分數和平均百分比分數。在標準語法中,SELECT
陳述式中較後面的資料欄無法查看先前的別名。為避免使用子查詢,您必須重複平均值的運算式:
SELECT student,
(score1 + score2) / 2 AS average_score,
(score1 + score2) / 2 / points_possible AS average_percent
FROM mydataset.Scores;
EXTEND
管道運算子可參照先前使用的別名,讓查詢更易於閱讀,並減少發生錯誤的可能性:
FROM mydataset.Scores
|> EXTEND (score1 + score2) / 2 AS average_score
|> EXTEND average_score / points_possible AS average_percent
|> SELECT student, average_score, average_percent;
/*---------+---------------+-----------------+
| student | average_score | average_percent |
+---------+---------------+-----------------+
| Alex | 9.5 | .95 |
| Dana | 6.0 | 0.6 |
+---------+---------------+-----------------*/
SET
管道運算子
SET
管道運算子可讓您取代目前資料表中資料欄的值。SET
管道運算子類似 SELECT
* REPLACE (expression AS column)
陳述式。您可以使用資料表別名限定資料欄名稱,藉此參照原始值。
FROM (SELECT 3 AS x, 5 AS y)
|> SET x = 2 * x;
/*---+---+
| x | y |
+---+---+
| 6 | 5 |
+---+---*/
DROP
管道運算子
DROP
管道運算子可讓您從目前的資料表中移除資料欄。DROP
管道運算子類似 SELECT *
EXCEPT(column)
陳述式。資料欄刪除後,您仍可透過使用表格別名限定資料欄名稱,來參照原始值。
FROM (SELECT 1 AS x, 2 AS y) AS t
|> DROP x;
/*---+
| y |
+---+
| 2 |
+---*/
RENAME
管道運算子
RENAME
管道運算子可讓您重新命名目前資料表中的資料欄。RENAME
管道運算子類似 SELECT *
EXCEPT(old_column), old_column AS new_column
陳述式。
FROM (SELECT 1 AS x, 2 AS y, 3 AS z) AS t
|> RENAME y AS w;
/*---+---+---+
| x | w | z |
+---+---+---+
| 1 | 2 | 3 |
+---+---+---*/
AGGREGATE
管道運算子
如要在管道語法中執行匯總作業,請使用 AGGREGATE
管道運算子,後面加上任意數量的匯總函式,然後再加上 GROUP BY
子句。您不需要在 SELECT
子句中重複資料欄。
本節的範例使用 Produce
資料表:
CREATE OR REPLACE TABLE mydataset.Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| carrots | 0 | vegetable |
| bananas | 15 | fruit |
+---------+-------+-----------*/
FROM mydataset.Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY item, category;
/*---------+-----------+-------+-------------+
| item | category | total | num_records |
+---------+-----------+-------+-------------+
| apples | fruit | 9 | 2 |
| carrots | vegetable | 0 | 1 |
| bananas | fruit | 15 | 1 |
+---------+-----------+-------+-------------*/
如果您準備好在匯總後立即排序結果,可以在 GROUP BY
子句中標記要排序的資料欄,並使用 ASC
或 DESC
進行排序。未標示的資料欄不會排序。
如果您想為所有資料欄排序,可以將 GROUP BY
子句替換為 GROUP AND ORDER BY
子句,預設會依遞增順序排序每個資料欄。您可以指定 DESC
後面接著要以遞減順序排序的資料欄。舉例來說,以下三個查詢的作用相同:
-- Use a separate ORDER BY clause.
FROM mydataset.Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY category, item
|> ORDER BY category DESC, item;
-- Explicitly mark how to order columns in the GROUP BY clause.
FROM mydataset.Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY category DESC, item ASC;
-- Only mark descending columns in the GROUP AND ORDER BY clause.
FROM mydataset.Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP AND ORDER BY category DESC, item;
使用 GROUP AND ORDER BY
子句的好處是,您不必在兩個地方重複資料欄名稱。
如要執行完整的資料表匯總作業,請使用 GROUP BY()
,或完全省略 GROUP BY
子句:
FROM mydataset.Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records;
/*-------+-------------+
| total | num_records |
+-------+-------------+
| 24 | 4 |
+-------+-------------*/
JOIN
管道運算子
JOIN
管道運算子可讓您將目前的資料表與其他資料表彙整,並支援標準彙整作業,包括 CROSS
、INNER
、LEFT
、RIGHT
和 FULL
。
以下範例會參照 Produce
和 ItemData
資料表:
CREATE OR REPLACE TABLE mydataset.Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
CREATE OR REPLACE TABLE mydataset.ItemData AS (
SELECT 'apples' AS item, '123' AS id
UNION ALL
SELECT 'bananas' AS item, '456' AS id
UNION ALL
SELECT 'carrots' AS item, '789' AS id
);
以下範例使用 USING
子句,避免資料欄出現歧義:
FROM mydataset.Produce
|> JOIN mydataset.ItemData USING(item)
|> WHERE item = 'apples';
/*--------+-------+----------+-----+
| item | sales | category | id |
+--------+-------+----------+-----+
| apples | 2 | fruit | 123 |
| apples | 7 | fruit | 123 |
+--------+-------+----------+-----*/
如要參照目前資料表中的資料欄 (例如在 ON
子句中區分資料欄),您必須使用 AS
管道運算子為目前資料表建立別名。您可以為彙整的資料表加上別名。您可以參照後續管道運算子的兩個別名:
FROM mydataset.Produce
|> AS produce_table
|> JOIN mydataset.ItemData AS item_table
ON produce_table.item = item_table.item
|> WHERE produce_table.item = 'bananas'
|> SELECT item_table.item, sales, id;
/*---------+-------+-----+
| item | sales | id |
+---------+-------+-----+
| bananas | 15 | 123 |
+---------+-------+-----*/
彙整作業的右側無法查看彙整作業的左側,也就是說,您無法將目前的資料表與自身彙整。例如,下列查詢會失敗:
-- This query doesn't work.
FROM mydataset.Produce
|> AS produce_table
|> JOIN produce_table AS produce_table_2 USING(item);
如要與經過修改的資料表執行自連結,您可以在 WITH
子句中使用一般資料表運算式 (CTE)。
WITH cte_table AS (
FROM mydataset.Produce
|> WHERE item = 'carrots'
)
FROM cte_table
|> JOIN cte_table AS cte_table_2 USING(item);
範例
請參考下表,瞭解客戶訂單的相關資訊:
CREATE OR REPLACE TABLE mydataset.CustomerOrders AS (
SELECT 1 AS customer_id, 100 AS order_id, 'WA' AS state, 5 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 1 AS customer_id, 101 AS order_id, 'WA' AS state, 20 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 1 AS customer_id, 102 AS order_id, 'WA' AS state, 3 AS cost, 'food' AS item_type
UNION ALL
SELECT 2 AS customer_id, 103 AS order_id, 'NY' AS state, 16 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 2 AS customer_id, 104 AS order_id, 'NY' AS state, 22 AS cost, 'housewares' AS item_type
UNION ALL
SELECT 2 AS customer_id, 104 AS order_id, 'WA' AS state, 45 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 3 AS customer_id, 105 AS order_id, 'MI' AS state, 29 AS cost, 'clothing' AS item_type);
假設您想瞭解每個州和商品類型的重複消費者平均消費金額。您可以按照下列方式編寫查詢:
SELECT state, item_type, AVG(total_cost) AS average
FROM
(
SELECT
SUM(cost) AS total_cost,
customer_id,
state,
item_type,
COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
FROM mydataset.CustomerOrders
GROUP BY customer_id, state, item_type
QUALIFY num_orders > 1
)
GROUP BY state, item_type
ORDER BY state DESC, item_type ASC;
如果從上到下閱讀查詢,會在定義 total_cost
之前遇到該欄。即使在子查詢中,您也必須先讀取資料欄的名稱,才能查看這些欄位來自哪個資料表。
為了瞭解這項查詢,我們需要從內到外閱讀。資料欄 state
和 item_type
在 SELECT
和 GROUP BY
子句中重複多次,然後在 ORDER BY
子句中重複一次。
以下等效查詢是使用管道符號語法編寫:
FROM mydataset.CustomerOrders
|> AGGREGATE SUM(cost) AS total_cost, GROUP BY customer_id, state, item_type
|> EXTEND COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
|> WHERE num_orders > 1
|> AGGREGATE AVG(total_cost) AS average GROUP BY state DESC, item_type ASC;
/*-------+------------+---------+
| state | item_type | average |
+-------+------------+---------+
| WA | clothing | 35.0 |
| WA | food | 3.0 |
| NY | clothing | 16.0 |
| NY | housewares | 22.0 |
+-------+------------+---------*/
使用管道符號語法時,您可以編寫查詢,按照您可能會想到的邏輯步驟解決原始問題。查詢中的語法行對應至下列邏輯步驟:
- 請先從客戶訂單表格開始。
- 瞭解每位消費者在各州購買各類型商品的花費。
- 計算每位客戶的訂單數量。
- 將結果限制為重複購買的客戶。
- 找出回流顧客在各州和商品類型上的平均支出金額。
限制
後續步驟
- 使用管道語法分析資料
- 管道查詢語法參考資料
- 標準查詢語法參考資料
- VLDB 2024 大會論文:管道語法