Data manipulation language (DML) statements in GoogleSQL
The BigQuery data manipulation language (DML) enables you to update, insert, and delete data from your BigQuery tables.
For information about how to use DML statements, see Transform data with data manipulation language and Update partitioned table data using DML.
On-demand query size calculation
If you use on-demand billing, BigQuery charges for data manipulation language (DML) statements based on the number of bytes processed by the statement.
For more information about cost estimation, see Estimate and control costs.
Non-partitioned tables
For non-partitioned tables, the number of bytes processed is calculated as follows:
- q = The sum of bytes processed by the DML statement itself, including any columns referenced in tables scanned by the DML statement.
- t = The size of the table being updated by the DML statement before any modifications are made.
| DML statement | Bytes processed |
|---|---|
INSERT |
q |
UPDATE |
q + t |
DELETE |
q + t |
MERGE |
If there are only INSERT clauses: q. If there is an UPDATE or DELETE clause: q + t. |
To preview how many bytes a statement processes, Check the estimated cost before running a query.
Partitioned tables
For partitioned tables, the number of bytes processed is calculated as follows:
- q' = The sum of bytes processed by the DML statement itself, including any columns referenced in all partitions scanned by the DML statement.
- t' = The total size of all partitions being updated by the DML statement before any modifications are made.
| DML statement | Bytes processed |
|---|---|
INSERT |
q' |
UPDATE |
q' + t' |
DELETE |
q' + t' |
MERGE |
If there are only INSERT clauses in the MERGE statement: q'. If there is an UPDATE or DELETE clause in the MERGE statement: q' + t'. |
To preview how many bytes a statement processes, Check the estimated cost before running a query.
INSERT statement
Use the INSERT statement when you want to add new rows to a table.
INSERT [INTO] target_name
[(column_1 [, ..., column_n ] )]
input
input ::=
VALUES (expr_1 [, ..., expr_n ] )
[, ..., (expr_k_1 [, ..., expr_k_n ] ) ]
| SELECT_QUERY
expr ::= value_expression | DEFAULT
INSERT statements must comply with the following rules:
- Column names are optional if the target table is not an ingestion-time partitioned table.
- Duplicate names are not allowed in the list of target columns.
- Values must be added in the same order as the specified columns.
- The number of values added must match the number of specified columns.
- Values must have a type that is compatible with the target column.
- When the value expression is
DEFAULT, the default value for the column is used. If the column has no default value, the value defaults toNULL.
Omitting column names
When the column names are omitted, all columns in the target table are included
in ascending order based on their ordinal positions. If an omitted column has
a default value, then that value is used. Otherwise, the column value is NULL.
If the target
table is an
ingestion-time partitioned table,
column names must be specified.
Value type compatibility
Values added with an INSERT statement must be compatible with the target
column's type. A value's type is considered compatible with the target column's
type if one of the following criteria are met:
- The value type matches the column type exactly. For example, inserting a value of type INT64 in a column that also has a type of INT64.
- The value type is one that can be implicitly coerced into another type.
INSERT examples
INSERT using explicit values
INSERT dataset.Inventory (product, quantity)
VALUES('top load washer', 10),
('front load washer', 20),
('dryer', 30),
('refrigerator', 10),
('microwave', 20),
('dishwasher', 30),
('oven', 5)
+-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | NULL | | dryer | 30 | NULL | | front load washer | 20 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 10 | NULL | +-------------------+----------+--------------------+
If you set a default value for a column, then you can use the DEFAULT keyword
in place of a value to insert the default value:
ALTER TABLE dataset.NewArrivals ALTER COLUMN quantity SET DEFAULT 100;
INSERT dataset.NewArrivals (product, quantity, warehouse)
VALUES('top load washer', DEFAULT, 'warehouse #1'),
('dryer', 200, 'warehouse #2'),
('oven', 300, 'warehouse #3');
+-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 200 | warehouse #2 | | oven | 300 | warehouse #3 | | top load washer | 100 | warehouse #1 | +-----------------+----------+--------------+
INSERT SELECT statement
INSERT dataset.Warehouse (warehouse, state)
SELECT *
FROM UNNEST([('warehouse #1', 'WA'),
('warehouse #2', 'CA'),
('warehouse #3', 'WA')])
+--------------+-------+ | warehouse | state | +--------------+-------+ | warehouse #1 | WA | | warehouse #2 | CA | | warehouse #3 | WA | +--------------+-------+
You can also use WITH when using INSERT SELECT. For example, you can
rewrite the previous query using WITH:
INSERT dataset.Warehouse (warehouse, state)
WITH w AS (
SELECT ARRAY<STRUCT<warehouse string, state string>>
[('warehouse #1', 'WA'),
('warehouse #2', 'CA'),
('warehouse #3', 'WA')] col
)
SELECT warehouse, state FROM w, UNNEST(w.col)
The following example shows how to copy a table's contents into another table:
INSERT dataset.DetailedInventory (product, quantity, supply_constrained)
SELECT product, quantity, false
FROM dataset.Inventory
+----------------------+----------+--------------------+----------+----------------+ | product | quantity | supply_constrained | comments | specifications | +----------------------+----------+--------------------+----------+----------------+ | dishwasher | 30 | false | [] | NULL | | dryer | 30 | false | [] | NULL | | front load washer | 20 | false | [] | NULL | | microwave | 20 | false | [] | NULL | | oven | 5 | false | [] | NULL | | refrigerator | 10 | false | [] | NULL | | top load washer | 10 | false | [] | NULL | +----------------------+----------+--------------------+----------+----------------+
INSERT VALUES with subquery
The following example shows how to insert a row into a table, where one of the values is computed using a subquery:
INSERT dataset.DetailedInventory (product, quantity)
VALUES('countertop microwave',
(SELECT quantity FROM dataset.DetailedInventory
WHERE product = 'microwave'))
+----------------------+----------+--------------------+----------+----------------+ | product | quantity | supply_constrained | comments | specifications | +----------------------+----------+--------------------+----------+----------------+ | countertop microwave | 20 | NULL | [] | NULL | | dishwasher | 30 | false | [] | NULL | | dryer | 30 | false | [] | NULL | | front load washer | 20 | false | [] | NULL | | microwave | 20 | false | [] | NULL | | oven | 5 | false | [] | NULL | | refrigerator | 10 | false | [] | NULL | | top load washer | 10 | false | [] | NULL | +----------------------+----------+--------------------+----------+----------------+
INSERT without column names
INSERT dataset.Warehouse VALUES('warehouse #4', 'WA'), ('warehouse #5', 'NY')
This is the Warehouse table before you run the query:
+--------------+-------+ | warehouse | state | +--------------+-------+ | warehouse #1 | WA | | warehouse #2 | CA | | warehouse #3 | WA | +--------------+-------+
This is the Warehouse table after you run the query:
+--------------+-------+ | warehouse | state | +--------------+-------+ | warehouse #1 | WA | | warehouse #2 | CA | | warehouse #3 | WA | | warehouse #4 | WA | | warehouse #5 | NY | +--------------+-------+
INSERT with STRUCT types
The following example shows how to insert a row into a table, where some of
the fields are
STRUCT types.
INSERT dataset.DetailedInventory
VALUES('top load washer', 10, FALSE, [(CURRENT_DATE, "comment1")], ("white","1 year",(30,40,28))),
('front load washer', 20, FALSE, [(CURRENT_DATE, "comment1")], ("beige","1 year",(35,45,30)))
Here is the DetailedInventory table after you run the query:
+-------------------+----------+--------------------+-------------------------------------------------+----------------------------------------------------------------------------------------------------+
| product | quantity | supply_constrained | comments | specifications |
+-------------------+----------+--------------------+-------------------------------------------------+----------------------------------------------------------------------------------------------------+
| front load washer | 20 | false | [{"created":"2021-02-09","comment":"comment1"}] | {"color":"beige","warranty":"1 year","dimensions":{"depth":"35.0","height":"45.0","width":"30.0"}} |
| top load washer | 10 | false | [{"created":"2021-02-09","comment":"comment1"}] | {"color":"white","warranty":"1 year","dimensions":{"depth":"30.0","height":"40.0","width":"28.0"}} |
+-------------------+----------+--------------------+-------------------------------------------------+----------------------------------------------------------------------------------------------------+INSERT with ARRAY types
The following example show how to insert a row into a table, where one of the
fields is an ARRAY type.
CREATE TABLE IF NOT EXISTS dataset.table1 (names ARRAY<STRING>);
INSERT INTO dataset.table1 (names)
VALUES (["name1","name2"])
Here is the table after you run the query:
+-------------------+
| names |
+-------------------+
| ["name1","name2"] |
+-------------------+
INSERT with RANGE types
The following example shows how to insert rows into a table, where the
fields are RANGE type.
INSERT mydataset.my_range_table (emp_id, dept_id, duration)
VALUES(10, 1000, RANGE<DATE> '[2010-01-10, 2010-03-10)'),
(10, 2000, RANGE<DATE> '[2010-03-10, 2010-07-15)'),
(10, 2000, RANGE<DATE> '[2010-06-15, 2010-08-18)'),
(20, 2000, RANGE<DATE> '[2010-03-10, 2010-07-20)'),
(20, 1000, RANGE<DATE> '[2020-05-10, 2020-09-20)');
SELECT * FROM mydataset.my_range_table ORDER BY emp_id;
/*--------+---------+--------------------------+
| emp_id | dept_id | duration |
+--------+---------+--------------------------+
| 10 | 1000 | [2010-01-10, 2010-03-10) |
| 10 | 2000 | [2010-03-10, 2010-07-15) |
| 10 | 2000 | [2010-06-15, 2010-08-18) |
| 20 | 2000 | [2010-03-10, 2010-07-20) |
| 20 | 1000 | [2020-05-10, 2020-09-20) |
+--------+---------+--------------------------*/
DELETE statement
Use the DELETE statement when you want to delete rows from a table.
DELETE [FROM] target_name [alias]
WHERE condition
To delete all rows in a table, use the TRUNCATE TABLE statement.
To delete all rows in a partition without scanning bytes or consuming slots, see Using DML DELETE to delete partitions.
WHERE keyword
Each time you construct a DELETE statement, you must use the WHERE keyword,
followed by a condition.
The WHERE keyword is mandatory for any DELETE statement.
DELETE examples
DELETE with WHERE clause
DELETE dataset.Inventory
WHERE quantity = 0
Before:
+-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 20 | NULL | | dryer | 30 | NULL | | front load washer | 10 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 0 | NULL | +-------------------+----------+--------------------+
After:
+-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 20 | NULL | | dryer | 30 | NULL | | front load washer | 10 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | +-------------------+----------+--------------------+
DELETE with subquery
DELETE dataset.Inventory i
WHERE i.product NOT IN (SELECT product from dataset.NewArrivals)
Before:
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | NULL | | dryer | 30 | NULL | | front load washer | 20 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 10 | NULL | +-------------------+----------+--------------------+ NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 200 | warehouse #2 | | oven | 300 | warehouse #3 | | top load washer | 100 | warehouse #1 | +-----------------+----------+--------------+
After:
Inventory +-----------------+----------+--------------------+ | product | quantity | supply_constrained | +-----------------+----------+--------------------+ | dryer | 30 | NULL | | oven | 5 | NULL | | top load washer | 10 | NULL | +-----------------+----------+--------------------+
Alternately, you can use DELETE with the EXISTS clause:
DELETE dataset.Inventory
WHERE NOT EXISTS
(SELECT * from dataset.NewArrivals
WHERE Inventory.product = NewArrivals.product)
TRUNCATE TABLE statement
The TRUNCATE TABLE statement removes all rows from a table but leaves the
table metadata intact, including the table schema, description, and labels.
TRUNCATE TABLE [[project_name.]dataset_name.]table_name
Where:
project_nameis the name of the project containing the table. Defaults to the project that runs this DDL query.dataset_nameis the name of the dataset containing the table.table_nameis the name of the table to truncate.
Truncating views, materialized views, models, or external tables is not
supported. Quotas and limits for queries apply to TRUNCATE TABLE statements.
For more information, see Quotas and limits.
TRUNCATE TABLE examples
The following example removes all rows from the table named Inventory.
TRUNCATE TABLE dataset.Inventory
UPDATE statement
Use the UPDATE statement when you want to update existing rows within a table.
UPDATE target_name [[AS] alias]
SET set_clause
[FROM from_clause]
WHERE condition
set_clause ::= update_item[, ...]
update_item ::= column_name = expression
Where:
target_nameis the name of a table to update.update_itemis the name of column to update and an expression to evaluate for the updated value. The expression may contain theDEFAULTkeyword, which is replaced by the default value for that column.
If the column is a STRUCT type, column_name can reference a field in the
STRUCT using dot notation. For example, struct1.field1.
WHERE keyword
Each UPDATE statement must include the WHERE keyword, followed by a
condition.
To update all rows in the table, use WHERE true.
FROM keyword
An UPDATE statement can optionally include a FROM clause.
You can use the FROM clause to specify the rows to update in the target table.
You can also use columns from joined tables in a SET clause or WHERE
condition.
The FROM clause join can be a cross join if no condition is specified in the
WHERE clause, otherwise it is an inner join. In either case, rows from the
target table can join with at most one row from the FROM clause.
To specify the join predicate between the table to be updated and tables in
the FROM clause, use the WHERE clause. For an example, see
UPDATE using joins.
Caveats:
- The
SETclause can reference columns from a target table and columns from anyFROMitem in theFROMclause. If there is a name collision, unqualified references are treated as ambiguous. - If the target table is present in the
FROMclause as a table name, it must have an alias if you would like to perform a self-join. - If a row in the table to be updated joins with zero rows from the
FROMclause, then the row isn't updated. - If a row in the table to be updated joins with exactly one row from the
FROMclause, then the row is updated. - If a row in the table to be updated joins with more than one row from the
FROMclause, then the query generates the following runtime error:UPDATE/MERGE must match at most one source row for each target row.
UPDATE examples
UPDATE with WHERE clause
The following example updates a table named Inventory by reducing the value
of the quantity field by 10 for all products that contain the string washer.
Assume that the default value for the supply_constrained column is set to
TRUE.
UPDATE dataset.Inventory
SET quantity = quantity - 10,
supply_constrained = DEFAULT
WHERE product like '%washer%'
Before:
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | NULL | | dryer | 30 | NULL | | front load washer | 20 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 10 | NULL | +-------------------+----------+--------------------+
After:
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 20 | true | | dryer | 30 | NULL | | front load washer | 10 | true | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 0 | true | +-------------------+----------+--------------------+
UPDATE using joins
The following example generates a table with inventory totals that include
existing inventory and inventory from the NewArrivals table, and
marks supply_constrained as false:
UPDATE dataset.Inventory
SET quantity = quantity +
(SELECT quantity FROM dataset.NewArrivals
WHERE Inventory.product = NewArrivals.product),
supply_constrained = false
WHERE product IN (SELECT product FROM dataset.NewArrivals)
Alternately, you can join the tables:
UPDATE dataset.Inventory i
SET quantity = i.quantity + n.quantity,
supply_constrained = false
FROM dataset.NewArrivals n
WHERE i.product = n.product
Before:
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | NULL | | dryer | 30 | NULL | | front load washer | 20 | NULL | | microwave | 20 | NULL | | oven | 5 | NULL | | refrigerator | 10 | NULL | | top load washer | 10 | NULL | +-------------------+----------+--------------------+ NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 200 | warehouse #2 | | oven | 300 | warehouse #3 | | top load washer | 100 | warehouse #1 | +-----------------+----------+--------------+
After:
+-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | NULL | | dryer | 230 | false | | front load washer | 20 | NULL | | microwave | 20 | NULL | | oven | 305 | false | | refrigerator | 10 | NULL | | top load washer | 110 | false | +-------------------+----------+--------------------+
UPDATE nested fields
The following example updates nested record fields.
UPDATE dataset.DetailedInventory
SET specifications.color = 'white',
specifications.warranty = '1 year'
WHERE product like '%washer%'
Alternatively, you can update the entire record:
UPDATE dataset.DetailedInventory
SET specifications
= STRUCT<color STRING, warranty STRING,
dimensions STRUCT<depth FLOAT64, height FLOAT64, width FLOAT64>>('white', '1 year', NULL)
WHERE product like '%washer%'
+----------------------+----------+--------------------+----------+---------------------------------------------------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+----------+---------------------------------------------------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [] | {"color":"white","warranty":"1 year","dimensions":null} |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [] | {"color":"white","warranty":"1 year","dimensions":null} |
| microwave | 20 | false | [] | NULL |
| oven | 5 | false | [] | NULL |
| refrigerator | 10 | false | [] | NULL |
| top load washer | 10 | false | [] | {"color":"white","warranty":"1 year","dimensions":null} |
+----------------------+----------+--------------------+----------+---------------------------------------------------------+
UPDATE repeated records
The following example appends an entry to a repeated record in the comments
column for products that contain the string washer:
UPDATE dataset.DetailedInventory
SET comments = ARRAY(
SELECT comment FROM UNNEST(comments) AS comment
UNION ALL
SELECT (CAST('2016-01-01' AS DATE), 'comment1')
)
WHERE product like '%washer%'
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
| microwave | 20 | false | [] | NULL |
| oven | 5 | false | [] | NULL |
| refrigerator | 10 | false | [] | NULL |
| top load washer | 10 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
Alternatively, you can use the ARRAY_CONCAT function:
UPDATE dataset.DetailedInventory
SET comments = ARRAY_CONCAT(comments,
ARRAY<STRUCT<created DATE, comment STRING>>[(CAST('2016-01-01' AS DATE), 'comment1')])
WHERE product like '%washer%'
The following example appends a second entry to the repeated record in the
comments column for all rows:
UPDATE dataset.DetailedInventory
SET comments = ARRAY(
SELECT comment FROM UNNEST(comments) AS comment
UNION ALL
SELECT (CAST('2016-01-01' AS DATE), 'comment2')
)
WHERE true
SELECT product, comments FROM dataset.DetailedInventory
+----------------------+------------------------------------------------------------------------------------------------------+
| product | comments |
+----------------------+------------------------------------------------------------------------------------------------------+
| countertop microwave | [u'{"created":"2016-01-01","comment":"comment2"}'] |
| dishwasher | [u'{"created":"2016-01-01","comment":"comment1"}', u'{"created":"2016-01-01","comment":"comment2"}'] |
| dryer | [u'{"created":"2016-01-01","comment":"comment2"}'] |
| front load washer | [u'{"created":"2016-01-01","comment":"comment1"}', u'{"created":"2016-01-01","comment":"comment2"}'] |
| microwave | [u'{"created":"2016-01-01","comment":"comment2"}'] |
| oven | [u'{"created":"2016-01-01","comment":"comment2"}'] |
| refrigerator | [u'{"created":"2016-01-01","comment":"comment2"}'] |
| top load washer | [u'{"created":"2016-01-01","comment":"comment1"}', u'{"created":"2016-01-01","comment":"comment2"}'] |
+----------------------+------------------------------------------------------------------------------------------------------+
To delete repeated value entries, you can use WHERE ... NOT LIKE:
UPDATE dataset.DetailedInventory
SET comments = ARRAY(
SELECT c FROM UNNEST(comments) AS c
WHERE c.comment NOT LIKE '%comment2%'
)
WHERE true
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
| microwave | 20 | false | [] | NULL |
| oven | 5 | false | [] | NULL |
| refrigerator | 10 | false | [] | NULL |
| top load washer | 10 | false | [u'{"created":"2016-01-01","comment":"comment1"}'] | NULL |
+----------------------+----------+--------------------+----------------------------------------------------+----------------+
UPDATE statement using join between three tables
The following example sets supply_constrained to true for all products from
NewArrivals where the warehouse location is in 'WA' state.
UPDATE dataset.DetailedInventory
SET supply_constrained = true
FROM dataset.NewArrivals, dataset.Warehouse
WHERE DetailedInventory.product = NewArrivals.product AND
NewArrivals.warehouse = Warehouse.warehouse AND
Warehouse.state = 'WA'
Note that the join predicate for the join with the updated table
(DetailedInventory) must be specified using WHERE. However, joins between
the other tables (NewArrivals and Warehouse) can be specified using an
explicit JOIN ... ON clause. For example, the following query is equivalent
to the previous query:
UPDATE dataset.DetailedInventory
SET supply_constrained = true
FROM dataset.NewArrivals
INNER JOIN dataset.Warehouse
ON NewArrivals.warehouse = Warehouse.warehouse
WHERE DetailedInventory.product = NewArrivals.product AND
Warehouse.state = 'WA'
Before:
DetailedInventory +----------------------+----------+--------------------+----------+----------------+ | product | quantity | supply_constrained | comments | specifications | +----------------------+----------+--------------------+----------+----------------+ | countertop microwave | 20 | NULL | [] | NULL | | dishwasher | 30 | false | [] | NULL | | dryer | 30 | false | [] | NULL | | front load washer | 20 | false | [] | NULL | | microwave | 20 | false | [] | NULL | | oven | 5 | false | [] | NULL | | refrigerator | 10 | false | [] | NULL | | top load washer | 10 | false | [] | NULL | +----------------------+----------+--------------------+----------+----------------+ New arrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 200 | warehouse #2 | | oven | 300 | warehouse #3 | | top load washer | 100 | warehouse #1 | +-----------------+----------+--------------+ Warehouse +--------------+-------+ | warehouse | state | +--------------+-------+ | warehouse #1 | WA | | warehouse #2 | CA | | warehouse #3 | WA | +--------------+-------+
After:
+----------------------+----------+--------------------+----------+----------------+ | product | quantity | supply_constrained | comments | specifications | +----------------------+----------+--------------------+----------+----------------+ | countertop microwave | 20 | NULL | [] | NULL | | dishwasher | 30 | false | [] | NULL | | dryer | 30 | false | [] | NULL | | front load washer | 20 | false | [] | NULL | | microwave | 20 | false | [] | NULL | | oven | 5 | true | [] | NULL | | refrigerator | 10 | false | [] | NULL | | top load washer | 10 | true | [] | NULL | +----------------------+----------+--------------------+----------+----------------+
MERGE statement
A MERGE statement
is a DML statement that can combine INSERT, UPDATE, and DELETE
operations into a single statement and perform the operations atomically.
MERGE [INTO] target_name [[AS] alias]
USING source_name [[AS] alias]
ON merge_condition
{ when_clause } +
when_clause ::= matched_clause | not_matched_by_target_clause | not_matched_by_source_clause
matched_clause ::= WHEN MATCHED [ AND search_condition ] THEN { merge_update_clause | merge_delete_clause }
not_matched_by_target_clause ::= WHEN NOT MATCHED [BY TARGET] [ AND search_condition ] THEN merge_insert_clause
not_matched_by_source_clause ::= WHEN NOT MATCHED BY SOURCE [ AND search_condition ] THEN { merge_update_clause | merge_delete_clause }
merge_condition ::= bool_expression
search_condition ::= bool_expression
merge_update_clause ::= UPDATE SET update_item [, update_item]*
update_item ::= column_name = expression
merge_delete_clause ::= DELETE
merge_insert_clause ::= INSERT [(column_1 [, ..., column_n ])] input
input ::= VALUES (expr_1 [, ..., expr_n ]) | ROW
expr ::= expression | DEFAULT
Where:
target_nametarget_nameis the name of the table you’re changing.
source_namesource_nameis a table name or subquery.
merge_conditionA
MERGEstatement performs aJOINbetween the target and the source. Then, depending on the match status (row matched, only in source table, only in destination table), the correspondingWHENclause is executed. Themerge_conditionis used by theJOINto match rows between source and target tables. Depending on the combination ofWHENclauses, differentINNERandOUTERJOINtypes are applied.If the merge_condition is
FALSE, the query optimizer avoids using aJOIN. This optimization is referred to as a constant false predicate. A constant false predicate is useful when you perform an atomicDELETEon the target plus anINSERTfrom a source (DELETEwithINSERTis also known as aREPLACEoperation).If the columns used in the
merge_conditionboth containNULLvalues, specify something likeX = Y OR (X IS NULL AND Y IS NULL). This lets you avoid the case where the join is based on twoNULLvalues.NULL = NULLevaluates toNULLinstead ofTRUE, and creates duplicate rows in the results.
when_clauseThe
when_clausehas three options:MATCHED,NOT MATCHED BY TARGETandNOT MATCHED BY SOURCE. There must be at least onewhen_clausein eachMERGEstatement.Each
when_clausecan have an optionalsearch_condition. Thewhen_clauseis executed for a row if both themerge_conditionandsearch_conditionare satisfied. When there are multiple qualified clauses, only the firstwhen_clauseis executed for a row.
matched_clauseThe
matched_clausedefines how to update or delete a row in the target table if that row matches a row in the source table.If there is at least one
matched_clauseperforming anUPDATEoperation, a runtime error is returned when multiple rows from the source table match one row from the target table, and you are trying to update or delete that row in the target table.
not_matched_by_target_clause- The
not_matched_by_target_clausedefines how to insert into the target table if a row from source table does not match any row in the target table.
not_matched_by_source_clause- The
not_matched_by_source_clausedefines how to update or delete a row in the target table if that row does not match any row in the source table.
Omitting column names
- In the
not_matched_by_target_clause, when the column names of target table are omitted, all columns in the target table are included in ascending order based on their ordinal positions. Note that, if the target table is an ingestion-time partitioned table, column names must be specified. - In the
not_matched_by_target_clause,ROWcan be used to include all the columns of the source in the ascending sequence of their ordinal positions. None of the pseudocolumns of the source are included. For example, the pseudocolumn_PARTITIONTIMEis not included when the source is an ingestion-time partitioned table.
MERGE examples
Example 1
In the following example, the query adds new items from the Inventory table to the
DetailedInventory table. For items with low inventory, the
supply_constrained value is set to true, and comments are added.
MERGE dataset.DetailedInventory T
USING dataset.Inventory S
ON T.product = S.product
WHEN NOT MATCHED AND quantity < 20 THEN
INSERT(product, quantity, supply_constrained, comments)
VALUES(product, quantity, true, ARRAY<STRUCT<created DATE, comment STRING>>[(DATE('2016-01-01'), 'comment1')])
WHEN NOT MATCHED THEN
INSERT(product, quantity, supply_constrained)
VALUES(product, quantity, false)
These are the tables before you run the query:
Inventory +-------------------+----------+ | product | quantity | +-------------------+----------+ | dishwasher | 30 | | dryer | 30 | | front load washer | 20 | | microwave | 20 | | oven | 5 | | top load washer | 10 | +-------------------+----------+
DetailedInventory +----------------------+----------+--------------------+----------+----------------+ | product | quantity | supply_constrained | comments | specifications | +----------------------+----------+--------------------+----------+----------------+ | countertop microwave | 20 | NULL | [] | NULL | | front load washer | 20 | false | [] | NULL | | microwave | 20 | false | [] | NULL | | refrigerator | 10 | false | [] | NULL | +----------------------+----------+--------------------+----------+----------------+
This is the DetailedInventory table after you run the query:
DetailedInventory
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [] | NULL |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [] | NULL |
| microwave | 20 | false | [] | NULL |
| oven | 5 | true | [{"created":"2016-01-01","comment":"comment1"}] | NULL |
| refrigerator | 10 | false | [] | NULL |
| top load washer | 10 | true | [{"created":"2016-01-01","comment":"comment1"}] | NULL |
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
Example 2
In the following example, the query merges items from the NewArrivals table
into the Inventory table. If an item is already present in Inventory, the
query increments the quantity field. Otherwise, the query inserts a new row.
Assume that the default value for the supply_constrained column is set to
NULL.
MERGE dataset.Inventory T
USING dataset.NewArrivals S
ON T.product = S.product
WHEN MATCHED THEN
UPDATE SET quantity = T.quantity + S.quantity
WHEN NOT MATCHED THEN
INSERT (product, quantity) VALUES(product, quantity)
These are the tables before you run the query:
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 20 | warehouse #2 | | oven | 30 | warehouse #3 | | refrigerator | 25 | warehouse #2 | | top load washer | 10 | warehouse #1 | +-----------------+----------+--------------+
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | false | | dryer | 30 | false | | front load washer | 20 | false | | microwave | 20 | false | | oven | 5 | true | | top load washer | 10 | true | +-------------------+----------+--------------------+
This is the Inventory table after you run the query:
Inventory +-------------------+----------+--------------------+ | product | quantity | supply_constrained | +-------------------+----------+--------------------+ | dishwasher | 30 | false | | dryer | 50 | false | | front load washer | 20 | false | | microwave | 20 | false | | oven | 35 | true | | refrigerator | 25 | NULL | | top load washer | 20 | true | +-------------------+----------+--------------------+
Example 3
In the following example, the query increases the quantity of products from
warehouse #1 by 20 in the NewArrivals table. The query deletes all other
products except for those from warehouse #2.
MERGE dataset.NewArrivals T
USING (SELECT * FROM dataset.NewArrivals WHERE warehouse <> 'warehouse #2') S
ON T.product = S.product
WHEN MATCHED AND T.warehouse = 'warehouse #1' THEN
UPDATE SET quantity = T.quantity + 20
WHEN MATCHED THEN
DELETE
This is the NewArrivals table before you run the query:
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 20 | warehouse #2 | | oven | 30 | warehouse #3 | | refrigerator | 25 | warehouse #2 | | top load washer | 10 | warehouse #1 | +-----------------+----------+--------------+
This is the NewArrivals table after you run the query:
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 20 | warehouse #2 | | refrigerator | 25 | warehouse #2 | | top load washer | 30 | warehouse #1 | +-----------------+----------+--------------+
Example 4
In the following example, the query replaces all products like '%washer%' in
the Inventory table by using the values in the NewArrivals table.
MERGE dataset.Inventory T
USING dataset.NewArrivals S
ON FALSE
WHEN NOT MATCHED AND product LIKE '%washer%' THEN
INSERT (product, quantity) VALUES(product, quantity)
WHEN NOT MATCHED BY SOURCE AND product LIKE '%washer%' THEN
DELETE
These are the tables before you run the query:
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 20 | warehouse #2 | | refrigerator | 25 | warehouse #2 | | top load washer | 30 | warehouse #1 | +-----------------+----------+--------------+
Inventory +-------------------+----------+ | product | quantity | +-------------------+----------+ | dishwasher | 30 | | dryer | 50 | | front load washer | 20 | | microwave | 20 | | oven | 35 | | refrigerator | 25 | | top load washer | 20 | +-------------------+----------+
This is the Inventory table after you run the query:
Inventory +-----------------+----------+ | product | quantity | +-----------------+----------+ | dryer | 50 | | microwave | 20 | | oven | 35 | | refrigerator | 25 | | top load washer | 30 | +-----------------+----------+
Example 5
In the following example, the query adds a comment in the DetailedInventory
table if the product has a lower than average quantity in Inventory table.
MERGE dataset.DetailedInventory T
USING dataset.Inventory S
ON T.product = S.product
WHEN MATCHED AND S.quantity < (SELECT AVG(quantity) FROM dataset.Inventory) THEN
UPDATE SET comments = ARRAY_CONCAT(comments, ARRAY<STRUCT<created DATE, comment STRING>>[(CAST('2016-02-01' AS DATE), 'comment2')])
These are the tables before you run the query:
Inventory +-----------------+----------+ | product | quantity | +-----------------+----------+ | dryer | 50 | | microwave | 20 | | oven | 35 | | refrigerator | 25 | | top load washer | 30 | +-----------------+----------+
DetailedInventory
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [] | NULL |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [] | NULL |
| microwave | 20 | false | [] | NULL |
| oven | 5 | true | [{"created":"2016-01-01","comment":"comment1"}] | NULL |
| refrigerator | 10 | false | [] | NULL |
| top load washer | 10 | true | [{"created":"2016-01-01","comment":"comment1"}] | NULL |
+----------------------+----------+--------------------+-------------------------------------------------+----------------+
This is the DetailedInventory table after you run the query:
+----------------------+----------+--------------------+-----------------------------------------------------------------------------------------------+----------------+
| product | quantity | supply_constrained | comments | specifications |
+----------------------+----------+--------------------+-----------------------------------------------------------------------------------------------+----------------+
| countertop microwave | 20 | NULL | [] | NULL |
| dishwasher | 30 | false | [] | NULL |
| dryer | 30 | false | [] | NULL |
| front load washer | 20 | false | [] | NULL |
| microwave | 20 | false | [{"created":"2016-02-01","comment":"comment2"}] | NULL |
| oven | 5 | true | [{"created":"2016-01-01","comment":"comment1"}] | NULL |
| refrigerator | 10 | false | [{"created":"2016-02-01","comment":"comment2"}] | NULL |
| top load washer | 10 | true | [{"created":"2016-01-01","comment":"comment1"},{"created":"2016-02-01","comment":"comment2"}] | NULL |
+----------------------+----------+--------------------+-----------------------------------------------------------------------------------------------+----------------+
Example 6
In the following example, the query increases the inventory of products from the
warehouse in CA. The products from other states are deleted, and any product
that is not present in the NewArrivals table is unchanged.
MERGE dataset.Inventory T
USING (SELECT product, quantity, state FROM dataset.NewArrivals t1 JOIN dataset.Warehouse t2 ON t1.warehouse = t2.warehouse) S
ON T.product = S.product
WHEN MATCHED AND state = 'CA' THEN
UPDATE SET quantity = T.quantity + S.quantity
WHEN MATCHED THEN
DELETE
These are the tables before you run the query:
Warehouse +--------------+-------+ | warehouse | state | +--------------+-------+ | warehouse #1 | WA | | warehouse #2 | CA | | warehouse #3 | WA | +--------------+-------+
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 20 | warehouse #2 | | refrigerator | 25 | warehouse #2 | | top load washer | 30 | warehouse #1 | +-----------------+----------+--------------+
Inventory +-----------------+----------+ | product | quantity | +-----------------+----------+ | dryer | 50 | | microwave | 20 | | oven | 35 | | refrigerator | 25 | | top load washer | 30 | +-----------------+----------+
This is the Inventory table after you run the query:
Inventory +--------------+----------+ | product | quantity | +--------------+----------+ | dryer | 70 | | microwave | 20 | | oven | 35 | | refrigerator | 50 | +--------------+----------+
Example 7
In the following example, a runtime error is returned because the query attempts
to update a target table when the source contains more than one matched row. To
resolve the error, you need to change the merge_condition or search_condition
to avoid duplicate matches in the source.
MERGE dataset.Inventory T
USING dataset.NewArrivals S
ON T.product = S.product
WHEN MATCHED THEN
UPDATE SET quantity = T.quantity + S.quantity
These are the tables before you run the query:
NewArrivals +-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 10 | warehouse #2 | | dryer | 20 | warehouse #1 | | refrigerator | 25 | warehouse #2 | | top load washer | 30 | warehouse #1 | +-----------------+----------+--------------+
Inventory +--------------+----------+ | product | quantity | +--------------+----------+ | dryer | 70 | | microwave | 20 | | oven | 35 | | refrigerator | 50 | +--------------+----------+
When you run the query, the following error is returned:
UPDATE/MERGE must match at most one source row for each target row
Example 8
In the following example, all of the products in the NewArrivals table are
replaced with values from the subquery. The INSERT clause does not specify
column names for either the target table or the source subquery.
MERGE dataset.NewArrivals
USING (SELECT * FROM UNNEST([('microwave', 10, 'warehouse #1'),
('dryer', 30, 'warehouse #1'),
('oven', 20, 'warehouse #2')]))
ON FALSE
WHEN NOT MATCHED THEN
INSERT ROW
WHEN NOT MATCHED BY SOURCE THEN
DELETE
This is the NewArrivals table before you run the query:
+-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | dryer | 10 | warehouse #2 | | dryer | 20 | warehouse #1 | | refrigerator | 25 | warehouse #2 | | top load washer | 30 | warehouse #1 | +-----------------+----------+--------------+
This is the NewArrivals table after you run the query:
+-----------------+----------+--------------+ | product | quantity | warehouse | +-----------------+----------+--------------+ | microwave | 10 | warehouse #1 | | dryer | 30 | warehouse #1 | | oven | 20 | warehouse #2 | +-----------------+----------+--------------+
Tables used in examples
The example queries in this document use the following tables.
Inventory table
[
{"name": "product", "type": "string"},
{"name": "quantity", "type": "integer"},
{"name": "supply_constrained", "type": "boolean"}
]
DDL statement to create this table:
CREATE OR REPLACE TABLE
dataset.Inventory (product STRING,
quantity INT64,
supply_constrained BOOLEAN);
NewArrivals table
[
{"name": "product", "type": "string"},
{"name": "quantity", "type": "integer"},
{"name": "warehouse", "type": "string"}
]
DDL statement to create this table:
CREATE OR REPLACE TABLE
dataset.NewArrivals (product STRING,
quantity INT64,
warehouse STRING);
Warehouse table
[
{"name": "warehouse", "type": "string"},
{"name": "state", "type": "string"}
]
DDL statement to create this table:
CREATE OR REPLACE TABLE
dataset.Warehouse (warehouse STRING,
state STRING);
DetailedInventory table
[
{"name": "product", "type": "string"},
{"name": "quantity", "type": "integer"},
{"name": "supply_constrained", "type": "boolean"},
{"name": "comments", "type": "record", "mode": "repeated", "fields": [
{"name": "created", "type": "date"},
{"name": "comment", "type": "string"}
]},
{"name": "specifications", "type": "record", "fields": [
{"name": "color", "type": "string"},
{"name": "warranty", "type": "string"},
{"name": "dimensions", "type": "record", "fields": [
{"name": "depth", "type": "float"},
{"name": "height", "type": "float"},
{"name": "width", "type": "float"}
]}
]}
]
DDL statement to create this table:
CREATE OR REPLACE TABLE
dataset.DetailedInventory (product STRING,
quantity INT64,
supply_constrained BOOLEAN,
comments ARRAY<STRUCT<created DATE, comment STRING>>,
specifications STRUCT<color STRING, warranty STRING,
dimensions STRUCT<depth FLOAT64, height FLOAT64, width FLOAT64>>);