Test tables with assertions

This document shows you how to use Dataform core to create Dataform table assertions and test your workflow code.

About assertions

An assertion is a data quality test query that finds rows that violate one or more conditions specified in the query. If the query returns any rows, the assertion fails. Dataform runs assertions every time it updates your SQL workflow and it alerts you if any assertions fail.

Dataform automatically creates views in BigQuery that contain the results of compiled assertion queries. As configured in your workflow settings file, Dataform creates these views in an assertions schema where you can inspect assertion results.

For example, for the default dataform_assertions schema, Dataform creates a view in BigQuery in the following format: dataform_assertions.assertion_name.

You can create assertions for all Dataform table types: tables, incremental tables, views, and materialized views.

You can create assertions in the following ways:

Before you begin

  1. In the Google Cloud console, go to the Dataform page.

    Go to the Dataform page

  2. Select or create a repository.

  3. Select or create a development workspace.

  4. Create a table.

Required roles

To get the permissions that you need to create assertions, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on workspaces. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create built-in assertions

You can add built-in Dataform assertions to the config block of a table. Dataform runs these assertions after table creation. After Dataform creates the table, you can see if the assertion passed in the Workflow execution logs tab of your workspace.

You can create the following assertions in the config block of a table:

  • nonNull

    This condition asserts that the specified columns are not null across all table rows. This condition is used for columns that can never be null.

    The following code sample shows a nonNull assertion in the config block of a table:

config {
  type: "table",
  assertions: {
    nonNull: ["user_id", "customer_id", "email"]
  }
}
SELECT ...
  • rowConditions

    This condition asserts that all table rows follow the custom logic you define. Each row condition is a custom SQL expression, and each table row is evaluated against each row condition. The assertion fails if any table row results in false.

    The following code sample shows a custom rowConditions assertion in the config block of an incremental table:

config {
  type: "incremental",
  assertions: {
    rowConditions: [
      'signup_date is null or signup_date > "2022-08-01"',
      'email like "%@%.%"'
    ]
  }
}
SELECT ...
  • uniqueKey

    This condition asserts that, in a specified column, no table rows have the same value.

    The following code sample shows a uniqueKey assertion in the config block of a view:

config {
  type: "view",
  assertions: {
    uniqueKey: ["user_id"]
  }
}
SELECT ...
  • uniqueKeys

    This condition asserts that, in the specified columns, no table rows have the same value. The assertion fails if there is more than one row in the table with the same values for all the specified columns.

    The following code sample shows a uniqueKeys assertion in the config block of a table:

config {
  type: "table",
  assertions: {
    uniqueKeys: [["user_id"], ["signup_date", "customer_id"]]
  }
}
SELECT ...

Add assertions to the config block

To add assertions to the config block of a table, follow these steps:

  1. In your development workspace, in the Files pane, select a table definition SQLX file.
  2. In the config block of the table file, enter assertions: {}.
  3. Inside assertions: {}, add your assertions.
  4. Optional: Click Format.

The following code sample shows the conditions added in the config block:

config {
  type: "table",
  assertions: {
    uniqueKey: ["user_id"],
    nonNull: ["user_id", "customer_id"],
    rowConditions: [
      'signup_date is null or signup_date > "2019-01-01"',
      'email like "%@%.%"'
    ]
  }
}
SELECT ...

Create manual assertions with SQLX

Manual assertions are SQL queries that you write in a dedicated SQLX file. A manual assertion SQL query must return zero rows. If the query returns rows when executed, the assertion fails.

To add manual assertions in a new SQLX file, follow these steps:

  1. In the Files pane, next to definitions/, click the More menu.
  2. Click Create file.
  3. In the Add a file path field, enter the name of the file followed by .sqlx. For example, definitions/custom_assertion.sqlx.

    Filenames can only include numbers, letters, hyphens, and underscores.

  4. Click Create file.

  5. In the Files pane, click the new file.

  6. In the file, enter:

    config {
      type: "assertion"
    }
    
  7. Below the config block, write your SQL query or multiple queries.

  8. Optional: Click Format.

The following code sample shows a manual assertion in a SQLX file that asserts that fields A, B, and c are never NULL in sometable:

config { type: "assertion" }

SELECT
  *
FROM
  ${ref("sometable")}
WHERE
  a IS NULL
  OR b IS NULL
  OR c IS NULL

What's next