Class DataQualityRule (2.20.0)

DataQualityRule(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A rule captures data quality intent about a data source.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
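Because only one member of rule_type can be set at a time, it can help to check a plain dict before passing it as mapping. The following is a minimal sketch in pure Python, not part of the library: the helper name rule_type_members_set is hypothetical, and the member names are copied from the attribute list on this page.

```python
# Member field names of the rule_type oneof, as listed on this page.
RULE_TYPE_MEMBERS = frozenset({
    "range_expectation",
    "non_null_expectation",
    "set_expectation",
    "regex_expectation",
    "uniqueness_expectation",
    "statistic_range_expectation",
    "row_condition_expectation",
    "table_condition_expectation",
    "sql_assertion",
    "template_reference",
})


def rule_type_members_set(rule_mapping: dict) -> list:
    """Hypothetical helper: list the rule_type members present in a dict."""
    return sorted(key for key in rule_mapping if key in RULE_TYPE_MEMBERS)


# Exactly one rule_type member: unambiguous.
ok_rule = {"column": "price", "non_null_expectation": {}}

# Two rule_type members: only one of them can remain set on the message.
ambiguous_rule = {
    "column": "price",
    "non_null_expectation": {},
    "uniqueness_expectation": {},
}

print(rule_type_members_set(ok_rule))
print(rule_type_members_set(ambiguous_rule))
```

A caller could reject any dict for which the helper returns more than one name, rather than relying on the message silently keeping a single member.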

Attributes

Name Description
range_expectation google.cloud.dataplex_v1.types.DataQualityRule.RangeExpectation
Row-level rule which evaluates whether each column value lies within a specified range. This field is a member of oneof_ rule_type.
non_null_expectation google.cloud.dataplex_v1.types.DataQualityRule.NonNullExpectation
Row-level rule which evaluates whether each column value is non-null. This field is a member of oneof_ rule_type.
set_expectation google.cloud.dataplex_v1.types.DataQualityRule.SetExpectation
Row-level rule which evaluates whether each column value is contained by a specified set. This field is a member of oneof_ rule_type.
regex_expectation google.cloud.dataplex_v1.types.DataQualityRule.RegexExpectation
Row-level rule which evaluates whether each column value matches a specified regex. This field is a member of oneof_ rule_type.
uniqueness_expectation google.cloud.dataplex_v1.types.DataQualityRule.UniquenessExpectation
Row-level rule which evaluates whether each column value is unique. This field is a member of oneof_ rule_type.
statistic_range_expectation google.cloud.dataplex_v1.types.DataQualityRule.StatisticRangeExpectation
Aggregate rule which evaluates whether the column aggregate statistic lies within a specified range. This field is a member of oneof_ rule_type.
row_condition_expectation google.cloud.dataplex_v1.types.DataQualityRule.RowConditionExpectation
Row-level rule which evaluates whether each row in a table passes the specified condition. This field is a member of oneof_ rule_type.
table_condition_expectation google.cloud.dataplex_v1.types.DataQualityRule.TableConditionExpectation
Aggregate rule which evaluates whether the provided expression is true for a table. This field is a member of oneof_ rule_type.
sql_assertion google.cloud.dataplex_v1.types.DataQualityRule.SqlAssertion
Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails. This field is a member of oneof_ rule_type.
template_reference google.cloud.dataplex_v1.types.DataQualityRule.TemplateReference
Aggregate rule which references a rule template and provides the parameters to be substituted in the template. If any rows are returned, this rule fails. This field is a member of oneof_ rule_type.
column str
Optional. The unnested column which this rule is evaluated against.
ignore_null bool
Optional. Rows with null values automatically fail a rule unless ignore_null is true, in which case such rows are trivially considered passing. This field is only valid for the following rule types:
- RangeExpectation
- RegexExpectation
- SetExpectation
- UniquenessExpectation
dimension str
Optional. The dimension a rule belongs to. Results are also aggregated at the dimension level. Custom dimension name is supported with all uppercase letters and maximum length of 30 characters.
threshold float
Optional. The minimum ratio of **passing_rows / total_rows** required to pass this rule, with a range of [0.0, 1.0]. 0 indicates default value (i.e. 1.0). This field is only valid for row-level type rules.
name str
Optional. A mutable name for the rule.
- The name must contain only letters (a-z, A-Z), numbers (0-9), or hyphens (-).
- The maximum length is 63 characters.
- Must start with a letter.
- Must end with a number or a letter.
description str
Optional. Description of the rule. The maximum length is 1,024 characters.
suspended bool
Optional. Whether the Rule is active or suspended. Default is false.
attributes MutableMapping[str, str]
Optional. Map of attribute name and value linked to the rule. The rules to evaluate can be filtered based on attributes provided here and a filter expression provided in the DataQualitySpec.filter field.
rule_source google.cloud.dataplex_v1.types.DataQualityRule.RuleSource
Output only. Contains information about the source of the rule and its relationship with the BigQuery table, where applicable.
debug_queries MutableSequence[google.cloud.dataplex_v1.types.DataQualityRule.DebugQuery]
Optional. Specifies the debug queries for this rule. Currently, only one query is supported, but this may be expanded in the future.
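The constraints on name and the pass condition described for threshold can be sketched as local checks. This is an illustrative sketch only: is_valid_rule_name and row_level_rule_passes are hypothetical helpers, not library functions, and the handling of an empty table is an assumption because this reference does not describe it.

```python
import re

# name: letters, digits, or hyphens; starts with a letter; ends with a letter
# or digit; at most 63 characters (1 + up to 61 + 1).
_RULE_NAME_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_rule_name(name: str) -> bool:
    """Hypothetical client-side check of the documented name constraints."""
    return _RULE_NAME_RE.fullmatch(name) is not None


def row_level_rule_passes(passing_rows: int, total_rows: int,
                          threshold: float = 0.0) -> bool:
    """Hypothetical helper: compare passing_rows / total_rows to threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within [0.0, 1.0]")
    if total_rows <= 0:
        # Assumption: the empty-table case is not described in this reference.
        raise ValueError("total_rows must be positive")
    # A threshold of 0 means the default value, 1.0.
    effective = 1.0 if threshold == 0 else threshold
    return passing_rows / total_rows >= effective


print(is_valid_rule_name("price-in-range"))   # True
print(is_valid_rule_name("price-"))           # False: ends with a hyphen
print(row_level_rule_passes(95, 100, 0.9))    # True: 0.95 >= 0.9
print(row_level_rule_passes(95, 100))         # False: default threshold is 1.0
```

Since name is optional, a caller would likely run the name check only when a name is actually provided.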

Classes

AttributesEntry

AttributesEntry(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The abstract base class for a message.

Parameters
Name Description
kwargs dict

Keys and values corresponding to the fields of the message.

mapping Union[dict, .Message]

A dictionary or message to be used to determine the values for this message.

ignore_unknown_fields Optional(bool)

If True, do not raise errors for unknown fields. Only applied if mapping is a mapping type or there are keyword parameters.

DebugQuery

DebugQuery(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Specifies a SQL statement that is evaluated to return up to 10 scalar values that are used to debug rules. If the rule fails, the values can help diagnose the cause of the failure.

The SQL statement must use GoogleSQL syntax (https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax) and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter (https://cloud.google.com/dataplex/docs/auto-data-quality-overview#data-reference-parameter).

You can also name results with an explicit alias using [AS] alias. For more information, see BigQuery explicit aliases (https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#explicit_alias_syntax).

Example: SELECT MIN(col1) AS min_col1, MAX(col1) AS max_col1 FROM ${data()}

NonNullExpectation

NonNullExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value is non-null.

RangeExpectation

RangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value lies within a specified range.

RegexExpectation

RegexExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value matches a specified regex.

RowConditionExpectation

RowConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each row passes the specified condition.

The SQL expression must use GoogleSQL syntax (https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax) and should produce one boolean value per row.

Example: col1 >= 0 AND col2 < 10
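To make the per-row semantics concrete, the sketch below evaluates the example condition against a small in-memory table. It uses SQLite from the Python standard library purely as a local stand-in: SQLite is not GoogleSQL, the table name sample and its rows are invented for illustration, and the counting query is an assumption about how a pass ratio could be derived, not the query the service actually runs.

```python
import sqlite3

# Local stand-in table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [(0, 5), (3, 9), (-1, 2), (4, 10)],
)

# The row condition from the example above: one boolean result per row.
row_condition = "col1 >= 0 AND col2 < 10"

# Count the rows for which the condition holds, and the total row count.
passing_rows, total_rows = conn.execute(
    "SELECT COALESCE(SUM(CASE WHEN " + row_condition + " THEN 1 ELSE 0 END), 0), "
    "COUNT(*) FROM sample"
).fetchone()

print(passing_rows, total_rows)  # 2 4
```

Here two of the four rows satisfy the condition, so a rule with the default threshold would not pass on this data.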

RuleSource

RuleSource(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Represents the rule source information from Catalog.

SetExpectation

SetExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value is contained by a specified set.

SqlAssertion

SqlAssertion(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.

The SQL statement must use GoogleSQL syntax (https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax) and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter (https://cloud.google.com/dataplex/docs/auto-data-quality-overview#data-reference-parameter).

Example: SELECT * FROM ${data()} WHERE price < 0
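The "any returned row means failure" behavior can be illustrated locally. The sketch below is an assumption-laden stand-in, not the service's implementation: it substitutes ${data()} with a plain table name by string replacement, runs the statement in SQLite (which is not GoogleSQL), and uses a hypothetical helper assertion_fails and an invented orders table.

```python
import sqlite3

DATA_REF = "${data()}"


def assertion_fails(conn: sqlite3.Connection, sql_statement: str, table: str) -> bool:
    """Hypothetical helper: True if the statement returns at least one row."""
    if ";" in sql_statement:
        raise ValueError("statement must not contain semicolons")
    # Local stand-in for the data reference parameter; the table name is
    # assumed to be trusted here.
    query = sql_statement.replace(DATA_REF, table)
    return conn.execute(query).fetchone() is not None


# Illustrative data: one valid row and one row in an invalid state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, price REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, -3.0)])

statement = "SELECT * FROM ${data()} WHERE price < 0"
failed = assertion_fails(conn, statement, "orders")
print(failed)  # True: a row with a negative price is returned
```

Writing the statement so that it selects only offending rows keeps the rule's meaning simple: an empty result is a pass.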

StatisticRangeExpectation

StatisticRangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the column aggregate statistic lies within a specified range.

TableConditionExpectation

TableConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the provided expression is true.

The SQL expression must use GoogleSQL syntax (https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax) and should produce a scalar boolean result.

Example: MIN(col1) >= 0

TemplateReference

TemplateReference(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A rule that constructs a SQL statement to evaluate using a rule template and parameter values. If the constructed statement returns any rows, this rule fails.

UniquenessExpectation

UniquenessExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the column has duplicates.