Class DataProfileSpec (2.20.0)

DataProfileSpec(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Settings related to DataProfileScan.

Attributes

Name Description
sampling_percent float
Optional. The percentage of records to select from the dataset for the DataScan. The value can range from 0.0 to 100.0, with up to 3 significant decimal digits. Sampling is not applied if sampling_percent is not specified, or is set to 0 or 100.
row_filter str
Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10
post_scan_actions google.cloud.dataplex_v1.types.DataProfileSpec.PostScanActions
Optional. Actions to take upon job completion.
include_fields google.cloud.dataplex_v1.types.DataProfileSpec.SelectedFields
Optional. The fields to include in the data profile. If not specified, all fields present at the time the profile scan job executes are included, except for those listed in exclude_fields.
exclude_fields google.cloud.dataplex_v1.types.DataProfileSpec.SelectedFields
Optional. The fields to exclude from the data profile. If specified, these fields are excluded from the data profile regardless of the include_fields value.
catalog_publishing_enabled bool
Optional. If set, the latest DataScan job result will be published as Dataplex Universal Catalog metadata.
mode google.cloud.dataplex_v1.types.DataProfileSpec.Mode
Optional. The execution mode for the profile scan.
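
The sampling rules above can be illustrated with a minimal, standard-library-only sketch. The helper names `sampling_is_applied` and `build_profile_spec` are hypothetical and not part of the library; they only mirror the documented constraints on `sampling_percent` and `row_filter`, and they do not enforce the limit on decimal digits.

```python
def sampling_is_applied(sampling_percent=None):
    """Return True if sampling would apply under the documented rules.

    Hypothetical helper: sampling is skipped when the value is
    unspecified, 0, or 100; values outside 0.0-100.0 are rejected.
    """
    if sampling_percent is None:
        return False
    if not 0.0 <= sampling_percent <= 100.0:
        raise ValueError("sampling_percent must be between 0.0 and 100.0")
    return sampling_percent not in (0.0, 100.0)


def build_profile_spec(sampling_percent=None, row_filter=None):
    """Build a plain dict keyed by the attribute names listed above."""
    spec = {}
    if sampling_percent is not None:
        sampling_is_applied(sampling_percent)  # called only for range validation
        spec["sampling_percent"] = sampling_percent
    if row_filter:
        spec["row_filter"] = row_filter
    return spec


# Profile roughly 12.5% of the rows that match the WHERE-clause filter.
spec = build_profile_spec(
    sampling_percent=12.5,
    row_filter="col1 >= 0 AND col2 < 10",
)
```

Assuming the `google-cloud-dataplex` package is installed, a dict like `spec` could presumably be passed as the `mapping` argument shown in the constructor signature, for example `dataplex_v1.DataProfileSpec(spec)`.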

Classes

Mode

Mode(value)

Defines the execution mode for the profile scan.

    When this mode is selected, the following fields must not be
    set: `sampling_percent`, `row_filter`,
    `include_fields`, and `exclude_fields`.
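
As a rough sketch of the restriction above, a pre-flight check could report which of the four listed fields are set before the restricted mode is selected. The function name `fields_that_must_be_unset` is hypothetical, and the spec is modeled as a plain dict rather than a library message; the name of the mode value itself is deliberately not assumed here.

```python
# The four fields the documentation says must not be set in this mode.
RESTRICTED_FIELDS = (
    "sampling_percent",
    "row_filter",
    "include_fields",
    "exclude_fields",
)


def fields_that_must_be_unset(spec):
    """Return the restricted field names that carry a non-empty value.

    Assumption: a falsy value (None, 0, "", empty container) is treated
    the same as "not set".
    """
    return [name for name in RESTRICTED_FIELDS if spec.get(name)]


conflicts = fields_that_must_be_unset(
    {"row_filter": "col1 >= 0", "sampling_percent": 0}
)
```

An empty result would suggest the spec is compatible with the restricted mode; any returned names would need to be cleared first.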

PostScanActions

PostScanActions(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The configuration of post scan actions of DataProfileScan job.

SelectedFields

SelectedFields(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The specification for fields to include or exclude in data profile scan.
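
The interaction between `include_fields` and `exclude_fields` described in the attribute list can be sketched as follows. This is one reading of the documented precedence, not the service's implementation: field selections are modeled as plain lists of column names, and `effective_fields` is a hypothetical helper.

```python
def effective_fields(all_fields, include_fields=None, exclude_fields=None):
    """Return the fields that would be profiled, in table order.

    Documented behavior being modeled:
    - no include list: every field is a candidate;
    - exclude list: always wins, regardless of the include list.
    """
    included = set(include_fields) if include_fields else set(all_fields)
    excluded = set(exclude_fields or [])
    return [
        name for name in all_fields
        if name in included and name not in excluded
    ]


columns = ["id", "name", "email"]

# No selection: all fields are profiled.
everything = effective_fields(columns)

# "email" is both included and excluded, so exclusion takes precedence.
narrowed = effective_fields(
    columns,
    include_fields=["id", "email"],
    exclude_fields=["email"],
)
```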