Data product module creation
To define your own business logic and analytical models, create a custom data product module. This lets you run calculations on your foundation tables or upstream data products and package the results into deployable datasets.
Prerequisites
We recommend creating custom data product modules in a dedicated custom namespace for better lifecycle management. Also, ensure the source table you plan to use exists in the data foundation dataset.
Creation of a data product module
Data product module definition requires these steps:
- Registration of the data product module within the
config/config.yamlfile, by extending thedata.modules.productslist with entry:
[...]
data:
[...]
# Configuration for data foundation and product modules.
modules:
# List of foundation modules.
foundation:
[...]
# List of data product modules.
product:
[...]
- moduleId: product_module_id
type: custom_namespace.data_product_module_type
dependsOn:
sapModule: erp
sapModuleCustNS: foundation_module_id
dataTargetId: product_target
enabled: true
tableSettings: "table_settings.yaml"
# Optional, references file in `config/custom_namespace_path/data_product/data_product_module_type/`
# If omitted, defaults to src/data_modules/custom_namespace_path/data_product/table_settings.default.yaml.
[...]
- Creation of
tableSettingsfile (e.g.config/custom_namespace_path/data_product/data_product_module_type/table_settings.yaml).
This YAML controls table configurations like materializations and BigQuery optimization details:
common:
custom_sales_summary:
materialization_type: "table"
tags: ["custom", "sales", "reporting"]
partition_details:
column: "created_date"
partition_type: "date"
time_grain: "day"
cluster_details:
columns:
- "customer_id"
- Creation of annotation file
The annotation file <tablename>.yaml is created for each data product output artifacts (table, view) and describes columns and fields in YAML format. During compilation, the builder automatically search for annotations within the product's annotations/ folder (e.g. annotations/custom_sales_summary.yaml), merges these strings directly into the output Dataform schema definitions so they are preserved in BigQuery table metadata.
An annotation config/custom_namespace_path/data_product/data_product_module_type/annotations/'tablename'.yaml file has the format:
description: "Description of the table or view purpose"
fields:
- name: "customer_id" # column name
description: "Customer identifier" # column description
- name: "column2"
description: "Description of Column 2"
- name: "column3"
description: "Description of Column 3"
- Create a
manifest.yamlfile in your data product folderconfig/custom_namespace_path/data_product/data_product_module_type/, maintaining the type, tables and module dependencies. The manifest file follows this format:
type: sales_performance
builder: sap_product # Automatically resolves to the global SapProductBuilder fallback
dependencies:
sapModule:
type: sap
supportedVersions:
- ecc
- s4
Data product module example
For the flights example we are creating src/data_modules/custom_namespace_path/data_product/data_product_module_type/manifest.yaml with the content
type: data_product_module_type
dependencies:
sapModule:
type: cortex.sap
supported_versions:
- ecc
- s4
tables:
common:
- tcurr
sapModuleCustNS:
type: custom_namespace .sap
supported_versions:
- ecc
- s4
tables:
common:
- sflight
builder: sap_product
- In the next step, extend the referenced table settings file for data product tables.
In the used example, create: config/custom_namespace_path/data_product/data_product_module_type/table_settings.yaml with the content:
ecc:
flights_usd:
materializationType: incremental
tags: [sap, dataproduct, masterdata]
s4:
flights_usd:
materializationType: incremental
tags: [sap, dataproduct, masterdata]
- Create annotations for data product tables to enrich the storage schema with descriptions.
In the used example, create file: src/data_modules/custom_namespace_path/data_product/data_product_module_type/annotations/flights_usd.yaml with the content:
description: "Flight scheduling and pricing information, including currency conversion to USD."
fields:
- name: "client_mandt"
description: "Client (Mandant), PK"
- name: "airline_code_carrid"
description: "Airline Carrier ID, PK"
- name: "flight_connection_number_connid"
description: "Flight Number, PK"
- name: "flight_date_fldate"
description: "Flight Date"
- name: "price_usd"
description: "Price in USD"
- name: "price"
description: "Price in local currency"
- name: "currency"
description: "Local currency"
- The business logic of the data product is stored in
jsorsqlxfiles.
In given example create src/data_modules/custom_namespace_path/data_product/data_product_module_type/definitions/flights_usd.js file with the content:
// ___MODULE_CONTEXT___
// ___TABLE_CONFIG___
const moduleConfig = config.product[moduleContext.moduleId];
const sapModuleConfigDatasetId = moduleConfig.sources.sapModule.datasetId;
const sapModuleCustNSConfigDatasetId = moduleConfig.sources.sapModuleCustNS.datasetId;
const materializationType = tableConfig.materializationType || "incremental";
const incremental = require("includes/cortex/incremental.js");
const publish_config = require("includes/cortex/publish_config.js");
const publishConfig = publish_config.getPublishConfig(
materializationType,
tableConfig,
moduleConfig,
[
"client_mandt",
"airline_code_carrid",
"flight_connection_number_connid",
"flight_date_fldate"
]
);
publish("flight_usd", publishConfig).query(
(ctx) => `
WITH flight_base AS (
SELECT
mandt,
carrid,
connid,
fldate,
price,
currency,
-- Convert flight date string (YYYYMMDD) to an integer to calculate SAP's inverted date key
CAST(99999999 - CAST(fldate AS INT64) AS STRING) AS inverted_fldate
FROM ${ctx.ref(sapModuleCustNSConfigDatasetId, 'sflight')} AS flight
),
ranked_exchange_rates AS (
SELECT
f.mandt,
f.carrid,
f.connid,
f.fldate,
f.price,
f.currency,
t.ukurs,
-- Window function to grab the closest historical exchange rate
ROW_NUMBER() OVER (
PARTITION BY f.mandt, f.carrid, f.connid, f.fldate
ORDER BY t.gdatu ASC
) AS latest_rate_rank
FROM flight_base f
LEFT JOIN ${ctx.ref(sapModuleConfigDatasetId, 'tcurr')} AS t
ON f.mandt = t.mandt
AND t.kurst = 'M' -- 'M' is the standard SAP default for average exchange rates
AND t.fcurr = f.currency
AND t.tcurr = 'USD'
-- Chronological (rate_date <= flight_date) translates to (t.gdatu >= inverted_fldate)
AND t.gdatu >= f.inverted_fldate
)
SELECT
client_mandt,
airline_code_carrid,
flight_connection_number_connid,
flight_date_fldate,
price,
currency,
price_usd,
CURRENT_TIMESTAMP() AS bq_loaded_at
FROM (
SELECT
mandt AS client_mandt,
carrid AS airline_code_carrid,
connid AS flight_connection_number_connid,
PARSE_TIMESTAMP('%Y%m%d', fldate) AS flight_date_fldate,
price AS price,
currency AS currency,
-- Currency Conversion Logic
CASE
WHEN currency = 'USD' THEN price
WHEN ukurs IS NULL THEN NULL -- Handles cases where no exchange rate is found
-- If UKURS is negative, it's an indirect quotation (1 USD = X Local) -> Divide
WHEN ukurs < 0 THEN ROUND(price / ABS(ukurs), 2)
-- If UKURS is positive, it's a direct quotation (1 Local = X USD) -> Multiply
ELSE ROUND(price * ukurs, 2)
END AS price_usd
FROM ranked_exchange_rates
WHERE latest_rate_rank = 1
)
${incremental.getWhere(ctx, ["flight_date_fldate"])}
`
);
Verification of custom namespace extension
To verify the successful creation of Google Cloud Cortex Framework data product modules, follow these steps:
- Execute Cortex Framework deployment referencing the updated
config.yamlfile. - Follow the Post deployment steps to execute the Dataform actions and verify the results in BigQuery
- Previous step: Data foundation module creation
- Next step: Post deployment steps
- Back to Overview