Data foundation module creation

A data foundation module is required to process supplementary raw tables and include them in the data foundation dataset.

When creating a custom data foundation module, we recommend using a dedicated custom namespace to package it. Additionally, ensure the source table you plan to process exists in the raw layer dataset.

When working on SAP data foundation modules, ensure that the DD03L table has been replicated into the raw dataset that is configured as the source for your foundation module in the config/config.yaml file. Also ensure that the replicated DD03L table contains the field metadata records for every table you plan to ingest (for example, custom or supplementary tables such as sflight). The Cortex Framework build scripts and dependency resolver read these metadata rows to identify column lists, data types, and primary key relationships between tables.
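
As a rough pre-flight check, the DD03L requirement above can be sketched in Python. This is a minimal sketch, not part of the Cortex Framework: it assumes the DD03L rows have already been fetched into plain dictionaries keyed by the standard DD03L field names tabname, fieldname, and keyflag (the lower-case spelling is an assumption about your replication setup), and missing_tables and key_fields are hypothetical helper names. The sample rows are illustrative only.

```python
from typing import Iterable, List, Mapping


def missing_tables(dd03l_rows: Iterable[Mapping[str, str]],
                   required: Iterable[str]) -> List[str]:
    """Return the required tables that have no field metadata rows in DD03L."""
    present = {row["tabname"].lower() for row in dd03l_rows}
    return sorted(t.lower() for t in required if t.lower() not in present)


def key_fields(dd03l_rows: Iterable[Mapping[str, str]], table: str) -> List[str]:
    """Return the primary key field names of `table`, in the order the rows are given."""
    return [row["fieldname"].lower()
            for row in dd03l_rows
            if row["tabname"].lower() == table.lower()
            and row.get("keyflag") == "X"]  # DD03L marks key fields with 'X'


# Illustrative rows only; real rows come from the replicated DD03L table.
rows = [
    {"tabname": "SFLIGHT", "fieldname": "MANDT", "keyflag": "X"},
    {"tabname": "SFLIGHT", "fieldname": "CARRID", "keyflag": "X"},
    {"tabname": "SFLIGHT", "fieldname": "CONNID", "keyflag": "X"},
    {"tabname": "SFLIGHT", "fieldname": "FLDATE", "keyflag": "X"},
    {"tabname": "SFLIGHT", "fieldname": "PRICE", "keyflag": ""},
]

print(missing_tables(rows, ["sflight"]))
print(key_fields(rows, "sflight"))
```

An empty list from missing_tables suggests that metadata exists for every table you plan to ingest. It does not confirm that the metadata is complete.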

Data foundation module definition

To define the data foundation module, follow these steps:

  • In the config/config.yaml file, add the target dataset configuration to data.targets. This will prepare a dedicated BigQuery dataset into which the foundation tables will be deployed:
[...]
data:
  [...]
  targets:
    - id: data_foundation_sap_custom_namespace
      # Google Cloud Project ID for the target dataset.
      projectId: target_project_id
      # BigQuery dataset ID for the target.
      datasetId: data_foundation_sap_custom_namespace
  • To define the data foundation module, add the following to the modules.foundation section of the config/config.yaml file:
[...]
data:
  [...]
  modules:
    foundation:
      [...]
      - moduleId: foundation_module_id
        type: cortex.sap
        dataSourceId: sap_raw_s4
        dataTargetId: data_foundation_sap_custom_namespace
        moduleSettings:
          sapVersion: s4
          mandt: "100"
        tableSettings: "table_settings.yaml"
        # Optional. Path to custom table settings configuration relative to this config file, e.g., `config/custom_namespace_path/data_foundation/sap/table_settings.yaml`
        # If omitted, defaults to src/data_modules/cortex/data_foundation/sap/table_settings.default.yaml. 

  • Alternative: if you use an externally processed CDC table as the data foundation, set external: true in the modules.foundation section of the config/config.yaml file and remove dataTargetId:
[...]
data:
  [...]
  modules:
    foundation:
      [...]
      - moduleId: foundation_module_id
        type: cortex.sap
        dataSourceId: sap_raw_s4
        external: true
        moduleSettings:
          sapVersion: s4
          mandt: "100"
        tableSettings: "table_settings.yaml"
        # Optional. Path to custom table settings configuration relative to this config file, e.g., `config/custom_namespace_path/data_foundation/sap/table_settings.yaml`
        # If omitted, defaults to src/data_modules/cortex/data_foundation/sap/table_settings.default.yaml. 

  • Create the referenced table_settings.yaml file, which defines the tables from the raw layer dataset that are transformed into the data foundation layer:
common:
  - source:
      tableName: custom_sap_table_name
    target:
      tags: [sap, s4, hourly]
      clusterDetails:
        columns: [carrid, connid]
      partitionDetails:
        column: fldate
        partitionType: time
        timeGrain: day
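
The steps above depend on cross-references between sections of config/config.yaml that are easy to get wrong. The following Python snippet is a minimal sketch of a consistency check, not a Cortex Framework feature. It assumes the file has already been parsed into a dictionary (for example with PyYAML's yaml.safe_load). The function name check_foundation_modules and the module ID external_module_id are hypothetical.

```python
from typing import List


def check_foundation_modules(config: dict) -> List[str]:
    """Return human-readable problems found in an already parsed config."""
    data = config.get("data", {})
    target_ids = {target["id"] for target in data.get("targets", [])}
    problems = []
    for module in data.get("modules", {}).get("foundation", []):
        name = module.get("moduleId", "<unnamed>")
        target = module.get("dataTargetId")
        if module.get("external"):
            # External modules read an already processed table, so no target is expected.
            if target is not None:
                problems.append(f"{name}: external module should not set dataTargetId")
        elif target not in target_ids:
            problems.append(
                f"{name}: dataTargetId {target!r} is not defined in data.targets")
    return problems


# Mirrors the YAML fragments above; external_module_id is a made-up ID.
config = {
    "data": {
        "targets": [
            {"id": "data_foundation_sap_custom_namespace",
             "projectId": "target_project_id",
             "datasetId": "data_foundation_sap_custom_namespace"},
        ],
        "modules": {
            "foundation": [
                {"moduleId": "foundation_module_id",
                 "type": "cortex.sap",
                 "dataSourceId": "sap_raw_s4",
                 "dataTargetId": "data_foundation_sap_custom_namespace"},
                {"moduleId": "external_module_id",
                 "type": "cortex.sap",
                 "dataSourceId": "sap_raw_s4",
                 "external": True},
            ],
        },
    },
}

print(check_foundation_modules(config))
```

Running a check like this before deployment can surface a dataTargetId that does not match any id under data.targets.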

Data foundation module example

In the following example, we use the custom namespace sap_bookingdatamodel to create a new custom data foundation module that processes the sflight table.

Register the new data foundation module.

data:
  targets:
    - id: data_foundation_sap_bookingdatamodel
      projectId: target_project_id
      datasetId: data_foundation_sap_bookingdatamodel
  modules:
    foundation:
      - moduleId: sap_bookingdatamodel
        type: sap_bookingdatamodel.sap
        dataSourceId: sap_raw_s4
        dataTargetId: data_foundation_sap_bookingdatamodel
        moduleSettings:
          sapVersion: s4
          mandt: "100"
        tableSettings: "table_settings.yaml"

Create table_settings.yaml

Create the file config/sap_bookingdatamodel/data_foundation/sap/table_settings.yaml with this content:

common:
  - source:
      tableName: sflight
    target:
      tags: [sap, s4, hourly]
      clusterDetails:
        columns: [carrid, connid]
      partitionDetails:
        column: fldate
        partitionType: time
        timeGrain: day

Because the table has the same layout in the SAP ECC and S/4HANA dialects, it is listed as a child element of the common: section.

Create an annotation file for each table. Annotations enrich the table schemas with metadata.

In this example, create src/data_modules/sap_bookingdatamodel/data_foundation/sap/annotations/sflight.yaml with this content:

description: "transparent table FLIGHT, part of the Basis Components Module, BC-DWB-TND (Training and Demo)."
fields:
- name: "mandt"
  description: "Client"
- name: "carrid"
  description: "Airline Code"
- name: "connid"
  description: "Flight Connection Number"
- name: "fldate"
  description: "Flight date"
- name: "price"
  description: "Airfare"
- name: "currency"
  description: "Local currency of airline"
- name: "planetype"
  description: "Aircraft Type"
- name: "seatsmax"
  description: "Maximum capacity in economy class"
- name: "seatsocc"
  description: "Occupied seats in economy class"
- name: "paymentsum"
  description: "Total of current bookings"
- name: "seatsmax_b"
  description: "Maximum capacity in business class"
- name: "seatsocc_b"
  description: "Occupied seats in business class"
- name: "seatsmax_f"
  description: "Maximum capacity in first class"
- name: "seatsocc_f"
  description: "Occupied seats in first class"
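
As an optional sanity check, you may want to confirm that every column used for clustering or partitioning in table_settings.yaml also has an entry in the annotation file. The following Python snippet is a minimal sketch under that assumption. It is not something the framework is known to enforce, it works on already parsed dictionaries, and undocumented_columns is a hypothetical helper name. The annotation shown is abbreviated.

```python
from typing import List


def undocumented_columns(table_setting: dict, annotation: dict) -> List[str]:
    """Return cluster/partition columns that have no entry in the annotation fields."""
    target = table_setting.get("target", {})
    used = list(target.get("clusterDetails", {}).get("columns", []))
    partition_column = target.get("partitionDetails", {}).get("column")
    if partition_column:
        used.append(partition_column)
    documented = {field["name"] for field in annotation.get("fields", [])}
    return [column for column in used if column not in documented]


# Parsed form of the sflight entry in table_settings.yaml.
sflight_setting = {
    "source": {"tableName": "sflight"},
    "target": {
        "tags": ["sap", "s4", "hourly"],
        "clusterDetails": {"columns": ["carrid", "connid"]},
        "partitionDetails": {"column": "fldate",
                             "partitionType": "time",
                             "timeGrain": "day"},
    },
}

# Abbreviated parsed form of annotations/sflight.yaml.
sflight_annotation = {
    "fields": [
        {"name": "mandt", "description": "Client"},
        {"name": "carrid", "description": "Airline Code"},
        {"name": "connid", "description": "Flight Connection Number"},
        {"name": "fldate", "description": "Flight date"},
    ],
}

print(undocumented_columns(sflight_setting, sflight_annotation))
```

An empty result means that each clustering and partitioning column is described in the annotation file.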

To verify that the custom data foundation module compiles and deploys successfully, refer to the Verification section in the data product extensibility page.