Data foundation

The Cortex Framework data foundation layer is a standardized, clean representation of the latest records of the source data, and it feeds the data product layer. The layer is updated incrementally for sources with change data capture (CDC) enabled, and it uses views for sources that are not CDC-enabled or whose CDC is implemented externally. The implementation adapts to the source system's capabilities:

  • For CDC-enabled sources (Cortex Framework managed CDC): A dedicated Dataform pipeline incrementally processes raw layer logs into a continuously updated, persisted "Current State" table. The data foundation layer transforms these incremental changes into rows representing the current state of the source system dataset, powering data products, downstream analytics, and AI agents.

    The architecture is flexible: you can bypass the built-in CDC processing and connect other established CDC pipelines directly to the foundation layer.

  • For non-CDC-enabled sources (External CDC): When the replication tool or the source system performs the CDC, Cortex Framework skips its CDC pipelines and uses the landing zone dataset as the source for the data products. In this mode, the foundation layer acts as a view-based semantic abstraction layer that applies on-the-fly cleansing and shields downstream data products from schema changes.
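
The incremental path described for CDC-enabled sources can be illustrated with a minimal Python sketch. It is not the Dataform pipeline itself. It only shows the general idea of folding a batch of change records into a persisted current state. The field names operation_flag and recordstamp, the flag value "D" for deletes, and the function apply_changes are assumptions made for this example.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
Key = Tuple[Any, ...]


def apply_changes(
    state: Dict[Key, Row],
    changes: Iterable[Row],
    key_fields: List[str],
    ts_field: str = "recordstamp",      # assumed name of the change timestamp
    op_field: str = "operation_flag",   # assumed name of the operation marker
) -> Dict[Key, Row]:
    """Fold one batch of change records into the current state.

    Assumes batches arrive in order, so every change in the batch is
    newer than what the state already holds.
    """
    new_state = dict(state)
    # Replay the batch oldest-first so the latest change per key wins.
    for change in sorted(changes, key=lambda r: r[ts_field]):
        key = tuple(change[f] for f in key_fields)
        if change[op_field] == "D":
            # A delete removes the row from the current state.
            new_state.pop(key, None)
        else:
            # Inserts and updates overwrite the row; drop the operation marker.
            new_state[key] = {k: v for k, v in change.items() if k != op_field}
    return new_state


# Hypothetical usage: two batches against a company-code table.
batch_1 = [
    {"bukrs": "1000", "butxt": "Old name", "operation_flag": "I", "recordstamp": 1},
    {"bukrs": "2000", "butxt": "Other", "operation_flag": "I", "recordstamp": 2},
]
batch_2 = [
    {"bukrs": "1000", "butxt": "New name", "operation_flag": "U", "recordstamp": 3},
    {"bukrs": "2000", "butxt": "Other", "operation_flag": "D", "recordstamp": 4},
]
state = apply_changes({}, batch_1, ["bukrs"])
state = apply_changes(state, batch_2, ["bukrs"])
```

For the external CDC path, no such folding step is needed, because the landing zone dataset already holds the current rows.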

The Cortex Framework data foundation layer also supports dynamic table schemas: custom fields present in the raw layer are ingested automatically, without manual code changes to the underlying SQL models.
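
One way to picture a dynamic schema is a model whose column list is derived from the raw table at build time instead of being hard-coded. The sketch below is only an illustration of that idea, not the framework's own code. The function build_select, the table name, and the custom column zz_region are hypothetical.

```python
from typing import List


def build_select(raw_columns: List[str], table: str) -> str:
    """Build a SELECT statement from whatever columns the raw table has.

    Because the column list is an input, a custom field added at the
    source flows through without editing the SQL by hand.
    """
    column_list = ", ".join(f"`{name}`" for name in raw_columns)
    return f"SELECT {column_list} FROM `{table}`"


# Hypothetical raw schema that includes one custom field.
sql = build_select(["mandt", "bukrs", "zz_region"], "raw.t001")
```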

Additionally, to bridge the gap between technical data and business users, the Cortex Framework data foundation layer uses an extensive library of annotations (src/data_foundation/{foundation_name}/annotations/) that add human-readable descriptions to the table schema. For example, during the build process, Cortex Framework annotates a cryptic SAP table column such as bukrs with a description that carries readable business semantics, such as Company Code.
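
The annotation step can be sketched as a lookup that attaches descriptions to schema fields. The sketch is illustrative only. The real annotation file format is not described here, so the plain dictionary, the schema layout, and the function annotate_schema are assumptions.

```python
from typing import Dict, List

Field = Dict[str, str]


def annotate_schema(schema: List[Field], annotations: Dict[str, str]) -> List[Field]:
    """Return a copy of the schema with descriptions filled in.

    Columns without an annotation keep their existing description,
    or get an empty one if they had none.
    """
    return [
        {**field, "description": annotations.get(field["name"], field.get("description", ""))}
        for field in schema
    ]


# Hypothetical schema and annotation entries.
schema = [
    {"name": "bukrs", "type": "STRING"},
    {"name": "zz_region", "type": "STRING"},
]
annotations = {"bukrs": "Company Code"}
annotated = annotate_schema(schema, annotations)
```

In this sketch, a column without an annotation keeps an empty description rather than failing the build, so custom fields stay usable while they are undocumented.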

Supported source systems

The Cortex Framework data foundation layer supports the following source systems:

Raw replicated data from SAP ERP:

  • SAP ECC
  • SAP S/4HANA

For more information, see the prepare source data documentation for SAP ERP.