Deployment configuration samples

The config/config.yaml file, typically initialized from the config/config.yaml.example template, is the primary configuration for a Cortex Framework deployment. It defines the target Google Cloud execution project, the source and destination BigQuery datasets, and Dataform settings such as the repository and workspace names.

The following sections provide a detailed breakdown of the config/config.yaml structure.

Build environment

The build environment project is the Google Cloud project that is billed for build actions, such as BigQuery jobs that read the SAP DD03L table.

buildEnvironment:
  buildProjectId: YOUR_BUILD_PROJECT_ID

The following table describes the build environment parameters.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| buildEnvironment.buildProjectId | Build project ID | YOUR_BUILD_PROJECT_ID | Google Cloud project ID where build operations are executed. |
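
Because the template ships with placeholder values such as YOUR_BUILD_PROJECT_ID, a quick pre-deployment check for values that were never replaced can be useful. The following is a minimal sketch, not part of Cortex Framework: it assumes the file has already been parsed into a Python dictionary (for example, with a YAML parser), that placeholders follow the YOUR_... naming seen in this document, and the function name find_placeholders is hypothetical.

```python
import re


def find_placeholders(node, path=""):
    """Return dotted paths of string values that still look like YOUR_... placeholders."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_placeholders(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and re.fullmatch(r"YOUR_[A-Z0-9_]+", node):
        # Assumption: every template placeholder matches this naming pattern.
        hits.append(path)
    return hits


# Parsed configuration, shown inline so the sketch is self-contained.
config = {
    "buildEnvironment": {"buildProjectId": "YOUR_BUILD_PROJECT_ID"},
    "data": {"bigQueryLocation": "US"},
}

print(find_placeholders(config))
```

The same function can be run over the whole parsed file, so that source, target, and repository project IDs are covered by one pass.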

Data

The data: section of the configuration file defines your data sources, targets, and the specific modules for the data foundation and data products. Its general structure is as follows:

data:
  # Geographic location for BigQuery datasets (for example: US, EU, us-central1)
  # For the full list, see: https://docs.cloud.google.com/cortex/docs/supported-locations
  bigQueryLocation: US
  # List of namespaces for data foundation and product modules.
  namespaces:
    - name: cortex
      path: cortex
  # List of source datasets.
  sources:
    - ...
  # List of target datasets.
  targets:
    - ...

  # Configuration for data foundation and product modules.
  modules:
    # List of foundation modules.
    foundation:
    - ... 
    # List of data product modules.
    product:
    - ...

Data: BigQuery location

Defines the location of the BigQuery source and target datasets.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| data.bigQueryLocation | BigQuery location | US | BigQuery dataset location (for example, US, us-central1, or europe-west1). |

Data: Cortex namespace

Defines the Cortex Framework namespace.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| data.namespaces.name | Namespace name | - | Cortex Framework namespace name. For example, cortex. |
| data.namespaces.path | Namespace path | - | Cortex Framework namespace path for the subdirectories used within the src and config folders. For example, cortex. |

Data: BigQuery source and target datasets

The sources list defines the BigQuery datasets into which raw data from the source system has been replicated or streamed.

The targets list defines the BigQuery datasets where the datasets processed by Dataform are stored.

Modules reference each source and target by its unique ID.

# Data source and target mapping
sources:
  - id: sap_raw
    projectId: YOUR_SOURCE_PROJECT_ID
    datasetId: cortex_sap_raw

targets:
  - id: sap_foundation
    projectId: YOUR_TARGET_PROJECT_ID
    datasetId: cortex7_sap_data_foundation

The following table describes the data source and target mapping parameters.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| data.sources.id | Source ID | - | Defines the 'id' for the source dataset to pull data from. For example, sap_raw. |
| data.sources.projectId | Source project ID | YOUR_SOURCE_PROJECT_ID | References the Google Cloud project ID with source data. |
| data.sources.datasetId | Source BigQuery dataset ID | - | References the BigQuery dataset ID with source data. For example, cortex_sap_raw. |
| data.targets.id | Target ID | - | Defines the 'id' for the target dataset. For example, cortex_data_foundation. |
| data.targets.projectId | Target project ID | YOUR_TARGET_PROJECT_ID | References the Google Cloud project ID for the target data. |
| data.targets.datasetId | Target BigQuery dataset ID | - | References the BigQuery dataset ID for the target data. For example, cortex_sap_data_foundation. |
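
Since sources and targets are looked up by their ID, a duplicated ID would make a reference ambiguous. The sketch below shows one way to detect duplicates before deployment. It assumes the sources and targets lists are already parsed into Python lists of dictionaries; the helper name duplicate_ids and the second target entry (with its dataset name) are hypothetical and exist only to trigger the check. Whether Cortex Framework itself rejects duplicates is not stated here.

```python
from collections import Counter


def duplicate_ids(entries):
    """Return, in sorted order, every 'id' that appears more than once in the list."""
    counts = Counter(entry["id"] for entry in entries)
    return sorted(entry_id for entry_id, count in counts.items() if count > 1)


sources = [
    {"id": "sap_raw", "projectId": "YOUR_SOURCE_PROJECT_ID", "datasetId": "cortex_sap_raw"},
]
targets = [
    {"id": "sap_foundation", "projectId": "YOUR_TARGET_PROJECT_ID",
     "datasetId": "cortex7_sap_data_foundation"},
    # Hypothetical second entry that accidentally reuses the same ID.
    {"id": "sap_foundation", "projectId": "YOUR_TARGET_PROJECT_ID",
     "datasetId": "another_dataset"},
]

print(duplicate_ids(sources))
print(duplicate_ids(targets))
```

Sources and targets are checked separately here; whether the two lists may share an ID is not specified in this document, so the sketch does not assume either way.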

Data: Modules

The modules define the structure and components of the Dataform data pipelines.

Data: Modules: Foundation

This section configures the data foundation layer modules, which process data from the raw layer (CDC streams) into a standardized representation of the latest records of the source data. If the source provides a view of the latest records directly, or the source system connector performs such transformations, the module can be configured as an external data foundation source.

modules:
  # List of foundation modules.
  foundation:
    # Unique identifier for the module instance.
    - moduleId: erp
      # Type of the module (namespaced, for example, cortex.sap).
      type: cortex.sap
      # Reference to the source dataset ID.
      dataSourceId: sap_raw
      # Reference to the target dataset ID.
      dataTargetId: sap_foundation
      # Module-specific configuration settings.
      moduleSettings:
        # SAP version (for example, ecc, s4).
        sapVersion: ecc
        # SAP client number.
        mandt: "100"
      # Whether the module is enabled.
      # enabled: true
      # Whether the foundation is external (does not create target dataset).
      # external: false
      # Path to the table settings configuration file.
      # tableSettings: "config/cortex/data_foundation/sap/table_settings.yaml"

The following table describes the data foundation module parameters for the modules.foundation configuration.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| moduleId | Module identifier | erp | Unique identifier for a specific data foundation transformation module instance. |
| type | Module logic type | cortex.sap | Defines the business logic or template applied, as a namespaced type (for example, cortex.sap). |
| dataSourceId | Source link | sap_raw | References the 'id' from the data.sources list to pull data from. |
| dataTargetId | Target link | sap_foundation | References the 'id' from the data.targets list to push data to. |
| moduleSettings.sapVersion | SAP system version | ecc | Applicable for SAP data sources only. Determines source-specific logic for ecc (ECC) or s4 (S/4HANA) systems. |
| moduleSettings.mandt | SAP client (Mandant) | 100 | Applicable for SAP data sources only. The 3-digit SAP client identifier used to filter data rows. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| external | External foundation | false | Specifies whether the foundation is external (does not create target dataset). |
| tableSettings | Table settings | config/cortex/data_foundation/{source_system}/table_settings.yaml | Path to the table settings configuration file. |
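
The dataSourceId and dataTargetId values only work if they match an 'id' declared under sources and targets, and the sample quotes mandt so that it stays a string. The following sketch checks both points on an already-parsed data section. It is an illustration under assumptions, not Cortex Framework's own validation: the function name check_foundation_modules is hypothetical, and the sample deliberately contains a misspelled target ID and an unquoted mandt (which a YAML parser would read as the integer 100).

```python
import re


def check_foundation_modules(data):
    """Return human-readable problems found in the foundation module list."""
    source_ids = {source["id"] for source in data.get("sources", [])}
    target_ids = {target["id"] for target in data.get("targets", [])}
    problems = []
    for module in data.get("modules", {}).get("foundation", []):
        name = module.get("moduleId", "<missing moduleId>")
        if module.get("dataSourceId") not in source_ids:
            problems.append(f"{name}: unknown dataSourceId {module.get('dataSourceId')!r}")
        if module.get("dataTargetId") not in target_ids:
            problems.append(f"{name}: unknown dataTargetId {module.get('dataTargetId')!r}")
        # mandt applies to SAP sources only, so it is checked only when present.
        mandt = module.get("moduleSettings", {}).get("mandt")
        if mandt is not None and not (
            isinstance(mandt, str) and re.fullmatch(r"[0-9]{3}", mandt)
        ):
            problems.append(f"{name}: mandt should be a quoted 3-digit string, got {mandt!r}")
    return problems


data = {
    "sources": [{"id": "sap_raw"}],
    "targets": [{"id": "sap_foundation"}],
    "modules": {
        "foundation": [
            {
                "moduleId": "erp",
                "type": "cortex.sap",
                "dataSourceId": "sap_raw",
                "dataTargetId": "sap_fondation",  # deliberate typo for the demo
                "moduleSettings": {"sapVersion": "ecc", "mandt": 100},  # unquoted in YAML
            }
        ]
    },
}

for problem in check_foundation_modules(data):
    print(problem)
```

Quoting matters most for clients with leading zeros: an unquoted value may lose them when parsed as a number, which is why the check insists on a string.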

Data: Modules: Data products

Data product modules define the aggregations, calculations, and joins necessary to transform raw data into insights that fulfill specific business use cases.

The data product configuration sets a unique ID, declares dependencies, and references the data foundation module and the target dataset where the results are stored.

The detailed configuration of each data product is defined in the file referenced by the tableSettings key.

modules:
  # List of data product modules.
  product:
    # Unique identifier for the data product instance.
    - moduleId: sap_purchasing_organizations
      # Type of the data product (namespaced).
      type: cortex.purchasing_organizations
      # Map of module dependencies.
      dependsOn:
        sapModule: erp
      # Reference to the target dataset ID.
      dataTargetId: product_target
      # Whether the module is enabled.
      # enabled: true
      # Path to the table settings configuration file.
      # tableSettings: "config/cortex/data_product/purchasing_organizations/table_settings.yaml"

The following table describes the data product module parameters for the modules.product configuration.

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| moduleId | Module identifier | - | Unique identifier for a specific transformation module instance. |
| type | Module logic type | - | Defines the business logic or template applied, defined in the src/data_modules/{namespace}/data_product folder. |
| dataTargetId | Target link | sap_foundation | References the 'id' from the data.targets list to push data to. |
| dependsOn | Upstream dependency | sapModule: erp | Specifies which foundation module must exist before the product module can be built. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| tableSettings | Table settings | config/{namespace}/data_product/{data_product_name}/table_settings.yaml | Path to the table settings configuration file. |
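
Because a product module can only be built when the foundation module named in dependsOn exists, it can help to confirm that every dependency resolves to a declared foundation moduleId. The sketch below assumes the modules section is already parsed into a Python dictionary and that each dependsOn value names a foundation moduleId, as in the sample above. The function name check_product_dependencies, the second product entry, its type, and the crm module are hypothetical.

```python
def check_product_dependencies(modules):
    """Return problems for dependsOn entries that name no declared foundation module."""
    foundation_ids = {module["moduleId"] for module in modules.get("foundation", [])}
    problems = []
    for product in modules.get("product", []):
        # dependsOn is a map of dependency key -> foundation moduleId.
        for key, module_id in product.get("dependsOn", {}).items():
            if module_id not in foundation_ids:
                problems.append(
                    f"{product['moduleId']}: dependsOn.{key} "
                    f"references unknown module {module_id!r}"
                )
    return problems


modules = {
    "foundation": [{"moduleId": "erp", "type": "cortex.sap"}],
    "product": [
        {
            "moduleId": "sap_purchasing_organizations",
            "type": "cortex.purchasing_organizations",
            "dependsOn": {"sapModule": "erp"},
            "dataTargetId": "product_target",
        },
        {
            # Hypothetical entry whose dependency was never declared.
            "moduleId": "example_product",
            "type": "cortex.example",
            "dependsOn": {"sapModule": "crm"},
            "dataTargetId": "product_target",
        },
    ],
}

for problem in check_product_dependencies(modules):
    print(problem)
```

The sketch ignores the enabled flag; how a dependency on a disabled foundation module is treated is not described in this document.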

Deployment environment

Cortex Framework uses Dataform to orchestrate SQL transformations within BigQuery. The deployment: block defines the Dataform configuration used to execute the data pipelines: the repository project, the repository region, the repository name, and the Dataform workspace name.

deployment:
  targets:
    - type: dataform
      enabled: true
      targetSettings:
        repositoryProjectId: YOUR_REPO_PROJECT_ID
        repositoryRegion: us-central1
        repositoryName: cortex-repository
        workspaceName: dev

The following table describes the deployment target parameters (deployment.targets).

| Parameter | Meaning | Default value | Description |
| --- | --- | --- | --- |
| type | Deployment type | dataform | Type of the deployment target. |
| enabled | Enabled/disabled | true | Specifies whether the deployment target is enabled. |
| targetSettings.repositoryProjectId | Repository project ID | YOUR_REPO_PROJECT_ID | The Google Cloud project ID where the Dataform repository is managed. |
| targetSettings.repositoryRegion | Repository region | us-central1 | The Google Cloud region for the Dataform repository (for example, us-central1 or europe-west1). |
| targetSettings.repositoryName | Repository name | cortex-repository | The specific name of the Dataform repository. |
| targetSettings.workspaceName | Workspace name | dev | The specific Dataform workspace used for the deployment cycle. |
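
The four targetSettings values together identify one Dataform workspace. As an illustration, the sketch below combines them into the resource name format used by the Dataform API (projects/.../locations/.../repositories/.../workspaces/...) for every enabled Dataform target. It assumes the deployment block is already parsed into a Python dictionary; the function name dataform_workspaces is hypothetical, and whether Cortex Framework builds the name this way internally is an assumption.

```python
def dataform_workspaces(deployment):
    """Return workspace resource names for enabled Dataform deployment targets."""
    names = []
    for target in deployment.get("targets", []):
        # 'enabled' defaults to true according to the parameter table.
        if target.get("type") != "dataform" or not target.get("enabled", True):
            continue
        settings = target["targetSettings"]
        names.append(
            "projects/{repositoryProjectId}/locations/{repositoryRegion}"
            "/repositories/{repositoryName}/workspaces/{workspaceName}".format(**settings)
        )
    return names


deployment = {
    "targets": [
        {
            "type": "dataform",
            "enabled": True,
            "targetSettings": {
                "repositoryProjectId": "YOUR_REPO_PROJECT_ID",
                "repositoryRegion": "us-central1",
                "repositoryName": "cortex-repository",
                "workspaceName": "dev",
            },
        }
    ]
}

for name in dataform_workspaces(deployment):
    print(name)
```

Printing the resolved name before a deployment makes it easy to spot a wrong region or a placeholder project ID that was never replaced.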