Deployment configuration samples
The config/config.yaml file — typically initialized from the
config/config.yaml.example template — serves as the primary configuration
for the Cortex Framework deployment. It defines critical
parameters including the target Google Cloud execution project,
source and destination BigQuery datasets, and Dataform
specifications such as repository and workspace names.
The following sections provide a detailed breakdown of the
config/config.yaml structure.
Build environment
The build environment project is the project billed for build actions,
such as BigQuery jobs (for example, reading the SAP DD03L table).
buildEnvironment:
  buildProjectId: YOUR_BUILD_PROJECT_ID
The following table describes the build environment parameters.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| buildEnvironment.buildProjectId | Build project ID | YOUR_BUILD_PROJECT_ID | Google Cloud Project ID where build operations are executed. |
Data
The data: section of the configuration file defines your data sources,
targets, and the specific modules for the data foundation and data products.
Its general structure is as follows:
data:
  # Geographic location for BigQuery datasets (for example: US, EU, us-central1)
  # For full list see: https://docs.cloud.google.com/cortex/docs/supported-locations
  bigQueryLocation: US
  # List of namespaces for data foundation and product modules.
  namespaces:
    - name: cortex
      path: cortex
  # List of source datasets.
  sources:
    - ...
  # List of target datasets.
  targets:
    - ...
  # Configuration for data foundation and product modules.
  modules:
    # List of foundation modules.
    foundation:
      - ...
    # List of data product modules.
    product:
      - ...
Data: BigQuery location
Defines the location of the BigQuery source and target datasets.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.bigQueryLocation | BigQuery Location | US | BigQuery dataset location (for example, US, us-central1, or europe-west1). |
Data: Cortex namespace
Defines the Cortex Framework namespace.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.namespaces.name | Namespace name | - | Cortex Framework namespace name. For example, cortex. |
| data.namespaces.path | Namespace path | - | Cortex Framework namespace path for subdirectories used within the src and config folders. For example, cortex. |
Data: BigQuery sources and target datasets
The sources list defines the BigQuery datasets into which raw data from the source system has been replicated or streamed.
The targets list defines the BigQuery datasets where the datasets processed by Dataform are stored.
Modules reference each source and target by its unique ID.
# Data source and target mapping
sources:
  - id: sap_raw
    projectId: YOUR_SOURCE_PROJECT_ID
    datasetId: cortex_sap_raw
targets:
  - id: sap_foundation
    projectId: YOUR_TARGET_PROJECT_ID
    datasetId: cortex7_sap_data_foundation
The following table describes the data source and target mapping parameters.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| data.sources.id | Source ID | - | Defines the 'id' for the source dataset to pull data from. For example, sap_raw. |
| data.sources.projectId | Source Project ID | YOUR_SOURCE_PROJECT_ID | References the Google Cloud Project ID with source data. |
| data.sources.datasetId | Source BigQuery Dataset ID | - | References the BigQuery Dataset ID with source data. For example, cortex_sap_raw. |
| data.targets.id | Target ID | - | Defines the 'id' for the target dataset. For example, cortex_data_foundation. |
| data.targets.projectId | Target Project ID | YOUR_TARGET_PROJECT_ID | References the Google Cloud Project ID for the target data. |
| data.targets.datasetId | Target BigQuery Dataset ID | - | References the BigQuery Dataset ID for the target data. For example, cortex_sap_data_foundation. |
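Because modules refer to sources and targets by ID, it can help to confirm that the IDs in each list are unique before deploying. The following is a minimal sketch, not part of Cortex Framework: the `data` dictionary is a hand-written stand-in for the parsed `data.sources` and `data.targets` lists, and the helper names `duplicate_ids` and `fully_qualified` are hypothetical. Checking uniqueness within each list separately is also an assumption.

```python
from collections import Counter

# Hypothetical in-memory copy of the data.sources and data.targets lists
# from the sample above (not loaded from a real config file).
data = {
    "sources": [
        {
            "id": "sap_raw",
            "projectId": "YOUR_SOURCE_PROJECT_ID",
            "datasetId": "cortex_sap_raw",
        },
    ],
    "targets": [
        {
            "id": "sap_foundation",
            "projectId": "YOUR_TARGET_PROJECT_ID",
            "datasetId": "cortex7_sap_data_foundation",
        },
    ],
}


def duplicate_ids(entries):
    """Return the sorted 'id' values that appear more than once in a list."""
    counts = Counter(entry["id"] for entry in entries)
    return sorted(entry_id for entry_id, n in counts.items() if n > 1)


def fully_qualified(entry):
    """Format an entry as a BigQuery 'project.dataset' reference."""
    return f"{entry['projectId']}.{entry['datasetId']}"


for kind in ("sources", "targets"):
    print(kind, "duplicate IDs:", duplicate_ids(data[kind]))
    for entry in data[kind]:
        print(f"  {entry['id']} -> {fully_qualified(entry)}")
```

An empty duplicate list for both sources and targets means every ID can be referenced unambiguously from a module.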
Data: Modules
The modules define the structure and components of the Dataform data pipelines.
Data: Modules: Foundation
This section configures the data foundation layer modules, which process data from the raw layer (CDC streams) into a standardized representation of the latest records in the source data. If the source already provides a view of the latest records, or if the source system connector performs these transformations, the module can be configured as an external data foundation source.
modules:
  # List of foundation modules.
  foundation:
    # Unique identifier for the module instance.
    - moduleId: erp
      # Type of the module (namespaced, for example, cortex.sap).
      type: cortex.sap
      # Reference to the source dataset ID.
      dataSourceId: sap_raw
      # Reference to the target dataset ID.
      dataTargetId: sap_foundation
      # Module-specific configuration settings.
      moduleSettings:
        # SAP version (for example, ecc, s4).
        sapVersion: ecc
        # SAP client number.
        mandt: "100"
      # Whether the module is enabled.
      # enabled: true
      # Whether the foundation is external (does not create target dataset).
      # external: false
      # Path to the table settings configuration file.
      # tableSettings: "config/data_foundation/sap/table_settings.yaml"
The following table describes the data foundation module parameters for the
modules.foundation configuration.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| moduleId | Module Identifier | erp | Unique identifier for a specific data foundation transformation module instance. |
| type | Module Logic Type | cortex.sap | Defines the business logic or template applied (for example, customers, sales_documents). |
| dataSourceId | Source Link | sap_raw | References the 'id' from the data.sources list to pull data from. |
| dataTargetId | Target Link | sap_foundation | References the 'id' from the targets list to push data to. |
| moduleSettings.sapVersion | SAP System Version | ecc | Applicable for SAP data sources only. Determines source-specific logic for ecc (ECC) or s4 (S/4HANA) systems. |
| moduleSettings.mandt | SAP Client (Mandant) | 100 | Applicable for SAP data sources only. The 3-digit SAP client identifier used to filter data rows. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| external | External foundation | false | Specifies whether the foundation is external (does not create target dataset). |
| tableSettings | Table settings | config/cortex/data_foundation/{source_system}/table_settings.yaml | Path to the table settings configuration file. |
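The links described in this table can be checked mechanically. The sketch below is illustrative only and is not a Cortex Framework API: `check_foundation_modules` is a hypothetical helper, the `config` dictionary is a hand-written stand-in for the parsed file, and treating both `dataSourceId` and `dataTargetId` as required for every module (including external ones) is an assumption.

```python
import re

# Hypothetical in-memory copy of the relevant parts of config/config.yaml.
config = {
    "data": {
        "sources": [{"id": "sap_raw"}],
        "targets": [{"id": "sap_foundation"}],
        "modules": {
            "foundation": [
                {
                    "moduleId": "erp",
                    "type": "cortex.sap",
                    "dataSourceId": "sap_raw",
                    "dataTargetId": "sap_foundation",
                    "moduleSettings": {"sapVersion": "ecc", "mandt": "100"},
                }
            ]
        },
    }
}


def check_foundation_modules(config):
    """Return a list of problems found; an empty list means none were found."""
    data = config["data"]
    source_ids = {source["id"] for source in data.get("sources", [])}
    target_ids = {target["id"] for target in data.get("targets", [])}
    problems = []
    for module in data.get("modules", {}).get("foundation", []):
        name = module.get("moduleId", "<missing moduleId>")
        source_id = module.get("dataSourceId")
        target_id = module.get("dataTargetId")
        if source_id not in source_ids:
            problems.append(f"{name}: dataSourceId {source_id!r} is not in data.sources")
        if target_id not in target_ids:
            problems.append(f"{name}: dataTargetId {target_id!r} is not in data.targets")
        settings = module.get("moduleSettings", {})
        version = settings.get("sapVersion")
        if version is not None and version not in ("ecc", "s4"):
            problems.append(f"{name}: sapVersion {version!r} is not ecc or s4")
        mandt = settings.get("mandt")
        # str() covers the case where an unquoted value was loaded as an integer.
        if mandt is not None and not re.fullmatch(r"[0-9]{3}", str(mandt)):
            problems.append(f"{name}: mandt {mandt!r} is not a 3-digit client")
    return problems


print(check_foundation_modules(config))
```

Quoting the client number in YAML, as the sample does with "100", keeps it a string so that leading zeros survive parsing.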
Data: Modules: Data products
Data product modules define the aggregations, calculations, and joins necessary to transform raw data into insights that fulfill specific business use cases.
Each data product configuration sets a unique ID, declares dependencies on data foundation modules, and references the target dataset where the results are stored.
The detailed configuration of each data product is defined in the file referenced
by the tableSettings key.
modules:
  # List of data product modules.
  product:
    # Unique identifier for the data product instance.
    - moduleId: sap_purchasing_organizations
      # Type of the data product (namespaced).
      type: cortex.purchasing_organizations
      # Map of module dependencies.
      dependsOn:
        sapModule: erp
      # Reference to the target dataset ID.
      dataTargetId: product_target
      # Whether the module is enabled.
      # enabled: true
      # Path to the table settings configuration file.
      # tableSettings: "config/cortex/data_product/purchasing_organizations/table_settings.yaml"
The following table describes the data product modules parameters for
modules.product configuration.
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| moduleId | Module Identifier | - | Unique identifier for a specific transformation module instance. |
| type | Module Logic Type | - | Defines the business logic or template applied, defined in src/data_modules/{namespace}/data_product folder. |
| dataTargetId | Target Link | sap_foundation | References the 'id' from the targets list to push data to. |
| dependsOn | Upstream Dependency | sapModule: erp | Specifies which foundation module must exist before the product module can be built. |
| enabled | Module enablement | true | Specifies whether the module is enabled. |
| tableSettings | Table settings | "config/{namespace}/data_product/data_product_name/table_settings.yaml" | Path to the table settings configuration file. |
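Since dependsOn names a foundation module that must exist, a quick consistency check can catch a misspelled module ID early. This is a rough sketch under stated assumptions, not the framework's own validation: `unresolved_dependencies` is a hypothetical helper, the `modules` dictionary is a hand-written stand-in for the parsed `data.modules` section, and skipping products with `enabled: false` is an assumption about the intended behavior.

```python
# Hypothetical in-memory copy of the data.modules section.
modules = {
    "foundation": [{"moduleId": "erp", "type": "cortex.sap"}],
    "product": [
        {
            "moduleId": "sap_purchasing_organizations",
            "type": "cortex.purchasing_organizations",
            "dependsOn": {"sapModule": "erp"},
            "dataTargetId": "product_target",
        }
    ],
}


def unresolved_dependencies(modules):
    """Return (product moduleId, dependsOn key, referenced moduleId) tuples
    for every dependency that does not match a foundation moduleId."""
    foundation_ids = {m["moduleId"] for m in modules.get("foundation", [])}
    missing = []
    for product in modules.get("product", []):
        # Assumption: a product with enabled: false is not built, so skip it.
        if not product.get("enabled", True):
            continue
        for key, module_id in product.get("dependsOn", {}).items():
            if module_id not in foundation_ids:
                missing.append((product["moduleId"], key, module_id))
    return missing


print(unresolved_dependencies(modules))
```

An empty result means every enabled product module points at a foundation module that is present in the configuration.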
Deployment environment
Cortex Framework uses Dataform to orchestrate SQL
transformations within BigQuery. The deployment:
block defines the Dataform configuration, responsible for the execution
of the data pipelines, including the repository project, location, repository
name, and the Dataform workspace name.
deployment:
  targets:
    - type: dataform
      enabled: true
      targetSettings:
        repositoryProjectId: YOUR_REPO_PROJECT_ID
        repositoryRegion: us-central1
        repositoryName: cortex-repository
        workspaceName: dev
The following table describes the deployment target parameters
(deployment.targets).
| Parameter | Meaning | Default value | Description |
|---|---|---|---|
| type | Deployment type | dataform | Type of the deployment target. |
| enabled | Enabled/Disabled | true | Specifies whether the deployment target is enabled. |
| targetSettings.repositoryProjectId | Repository project ID | YOUR_REPO_PROJECT_ID | The Google Cloud Project ID where the Dataform repository is managed. |
| targetSettings.repositoryRegion | Repository region | us-central1 | The Google Cloud region for the Dataform repository (for example, us-central1 or europe-west1). |
| targetSettings.repositoryName | Repository name | cortex-repository | The specific name of the Dataform repository. |
| targetSettings.workspaceName | Workspace name | dev | The specific Dataform workspace used for the deployment cycle. |
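The four targetSettings values together identify one Dataform workspace. The Dataform API addresses workspaces with resource names of the form projects/.../locations/.../repositories/.../workspaces/..., so the sketch below composes that name from the settings to show how they relate. Whether Cortex Framework builds the name this way internally is an assumption, and `workspace_resource_names` is a hypothetical helper, not part of the framework.

```python
# Hypothetical in-memory copy of the deployment block from the sample above.
deployment = {
    "targets": [
        {
            "type": "dataform",
            "enabled": True,
            "targetSettings": {
                "repositoryProjectId": "YOUR_REPO_PROJECT_ID",
                "repositoryRegion": "us-central1",
                "repositoryName": "cortex-repository",
                "workspaceName": "dev",
            },
        }
    ]
}


def workspace_resource_names(deployment):
    """Return the Dataform workspace resource name of each enabled dataform target."""
    names = []
    for target in deployment.get("targets", []):
        # Only enabled targets of type dataform are considered.
        if target.get("type") != "dataform" or not target.get("enabled", True):
            continue
        settings = target["targetSettings"]
        names.append(
            f"projects/{settings['repositoryProjectId']}"
            f"/locations/{settings['repositoryRegion']}"
            f"/repositories/{settings['repositoryName']}"
            f"/workspaces/{settings['workspaceName']}"
        )
    return names


for name in workspace_resource_names(deployment):
    print(name)
```

With the placeholder values from the sample, the printed name ends in /repositories/cortex-repository/workspaces/dev.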