Create a data product module

To define your own business logic and analytical models, create a custom data product module. This lets you run calculations on foundation tables or upstream data products and package the results into a deployable dataset.

Prerequisites

We recommend creating custom data product modules in a dedicated custom namespace for better lifecycle management. Also make sure that the source tables you want to use exist in the data foundation dataset.

Create a data product module

Defining a data product module involves the following steps:

  • Register the data product module in the config/config.yaml file by extending the data.modules.product list with an entry:
data:
  # Configuration for data foundation and product modules.
  modules:
    # List of data product modules.
    product:
        # Recommended naming for product_module_id:
        # custom_namespace_data_product_module_type
      - moduleId: product_module_id
        # Type of the data product (namespaced).
        type: custom_namespace.data_product_module_type
        # Map of module dependencies.
        dependsOn:
          sapModule: erp
          sapModuleCustNS: foundation_module_id
        # Reference to the target dataset ID.
        dataTargetId: product_target
        # Whether the module is enabled.
        # enabled: true
        # Whether the foundation is external (does not create target dataset).
        # external: false
        # Custom table settings file, relative to 'config/' file directory
        # Recommended path: '{custom_namespace}/data_product/{data_product_module_type}/table_settings.yaml'
        # If omitted, defaults to '../src/data_modules/{custom_namespace}/data_product/{data_product_module_type}/table_settings.default.yaml'
        # tableSettings: "{custom_namespace}/data_product/{data_product_module_type}/table_settings.yaml"
        
  • Create the default tableSettings file (for example, src/data_modules/custom_namespace/data_product/data_product_module_type/table_settings.default.yaml).

This YAML file controls table configuration such as materialization and BigQuery optimization details:

common:
  custom_sales_summary:
    materialization_type: "table"
    tags: ["custom", "sales", "reporting"]
    partition_details:
      column: "created_date"
      partition_type: "date"
      time_grain: "day"
    cluster_details:
      columns:
        - "customer_id"
  • Create the annotation files.

An annotation file tablename.yaml is created for each output artifact of the data product (table or view) and describes the artifact and its columns in YAML format. During compilation, the builder automatically looks up annotations in the product's annotations/ folder (for example, src/data_modules/custom_namespace/data_product/data_product_module_type/annotations/custom_sales_summary.yaml) and merges these strings directly into the output Dataform schema definition, so that they are preserved in the BigQuery table metadata.

The annotation file src/data_modules/custom_namespace/data_product/data_product_module_type/annotations/tablename.yaml has the following format:

description: "Description of the table or view purpose"
fields:
  - name: "customer_id"                     # column name
    description: "Customer identifier"      # column description
  - name: "column2"
    description: "Description of Column 2"
  - name: "column3"
    description: "Description of Column 3"
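
To make the merge step more concrete, here is a minimal JavaScript sketch of how annotation strings could be folded into a Dataform-style description and columns map. The function name applyAnnotations and the shape of its return value are assumptions made for illustration. The actual builder logic is not shown in this guide and may differ. The annotation object stands for the already parsed YAML content.

```javascript
// Hypothetical helper, not part of Cortex Framework: it only illustrates
// the idea of merging annotation strings into schema metadata.
function applyAnnotations(annotation, columnNames) {
  // Index the annotated fields by column name for quick lookup.
  const byName = new Map(
    (annotation.fields || []).map((field) => [field.name, field.description])
  );

  // Keep descriptions only for columns that the output table really has.
  const columns = {};
  for (const name of columnNames) {
    if (byName.has(name)) {
      columns[name] = byName.get(name);
    }
  }

  return { description: annotation.description || "", columns };
}

// Parsed content of an annotation file (same shape as the YAML format above).
const exampleAnnotation = {
  description: "Description of the table or view purpose",
  fields: [
    { name: "customer_id", description: "Customer identifier" },
    { name: "column2", description: "Description of Column 2" },
  ],
};

console.log(applyAnnotations(exampleAnnotation, ["customer_id", "created_date"]));
```

In this sketch, columns without an annotation are left out rather than given an empty description, which is one possible design choice among several.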
  • Create a manifest.yaml file in the data product folder src/data_modules/custom_namespace/data_product/data_product_module_type/ that holds the module's type, tables, and dependencies. The manifest file follows this format:

type: sales_performance
builder: sap_product     # Automatically resolves to the global SapProductBuilder fallback
dependencies:
  sapModule:
    type: sap
    supportedVersions:
      - ecc
      - s4
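
The naming and path recommendations from the steps above can be summarized in a short JavaScript sketch. The helper describeProductModule is hypothetical and only restates the conventions given in the configuration comments. It assumes that the namespaced type has exactly the form namespace.module_type and that the placeholders in the documented paths are replaced by those two parts.

```javascript
// Hypothetical helper, not part of Cortex Framework: derives the recommended
// module ID and file locations from a namespaced data product type.
function describeProductModule(namespacedType) {
  const parts = namespacedType.split(".");
  if (parts.length !== 2 || parts.some((part) => part.length === 0)) {
    throw new Error(
      `Expected "<namespace>.<module_type>", got "${namespacedType}"`
    );
  }
  const [namespace, moduleType] = parts;
  const moduleDir = `src/data_modules/${namespace}/data_product/${moduleType}`;

  return {
    namespace,
    moduleType,
    // Recommended naming: custom_namespace_data_product_module_type
    recommendedModuleId: `${namespace}_${moduleType}`,
    moduleDir,
    manifestPath: `${moduleDir}/manifest.yaml`,
    defaultTableSettingsPath: `${moduleDir}/table_settings.default.yaml`,
    annotationsDir: `${moduleDir}/annotations`,
  };
}

const info = describeProductModule("custom_namespace.data_product_module_type");
console.log(info.recommendedModuleId);
console.log(info.manifestPath);
```

A check like this can be useful before deployment to confirm that the moduleId in config/config.yaml and the folder layout follow the same convention.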

Example data product module

The steps to implement the flights_usd data product in the sap_bookingdatamodel namespace from the flights example are as follows:

  • Register the data product module in the config/config.yaml file by extending the data.modules.product list with an entry:
data:
  modules:
    product:
      - moduleId: sap_bookingdatamodel_flights_usd
        type: sap_bookingdatamodel.flights_usd
        dependsOn:
          sapModule: erp
          sapModuleCustNS: sap_bookingdatamodel
        dataTargetId: product_target
  • Next, create src/data_modules/custom_namespace/data_product/data_product_module_type/manifest.yaml with the following content:
type: flights_usd
dependencies:
  sapModule:
    type: cortex.sap
    supported_versions:
      - ecc
      - s4
    tables:
      common:
        - tcurr
  sapModuleCustNS:
    # Type of the dependent Module.
    # use cortex.sap if you followed "Configure multiple instances of a data foundation module"
    # https://docs.cloud.google.com/cortex/docs/deployment-configuration#multiple-data-foundation-instances
    type: cortex.sap
    # use sap_bookingdatamodel.sap if you are connecting to custom-data foundation module:
    # https://docs.cloud.google.com/cortex/docs/extensibility-guide-data-foundation
    #type: sap_bookingdatamodel.sap
    supported_versions:
      - ecc
      - s4
    tables:
      common:
        - sflight
builder: sap_product
  • In the next step, create the referenced table settings file to configure the schema and metadata of the output table or view in BigQuery.

In this example, create src/data_modules/custom_namespace/data_product/data_product_module_type/table_settings.default.yaml with the following content:

ecc:
  flights_usd:
    materializationType: incremental
    tags: [sap, dataproduct, masterdata]
s4:
  flights_usd:
    materializationType: incremental
    tags: [sap, dataproduct, masterdata]

  • Create annotations for the data product table to enrich the stored schema with descriptions.

In this example, create the file src/data_modules/custom_namespace/data_product/data_product_module_type/annotations/flights_usd.yaml with the following content:

description: "Flight scheduling and pricing information, including currency conversion to USD."
fields:
  - name: "client_mandt"
    description: "Client (Mandant), PK"
  - name: "airline_code_carrid"
    description: "Airline Carrier ID, PK"
  - name: "flight_connection_number_connid"
    description: "Flight Number, PK"
  - name: "flight_date_fldate"
    description: "Flight Date"
  - name: "price_usd"
    description: "Price in USD"
  - name: "price"
    description: "Price in local currency"
  - name: "currency"
    description: "Local currency"
  • The business logic of the data product is stored in .js or .sqlx files.

In this example, create the file src/data_modules/custom_namespace/data_product/data_product_module_type/definitions/flights_usd.js with the following content:

// ___MODULE_CONTEXT___
// ___TABLE_CONFIG___

const moduleConfig = config.product[moduleContext.moduleId];
const sapModuleConfigDatasetId = moduleConfig.sources.sapModule.datasetId;
const sapModuleCustNSConfigDatasetId = moduleConfig.sources.sapModuleCustNS.datasetId;

const materializationType = tableConfig.materializationType || "incremental";

const incremental = require("includes/cortex/incremental.js");
const publish_config = require("includes/cortex/publish_config.js");

const publishConfig = publish_config.getPublishConfig(
   materializationType,
   tableConfig,
   moduleConfig,
   [
       "client_mandt",
       "airline_code_carrid",
       "flight_connection_number_connid",
       "flight_date_fldate"
   ]
);

publish("flights_usd", publishConfig).query(
   (ctx) => `
WITH flight_base AS (
   SELECT
       mandt,
       carrid,
       connid,
       fldate,
       price,
       currency,
       -- Convert flight date string (YYYYMMDD) to an integer to calculate SAP's inverted date key
       CAST(99999999 - CAST(fldate AS INT64) AS STRING) AS inverted_fldate
   FROM   ${ctx.ref(sapModuleCustNSConfigDatasetId, 'sflight')} AS flight
),
ranked_exchange_rates AS (
   SELECT
       f.mandt,
       f.carrid,
       f.connid,
       f.fldate,
       f.price,
       f.currency,
       t.ukurs,
       -- Window function to grab the closest historical exchange rate
       ROW_NUMBER() OVER (
           PARTITION BY f.mandt, f.carrid, f.connid, f.fldate
           ORDER BY t.gdatu ASC
       ) AS latest_rate_rank
   FROM flight_base f
   LEFT JOIN ${ctx.ref(sapModuleConfigDatasetId, 'tcurr')} AS t
     ON f.mandt = t.mandt
    AND t.kurst = 'M'       -- 'M' is the standard SAP default for average exchange rates
    AND t.fcurr = f.currency
    AND t.tcurr = 'USD'
    -- Chronological (rate_date <= flight_date) translates to (t.gdatu >= inverted_fldate)
    AND t.gdatu >= f.inverted_fldate
)

SELECT
   client_mandt,
   airline_code_carrid,
   flight_connection_number_connid,
   flight_date_fldate,
   price,
   currency,
   price_usd,
   CURRENT_TIMESTAMP() AS bq_loaded_at
FROM (
  SELECT
    mandt              AS client_mandt,
    carrid             AS airline_code_carrid,
    connid             AS flight_connection_number_connid,
    PARSE_TIMESTAMP('%Y%m%d', fldate) AS flight_date_fldate,
    price              AS price,
    currency           AS currency,
    -- Currency Conversion Logic
    CASE
       WHEN currency = 'USD' THEN price
       WHEN ukurs IS NULL   THEN NULL -- Handles cases where no exchange rate is found
       -- If UKURS is negative, it's an indirect quotation (1 USD = X Local) -> Divide
       WHEN ukurs < 0       THEN ROUND(price / ABS(ukurs), 2)
       -- If UKURS is positive, it's a direct quotation (1 Local = X USD) -> Multiply
       ELSE ROUND(price * ukurs, 2)
     END AS price_usd
  FROM ranked_exchange_rates
  WHERE latest_rate_rank = 1
)
${incremental.getWhere(ctx, ["flight_date_fldate"])}
`
);
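
The conversion logic in the query above can be traced in plain JavaScript. The following is only an illustrative sketch: the function names invertSapDate, pickRate, and toUsd are hypothetical, the sample rates are made up, and Math.round does not reproduce every edge case of BigQuery's ROUND. It mirrors three ideas from the SQL: the inverted date key, the selection of the closest historical rate, and the sign convention of ukurs.

```javascript
// Inverted date key as used in the query: 99999999 - YYYYMMDD, as a string.
function invertSapDate(yyyymmdd) {
  return String(99999999 - Number(yyyymmdd));
}

// Pick the closest historical rate: among rates whose inverted date is
// greater than or equal to the flight's inverted date (that is, rate date
// on or before the flight date), take the smallest inverted date.
function pickRate(rates, flightDate) {
  const inverted = invertSapDate(flightDate);
  const candidates = rates
    .filter((rate) => rate.gdatu >= inverted)
    .sort((a, b) => (a.gdatu < b.gdatu ? -1 : a.gdatu > b.gdatu ? 1 : 0));
  return candidates.length > 0 ? candidates[0].ukurs : null;
}

// Same branches as the CASE expression in the query.
function toUsd(price, currency, ukurs) {
  if (currency === "USD") return price;
  if (ukurs === null || ukurs === undefined) return null; // no rate found
  // Negative ukurs: indirect quotation, divide. Positive: direct, multiply.
  const converted = ukurs < 0 ? price / Math.abs(ukurs) : price * ukurs;
  return Math.round(converted * 100) / 100;
}

// Made-up sample rates, keyed by inverted date.
const sampleRates = [
  { gdatu: invertSapDate("20240101"), ukurs: 1.08 },
  { gdatu: invertSapDate("20240110"), ukurs: 1.1 },
  { gdatu: invertSapDate("20240120"), ukurs: 1.12 },
];

const rate = pickRate(sampleRates, "20240115"); // rate dated 20240110
console.log(toUsd(100, "EUR", rate));
```

Because the date key is inverted, sorting in ascending order returns the most recent rate first, which is why the query orders by gdatu ASC and keeps rank 1.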

Verify the custom namespace extension

To verify that the Google Cloud Cortex Framework data product module was created successfully, follow these steps: