derived_analytic_model

Usage

view: view_name {
  derived_analytic_model: {
    sql: analytic_model_definition ;;
  }
}
Hierarchy
derived_analytic_model
Default Value
None

Special Rules
Analytic models are supported only for BigQuery and Snowflake connections.

Definition

For BigQuery and Snowflake connections, the derived_analytic_model parameter defines an in-database analytic model (a BigQuery Graph or a semantic view in Snowflake), that is managed by Looker. In this scenario, Looker generates the analytic model within your database by executing the appropriate SQL Data Definition Language (DDL) statements that you specify in the definition of the derived_analytic_model LookML parameter. The SQL syntax that you define in the derived_analytic_model parameter must be supported by your database.

Unlike derived tables, Looker-managed analytic model objects don't persist any data within the database and aren't incrementally refreshed. Instead, they represent semantic models that define relationships and measures directly in the database.

To define an analytic model, use one of the following subparameters of the derived_analytic_model parameter:

In addition, if you define your analytic view with the sql subparameter, you can use the publish_as_db_analytic_model subparameter of the derived_analytic_model parameter to create a stable analytic model that can be queried outside of Looker.

After you define the analytic model inside the derived_analytic_model parameter, you can define LookML dimensions and measures that map to your analytic model. See the Examples section for examples.

sql

Use the sql parameter if you want to provide the SQL for only the definition of the analytic model and have Looker manage the creation of the analytical model. When you use the sql subparameter, don't include a CREATE or a CREATE OR REPLACE statement, because Looker will automatically generate the DDL statement to create the analytic model on the database side.

See Creating a derived analytic model with sql for an example of using the sql parameter to create an analytic model on your database.

sql_create

Use the sql_create parameter to define a full SQL statement to create an analytic model. When you use the sql_create parameter, you need to include a CREATE OR REPLACE statement (or a CREATE statement, if your dialect doesn't support CREATE OR REPLACE).

Note the following when you use the sql_create subparameter:

  • For BigQuery connections, use a CREATE OR REPLACE statement to create the analytic model.
  • Use ${SQL_TABLE_NAME} to substitute in the computed name of the analytic model being created. This ensures that the SQL statement will correctly include the analytic model name that you provide in the LookML view parameter.

See Creating a derived analytic model with sql_create for an example of using the sql_create parameter to create an analytic model on your database.

create_process

Use the create_process parameter when you need to define multiple, sequential SQL statements to define the analytic model. Under the create_process parameter, use the sql_step subparameter to specify the individual SQL statements. Your database will execute the sql_step statements one at a time, in the order that you specified them. Looker issues the SQL statements in the sql_step subparameters as you define them, with no wrapper, which means that you have to include a step with a CREATE OR REPLACE statement (or a CREATE statement, if your dialect doesn't support CREATE OR REPLACE).

See Creating a derived analytic model with create_process for an example of using the create_process parameter to create an analytic model on your database.

publish_as_db_analytic_model

For derived analytic models that are created with the sql parameter, you can define your derived analytic model with publish_as_db_analytic_model: yes to prompt Looker to create a stable analytic model that can be queried outside of Looker.

The stable analytic model will be published (created) on the next cycle of the Looker regenerator after the derived analytic model's LookML is deployed to production with publish_as_db_analytic_model: yes.

See the Accessing the stable analytic model section for information about getting the name of the stable analytic model so that you can use the name to query the stable analytic model outside of Looker.

Create LookML dimensions and measures based on your analytic view

After you define your analytic model, in the same view file you can then define LookML dimensions and measures that are based on the analytic model.

See your dialect's documentation for information about the proper syntax to use for defining your analytic model, and for referring to elements in your analytic model. For example, to create a LookML dimension from a BigQuery Graph entity, you must use underscores to separate elements when scoping. For example, for BigQuery Graph, this LookML dimension is based on the location_id property in the Stores node table:

  dimension: location_id {
    type: number
    sql: Stores_location_id ;;
  }

However, to create a LookML dimension that's based on a Snowflake semantic view, you must use the unqualified name of a metric or dimension.

Examples

The following sections provide examples of creating an analytic view by using the different subparameters of derived_analytic_model:

Creating a derived analytic model with sql

The following is an example LookML view file that defines a SQL-based analytic model for a BigQuery database using the sql subparameter of derived_analytic_model. Looker will create the analytic model within the database by executing the SQL DDL commands that are provided in the sql parameter.

Note the following in the example:

  • The sql subparameter contains only the definition of the analytic model itself. There is no CREATE statement, because with sql Looker automatically handles the CREATE commands for the analytic model.
  • The analytic model is defined with publish_as_db_analytic_model: yes, so Looker will create a stable analytic model that can be queried outside of Looker.
view: MyWarehouseOrdersView {
  derived_analytic_model: {
    publish_as_db_analytic_model: yes

    # Defining the analytic model
    
    sql:
      NODE TABLES (
        Customers
        KEY(customer_id)
        PROPERTIES(
          country_code,
          concat(first_name, ' ', last_name) AS name,
          age,
          MEASURE(AVG(age)) AS AvgAge 
        ),
      Orders
        KEY(order_id)
        PROPERTIES (
          customer_id,
          employee_id,
          date,
          discount,
          MEASURE(AVG(discount)) AS AvgDiscount
        )

      EDGE TABLES (
        -- Relationship: Orders -> Customers
        looker_test.orders AS orders_to_users
          KEY(id)
          SOURCE KEY (order_id) REFERENCES orders (order_id)
          DESTINATION KEY (customer_id) REFERENCES Customers (customer_id)
          NO PROPERTIES
      ) ;;
  }

  # Mapping dimensions/measures to the dimensions/measures
  # provided by the analytic model

  dimension: customer_id {
    type: number
    sql: Customers_customer_id ;;
  }

  dimension: customer_age {
    type: number
    sql: Customers_age ;;
  }

  measure: orders_avg_discount {
    type: number
    sql: Orders_AvgDiscount ;; 
  }
}

Creating a derived analytic model with create_process

The following is an example LookML view file that defines a SQL-based analytic model for a BigQuery database by using the create_process subparameter of derived_analytic_model. In this example, you need to define multiple, sequential SQL statements to define the analytic model. The first step drops the analytic model if it already exists, and the second step creates the analytic model.

view: university_statistics {
  derived_analytic_model: {
    create_process: {
      sql_step: 
        DROP PROPERTY GRAPH IF EXISTS ${SQL_TABLE_NAME} ;;
      sql_step: 
        CREATE PROPERTY GRAPH ${SQL_TABLE_NAME}
         NODE TABLES (
          university.College
            KEY(college_id)
            PROPERTIES(college_id, college_name),
          university.Department
            KEY(dept_id)
            PROPERTIES(dept_id, dept_name, college_id,
              budget OPTIONS(description="Department budget in USD"),
              MEASURE(SUM(budget)) AS total_budget),
          university.Course
            KEY(course_id)
            PROPERTIES(
              course_id,
              course_name,
              credits,
              dept_id,
              MEASURE(AVG(credits)) AS avg_credits,
              MEASURE(SUM(credits)) AS total_credits,
              MEASURE(COUNT(course_id)) AS course_count)
        )
        EDGE TABLES (
          university.Department AS CollegeDept
            SOURCE KEY (college_id) REFERENCES College (college_id)
            DESTINATION KEY (dept_id) REFERENCES Department (dept_id),
          university.Course AS DeptCourse
            SOURCE KEY (dept_id) REFERENCES Department (dept_id)
            DESTINATION KEY (course_id) REFERENCES Course (course_id)
        );;
    }
  }

  # Mapping dimensions/measures to the dimensions/measures
  # provided by the analytic model

  dimension: college_id {
    type: number
    sql: College_college_id ;;
  }

  dimension: course_name {
    type: string
    sql: Course_course_name ;;
  }
  
  ...
}

Creating a derived analytic model with sql_create

The following is an example LookML view file that defines a SQL-based analytic model for a BigQuery database by using the sql_create subparameter of derived_analytic_model. In this example, the sql_create parameter defines the full CREATE OR REPLACE statement to execute to create the analytic model in a single step.

view: MyWarehouseOrdersView {
  derived_analytic_model: {
    sql_create:
      CREATE OR REPLACE PROPERTY GRAPH ${SQL_TABLE_NAME}
        NODE TABLES(
          accounting.Loan AS Loan
            KEY(loanId)
            LABEL Loan PROPERTIES(
              loanId,
              loanAmount,
              balance,
              createTime,
              interestRate,
              accountId,
              balance + 100 AS derived_balance,
              CASE WHEN balance > 1000 THEN "High" ELSE "Low" END AS risk_level,
              CONCAT("ID-", CAST(loanId AS STRING)) AS full_id,
              DATE(2024, 1, 1) AS fixed_date,
              MEASURE(AVG(interestRate)) AS avg_interest_rate
            ),
          accounting.AccountView AS Account
            KEY(accountId)
            LABEL Account PROPERTIES(
              accountId,
              createTime,
              isBlocked,
              accountType,
              amount,
              ownerId,
              MEASURE(MIN(createTime)) AS oldest_account_create_time,
              MEASURE(MAX(createTime)) AS newest_account_create_time,
              MEASURE(AVG(amount)) AS avg_account_amount,
              MEASURE(SUM(amount)) AS total_account_amount,
              MEASURE(COUNT(DISTINCT accountType)) AS account_type_count
            ),
          accounting.PersonMV AS Person
            KEY(personId)
            LABEL Person PROPERTIES(
              personId,
              personName,
              age,
              age_tier,
              MEASURE(AVG(age)) AS avg_age,
              MEASURE(COUNT(DISTINCT age_tier)) AS age_tier_count
            )
        )
        EDGE TABLES(
          accounting.Loan AS Account_Repay_Loan
            KEY(loanId)
            SOURCE KEY(loanId) REFERENCES Loan(loanId)
            DESTINATION KEY(accountId) REFERENCES Account(accountId)
            LABEL Repay NO PROPERTIES,
          accounting.Account AS Person_Own_Account
            KEY(accountId)
            SOURCE KEY(accountId) REFERENCES Account(accountId)
            DESTINATION KEY(ownerId) REFERENCES Person(personId)
            LABEL Own NO PROPERTIES
        );;
  }

  # Mapping dimensions/measures to the dimensions/measures
  # provided by the analytic model

  dimension: loan_id {
    type: number
    sql: Loan_loanId ;;
  }

  dimension: account_ID {
    type: number
    sql: Account_accountID ;;
  }

  ...

}

Accessing the stable analytic model

If you created your derived analytic model by using the sql subparameter and you included the publish_as_db_analytic_model: yes statement under your derived_analytic_model parameter, Looker will publish (create) the stable analytic model on the next cycle of the Looker regenerator after the derived analytic model's LookML is deployed to production with publish_as_db_analytic_model: yes.

When the stable analytic model is published, you can query it directly by using its stable name. You can determine the stable name from the information that is included on the SQL tab in the Data section of an Explore query of the analytic model. Follow these steps to get the stable name for an analytic model:

  1. Open the Explore for your analytic model's view.

  2. In the Explore, select any dimensions or measures from the field picker.

  3. Click the SQL tab of the Data section.

  4. In the SQL tab, locate one of the following SQL statements:

    • For BigQuery Graph:
      • CREATE PROPERTY GRAPH
      • SELECT ... FROM GRAPH_EXPAND('PROPERTY_GRAPH_NAME')
    • For Snowflake semantic view:
      • CREATE SEMANTIC VIEW
      • SELECT ... FROM SEMANTIC_VIEW_NAME
  5. The stable name is a view that Looker creates in the scratch schema, which points to the actual obfuscated table shown in the SQL tab. To get the stable name for the analytic view, fill in the following information from the SQL statement:

    SCRATCH_SCHEMA_NAME.CONNECTION_REGISTRATION_KEY_MODEL_NAME_VIEW_NAME
    
    • SCRATCH_SCHEMA_NAME: The scratch schema name is the beginning of the string following the CREATE or SELECT statement, before the "."
    • CONNECTION_REGISTRATION_KEY: The connection registration key is two characters; depending on your database dialect, it will follow either a dollar sign or the first underscore in the table name in the CREATE or SELECT statement.
    • MODEL_NAME: The name of the LookML model.
    • VIEW_NAME: The name of the view where the analytic model is defined.

For example, here is the text from the SQL tab of an Explore query for a BigQuery connection. The analytic model is defined in the view that's named sales_analytic_model, and the name of the LookML model is thelook. In this case, Looker has already created the analytic model, so there is no CREATE statement. But the SELECT ... FROM GRAPH_EXPAND statement contains the table name information:

-- use existing sales_analytic_model in `looker-test-db.looker_scratch.LG_J7LSZ1778710001008_sales_analytic_model`
SELECT
    sales_analytic_model.orders_id  AS sales_analytic_model_orders_id,
    AGG(sales_analytic_model.orders_count_orders ) AS sales_analytic_model_count_orders
FROM GRAPH_EXPAND("looker-test-db.looker_scratch.LG_J7LSZ1778710001008_sales_analytic_model") AS sales_analytic_model
GROUP BY
    1
ORDER BY
    2 DESC
LIMIT 500

Here are the values that you need to derive the stable name of the analytic model:

  • SCRATCH_SCHEMA_NAME is looker-test-db.looker_scratch
  • CONNECTION_REGISTRATION_KEY is J7
  • MODEL_NAME is thelook
  • VIEW_NAME is sales_analytic_model

Therefore, the stable name for the analytic model is the following:

looker-test-db.looker_scratch.J7_thelook_sales_analytic_model

Once you have the stable name of the analytic model, you can query the analytic model directly.

Things to consider

When using in-database analytic models, keep the following considerations and limitations in mind:

  • Data types: Only the following data types for dimensions and measures are supported with analytic models:

    • Supported for dimensions and measures:
      • string
      • number
      • date
      • yesno
    • Supported for dimensions only:
      • time
      • date_time
  • Measures:

    • Base measures must be predefined: Base measures must be predefined in the underlying database analytic model. Looker can't define a new base measure by performing an aggregation (such as type: sum or type: count) on a dimension from an analytic model.
    • Measures that are based on other measures are supported: You can use the sql parameter of a LookML measure to perform non-aggregate computations that use predefined base measures from the analytic model. When you create a measure that's based on other measures, you can't define the new measure as an aggregate measure type such as sum or count. You must define the new measure as a non-aggregate measure type, such as string, number, date, or yesno. See the following example:

      measure: average_order_amount {
        type: number
        sql: ROUND(${total_order_amount} / NULLIF(${count_orders}, 0), 2) ;;
      }
      
  • Joins: An Explore whose base view is based on an analytic model can't include any joins. Similarly, a view that's based on an analytic model can't be joined into an Explore that has a standard LookML base view.

  • Implicit joins: Features that rely on implicit joins aren't supported for analytic models. Some examples of features that rely on implicit joins are custom calendars and fields that are defined with type: location, type: distance, or type: zipcode.

  • The following features are not supported with analytic models: