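Define data agent context for Looker data sources

This page describes how to write system instructions for data agents that use Looker data sources, which are based on Looker Explores.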

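Authored context is guidance that data agent owners can provide to shape a data agent's behavior and refine the API's responses. Effective authored context gives your Conversational Analytics API data agents useful context for answering questions about your data sources.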

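For Looker data sources, you can provide authored context through a combination of structured context and system instructions. Whenever possible, provide context through structured context fields. You can then use the system_instruction parameter for supplemental guidance that the structured fields don't cover. System instructions are a type of authored context that data agent owners can provide to inform the agent of its role, tone, and overall behavior. System instructions are typically more free-form than structured context.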

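Although both structured context fields and system instructions are optional, providing rich context helps the agent give more accurate and relevant responses. Any structured context that you provide when you create a data agent is automatically added to the system instructions.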

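Define structured context

You can provide golden questions and answers to a data agent within structured context. After you define structured context, you can provide it to the data agent by using direct HTTP requests or the Python SDK.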

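For Looker data sources, golden queries are captured in the looker_golden_queries key, which defines pairs of natural language questions and their corresponding Looker queries. By providing the agent with a natural language question paired with its corresponding Explore metadata, you can guide the agent toward higher-quality and more consistent results. This page includes examples of Looker golden queries.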

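To define each Looker golden query, provide values for both of the following fields:

  • natural_language_questions: The natural language questions that a user might ask
  • looker_query: The Looker golden query that corresponds to the natural language questions

The following is an example of a natural_language_questions and looker_query pair from the airports Explore: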

  "natural_language_questions": ["What are the major airport codes and cities in CA?"],
  "looker_query": {
        "model": "airports",
        "explore": "airports",
        "fields": ["airports.city", "airports.code"],
        "filters": [
          {
            "field": "airports.major",
            "value": "Y"
          },
          {
            "field": "airports.state",
            "value": "CA"
          }
        ]
  }

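Define a Looker golden query

Define a Looker golden query for a given Explore by providing values for the natural_language_questions and looker_query fields. For the natural_language_questions field, consider the questions that a user might ask about that Explore, and write them in natural language. You can include more than one question in this field's value. You can get the value for the looker_query field from the Explore's query metadata.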

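The Looker query object supports the following fields:

  • model (string): The LookML model used to generate the query. This field is required.
  • explore (string): The Explore used to generate the query. This field is required.
  • fields[] (string): The fields to retrieve from the Explore, including dimensions and measures. This field is optional.
  • filters[] (object [Filter]): The filters to apply to the Explore. This field is optional.
  • sorts[] (string): The sorting to apply to the Explore. This field is optional.
  • limit (string): The maximum number of data rows to return from the Explore. This field is optional.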
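
As a rough illustration of the field list above, the following sketch checks a looker_query dictionary before it is used in a golden query. The helper name validate_looker_query is hypothetical and is not part of the Conversational Analytics API or its SDK. The filter check assumes that each filter carries a field and a value, as in the examples on this page.

```python
from typing import Any

REQUIRED_KEYS = {"model", "explore"}
OPTIONAL_KEYS = {"fields", "filters", "sorts", "limit"}


def validate_looker_query(query: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a looker_query dict (empty if none).

    Hypothetical helper for illustration; the rules mirror the field list above.
    """
    problems = []
    present = set(query)
    for key in sorted(REQUIRED_KEYS - present):
        problems.append(f"missing required field: {key}")
    for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unknown field: {key}")
    # model, explore, and limit are all documented as strings.
    for key in ("model", "explore", "limit"):
        if key in query and not isinstance(query[key], str):
            problems.append(f"{key} must be a string")
    # fields and sorts are lists of strings.
    for key in ("fields", "sorts"):
        value = query.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
    # Assumption: every filter object has at least 'field' and 'value'.
    for item in query.get("filters", []):
        if not isinstance(item, dict) or not {"field", "value"} <= set(item):
            problems.append("each filter must be an object with 'field' and 'value'")
    return problems
```

An empty list means that the dictionary matches the documented shape; it doesn't confirm that the model, Explore, or fields exist in your Looker instance.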

You can retrieve an Explore's query metadata in the following ways:

Retrieve query metadata from the Explore UI

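  1. In the Explore, select the Explore actions menu, and then select Get LookML.
  2. Select the Dashboard tab.
  3. Copy the query details from the LookML. The example that follows these steps uses the LookML for an Explore named "Order Items".

Copy the selected metadata to use in your Looker golden query: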

  model: thelook
  explore: order_items
  fields: [order_items.order_id, orders.status]
  sorts: [orders.status, order_items.order_id]
  limit: 500
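
The copied metadata can be turned into a looker_query dictionary by hand or with a small helper. The following is a minimal sketch, assuming that the metadata uses the single-line key: value form shown above. The function name lookml_to_looker_query is hypothetical, and filters and multi-line lists aren't handled.

```python
def lookml_to_looker_query(lookml: str) -> dict:
    """Convert 'key: value' lines copied from Get LookML into a looker_query dict.

    Hypothetical helper for illustration; handles only the single-line form.
    """
    list_keys = {"fields", "sorts"}
    wanted = {"model", "explore", "limit"} | list_keys
    query = {}
    for line in lookml.splitlines():
        key, sep, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        if not sep or key not in wanted:
            continue  # skip blank lines and keys the query object doesn't use
        if key in list_keys:
            # "[a, b]" becomes ["a", "b"]
            items = value.strip("[]").split(",")
            query[key] = [item.strip() for item in items if item.strip()]
        else:
            query[key] = value  # limit stays a string, matching the field list
    return query


copied = """
model: thelook
explore: order_items
fields: [order_items.order_id, orders.status]
sorts: [orders.status, order_items.order_id]
limit: 500
"""
looker_query = lookml_to_looker_query(copied)
```

The result keeps limit as the string "500", because the query object documents limit as a string rather than a number.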

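Retrieve the Looker query object by using the Looker API

To retrieve information about an Explore by using the Looker API, follow these steps:

  1. In the Explore, select the Explore actions menu, and then select Share. Looker displays URLs that you can copy to share the Explore. A share URL typically looks like https://looker.yourcompany/x/vwGSbfc. The trailing vwGSbfc in the share URL is the share slug.
  2. Copy the share slug.
  3. Make a request to the Looker API: GET /queries/slug/Explore_slug, passing the Explore's URL slug as a string in place of Explore_slug. In the request, specify the fields of the Explore's query metadata that you want returned. For more information, see the Get Query for Slug API reference page.
  4. Copy the query metadata from the API response.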
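
Steps 2 and 3 might be sketched as follows. This is an illustration under assumptions, not a definitive client: the API base URL (including the /api/4.0 version path), the token-style Authorization header, and both helper names are assumptions that you should check against your Looker instance and the Looker API reference. The sketch only builds the request; it doesn't send it.

```python
from urllib.parse import quote, urlencode, urlsplit
from urllib.request import Request


def slug_from_share_url(share_url: str) -> str:
    """Return the last path segment of a share URL, for example 'vwGSbfc'."""
    return urlsplit(share_url).path.rstrip("/").rsplit("/", 1)[-1]


def build_slug_request(api_base: str, slug: str, token: str, fields: list[str]) -> Request:
    """Build (but don't send) a GET /queries/slug/{slug} request.

    api_base and the Authorization header format are assumptions.
    """
    url = f"{api_base.rstrip('/')}/queries/slug/{quote(slug, safe='')}"
    if fields:
        # Ask the API to return only the listed fields of the query metadata.
        url += "?" + urlencode({"fields": ",".join(fields)})
    return Request(url, headers={"Authorization": f"token {token}"}, method="GET")


slug = slug_from_share_url("https://looker.yourcompany/x/vwGSbfc")
request = build_slug_request(
    "https://looker.yourcompany/api/4.0",  # assumed API base URL
    slug,
    "ACCESS_TOKEN",  # placeholder; obtain a real token from your Looker instance
    ["model", "fields"],
)
```

To send the request, you could pass it to urllib.request.urlopen and parse the JSON response, then copy the query metadata as described in step 4.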

Looker golden query examples

The following examples show how to provide golden queries for the airports Explore by using direct HTTP requests and the Python SDK.

HTTP

In a direct HTTP request, provide a list of Looker golden query objects for the looker_golden_queries key. Each object must contain a natural_language_questions key and a corresponding looker_query key.

looker_golden_queries = [
  {
    "natural_language_questions": ["What is the highest observed positive longitude?"],
    "looker_query": {
      "model": "airports",
      "explore": "airports",
      "fields": ["airports.longitude"],
      "filters": [
        {
          "field": "airports.longitude",
          "value": ">0"
        }
      ],
      "sorts": ["airports.longitude desc"],
      "limit": "1"
    }
  },
  {
    "natural_language_questions": ["What are the major airport codes and cities in CA?", "Can you list the cities and airport codes of airports in CA?"],
    "looker_query": {
      "model": "airports",
      "explore": "airports",
      "fields": ["airports.city", "airports.code"],
      "filters": [
        {
          "field": "airports.major",
          "value": "Y"
        },
        {
          "field": "airports.state",
          "value": "CA"
        }
      ]
    }
  },
]

Python SDK

When you use the Python SDK, you can provide a list of LookerGoldenQuery objects. For each object, provide values for the natural_language_questions and looker_query parameters.

from google.cloud import geminidataanalytics

looker_golden_queries = [
  geminidataanalytics.LookerGoldenQuery(
      natural_language_questions=[
          "What is the highest observed positive longitude?"
      ],
      looker_query=geminidataanalytics.LookerQuery(
          model="airports",
          explore="airports",
          fields=["airports.longitude"],
          filters=[
              geminidataanalytics.LookerQuery.Filter(
                  field="airports.longitude", value=">0"
              )
          ],
          sorts=["airports.longitude desc"],
          limit="1",
      ),
  ),
  geminidataanalytics.LookerGoldenQuery(
      natural_language_questions=[
          "What are the major airport codes and cities in CA?",
          "Can you list the cities and airport codes of airports in CA?",
      ],
      looker_query=geminidataanalytics.LookerQuery(
          model="airports",
          explore="airports",
          fields=["airports.city", "airports.code"],
          filters=[
              geminidataanalytics.LookerQuery.Filter(
                  field="airports.major", value="Y"
              ),
              geminidataanalytics.LookerQuery.Filter(
                  field="airports.state", value="CA"
              ),
          ],
      ),
  ),
]

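Define additional context in system instructions

System instructions consist of a series of key components and objects that give the data agent details about the data source and guidance about the agent's role when it answers questions. You can use the system_instruction parameter to provide system instructions to the data agent as a YAML-formatted string.

The following YAML template shows how you might structure system instructions for a Looker data source: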

-   system_instruction: str # Describe the expected behavior of the agent
-   glossaries: # Define business terms, jargon, and abbreviations that are relevant to your use case
    -   glossary:
            -   term: str
            -   description: str
            -   synonyms: list[str]
-   additional_descriptions: # List any additional general instructions
    -   text: str
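
Because the system_instruction parameter takes a YAML-formatted string, one option is to assemble that string in code. The following sketch, which uses a hypothetical helper named build_system_instruction, renders the three top-level keys from the template. It quotes values with json.dumps, relying on JSON strings being valid YAML double-quoted scalars.

```python
import json


def build_system_instruction(role: str, glossary: list[dict], descriptions: list[str]) -> str:
    """Render the template's three top-level keys as a YAML-formatted string.

    Hypothetical helper for illustration; not part of the API or SDK.
    """
    q = json.dumps  # quote each value as a double-quoted scalar
    lines = [f"- system_instruction: {q(role)}", "- glossaries:"]
    for entry in glossary:
        lines.append("    - glossary:")
        lines.append(f"        - term: {q(entry['term'])}")
        lines.append(f"        - description: {q(entry['description'])}")
        synonyms = ", ".join(q(s) for s in entry.get("synonyms", []))
        lines.append(f"        - synonyms: [{synonyms}]")
    lines.append("- additional_descriptions:")
    for text in descriptions:
        lines.append(f"    - text: {q(text)}")
    return "\n".join(lines) + "\n"


system_instruction = build_system_instruction(
    "You are an expert sales analyst for a fictitious ecommerce store.",
    [{
        "term": "Loyal Customer",
        "description": "A customer who has made more than one purchase.",
        "synonyms": ["repeat customer", "returning customer"],
    }],
    ["The user is typically a Sales Manager."],
)
```

Building the string from data keeps long descriptions on one line and avoids hand-managed indentation, at the cost of output that is less readable than hand-written YAML.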

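Key components of system instructions

The following sections contain examples of the key components of system instructions for Looker. These keys are system_instruction, glossaries, and additional_descriptions.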

system_instruction

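Use the system_instruction key to define the agent's role and persona. This initial instruction sets the tone and style for the API's responses and helps the agent understand its core purpose.

For example, you can define an agent as a sales analyst for a fictitious ecommerce store, as follows: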

-   system_instruction: You are an expert sales analyst for a fictitious
    ecommerce store. You will answer questions about sales, orders, and customer
    data. Your responses should be concise and data-driven.

glossaries

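The glossaries key lists definitions for business terms, jargon, and abbreviations that are relevant to your data and use case but that don't already appear in your data. For example, you can define terms such as common business statuses and "Loyal Customer" according to your specific business context, as follows: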

-   glossaries:
    -   glossary:
            -   term: Loyal Customer
            -   description: A customer who has made more than one purchase.
                Maps to the dimension 'user_order_facts.repeat_customer' being
                'Yes'. High value loyal customers are those with high
                'user_order_facts.lifetime_revenue'.
            -   synonyms:
                -   repeat customer
                -   returning customer

additional_descriptions

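The additional_descriptions key lists any other general instructions or context that the rest of the system instructions don't cover. For example, you can use the additional_descriptions key to describe the agent's typical users, as follows: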

-   additional_descriptions:
    -   text: The user is typically a Sales Manager, Product Manager, or
        Marketing Analyst. They need to understand performance trends, build
        customer lists for campaigns, and analyze product sales.

Example: System instructions in Looker

The following example shows sample system instructions for a fictitious sales analyst agent:

-   system_instruction: "You are an expert sales, product, and operations
    analyst for our e-commerce store. Your primary function is to answer
    questions by querying the 'Order Items' Explore. Always be concise and
    data-driven. When asked about 'revenue' or 'sales', use
    'order_items.total_sale_price'. For 'profit' or 'margin', use
    'order_items.total_gross_margin'. For 'customers' or 'users', use
    'users.count'. The default date for analysis is 'order_items.created_date'
    unless specified otherwise. For advanced statistical questions, such as
    correlation or regression analysis, use the Python tool to fetch the
    necessary data, perform the calculation, and generate a plot (like a scatter
    plot or heatmap)."
-   glossaries:
    -   term: Revenue
    -   description: The total monetary value from items sold. Maps to the
        measure 'order_items.total_sale_price'.
    -   synonyms:
        -   sales
        -   total sales
        -   income
        -   turnover
    -   term: Profit
    -   description: Revenue minus the cost of goods sold. Maps to the measure
        'order_items.total_gross_margin'.
    -   synonyms:
        -   margin
        -   gross margin
        -   contribution
    -   term: Buying Propensity
    -   description: Measures the likelihood of a customer to purchase again
        soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate'
        measure.
    -   synonyms:
        -   repeat purchase rate
        -   repurchase likelihood
        -   customer velocity
    -   term: Customer Lifetime Value
    -   description: The total revenue a customer has generated over their
        entire history with us. Maps to 'user_order_facts.lifetime_revenue'.
    -   synonyms:
        -   CLV
        -   LTV
        -   lifetime spend
        -   lifetime value
    -   term: Loyal Customer
    -   description: "A customer who has made more than one purchase. Maps to
        the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value
        loyal customers are those with high
        'user_order_facts.lifetime_revenue'."
    -   synonyms:
        -   repeat customer
        -   returning customer
    -   term: Active Customer
    -   description: "A customer who is currently considered active based on
        their recent purchase history. Mapped to
        'user_order_facts.currently_active_customer' being 'Yes'."
    -   synonyms:
        -   current customer
        -   engaged shopper
    -   term: Audience
    -   description: A list of customers, typically identified by their email
        address, for marketing or analysis purposes.
    -   synonyms:
        -   audience list
        -   customer list
        -   segment
    -   term: Return Rate
    -   description: The percentage of items that are returned by customers
        after purchase. Mapped to 'order_items.return_rate'.
    -   synonyms:
        -   returns percentage
        -   RMA rate
    -   term: Processing Time
    -   description: The time it takes to prepare an order for shipment from the
        moment it is created. Maps to 'order_items.average_days_to_process'.
    -   synonyms:
        -   fulfillment time
        -   handling time
    -   term: Inventory Turn
    -   description: "A concept related to how quickly stock is sold. This can
        be analyzed using 'inventory_items.days_in_inventory' (lower days means
        higher turn)."
    -   synonyms:
        -   stock turn
        -   inventory turnover
        -   sell-through
    -   term: New vs Returning Customer
    -   description: "A classification of whether a purchase was a customer's
        first ('order_facts.is_first_purchase' is Yes) or if they are a repeat
        buyer ('user_order_facts.repeat_customer' is Yes)."
    -   synonyms:
        -   customer type
        -   first-time buyer
-   additional_descriptions:
    -   text: The user is typically a Sales Manager, Product Manager, or
        Marketing Analyst. They need to understand performance trends, build
        customer lists for campaigns, and analyze product sales.
    -   text: This agent can answer complex questions by joining data about
        sales line items, products, users, inventory, and distribution centers.

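What's next

After you define the structured fields and system instructions that make up your authored context, you can provide that context to the Conversational Analytics API in one of the following calls: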