撰寫的背景資訊是資料代理擁有者提供的指引,可塑造資料代理的行為,並改善 API 的回應。撰寫得當的背景資訊可為對話式數據分析 API 資料代理程式提供實用背景資訊,協助回答資料來源相關問題。
本頁面說明如何為 BigQuery 資料來源提供撰寫的內容。如果是 BigQuery 資料來源,您可以透過結構化內容和系統指令的組合,提供撰寫的內容。請盡可能透過結構化內容欄位提供背景資訊。接著,您可以使用 system_instruction
參數,提供結構化欄位未涵蓋的補充指引。
結構化情境欄位和系統指令都是選用項目,但提供豐富的情境資訊,有助於服務專員給出更準確且相關的回覆。
定義構成撰寫脈絡的結構化欄位和系統指令後,您可以在下列其中一個呼叫中向 API 提供該脈絡:
- 建立持續性資料代理:在要求主體的
published_context
物件中加入撰寫的內容,設定代理行為,讓代理在多個對話中保持一致。詳情請參閱「建立資料代理程式」(HTTP) 或「為有狀態或無狀態的對話設定環境」(Python SDK)。 - 傳送無狀態要求:在聊天要求中,於
inline_context
物件內提供撰寫的內容,為該特定 API 呼叫定義代理程式的行為。詳情請參閱「建立無狀態多輪對話」(HTTP) 或「傳送含內嵌情境的無狀態即時通訊要求」(Python SDK)。
定義結構化情境欄位
本節說明如何使用結構化內容欄位,為資料代理程式提供內容。您可以向代理程式提供下列資訊,做為結構化脈絡:
- 資料表層級的結構化內容,包括資料表的說明、同義字和標記
- 資料欄層級的結構化內容,包括資料表資料欄的說明、同義字、標記和範例值
- 範例查詢:提供自然語言問題和對應的 SQL 查詢,引導代理程式
資料表層級的結構化背景資訊
使用 tableReferences
鍵向代理提供詳細資料,說明可回答問題的特定資料表。針對每個表格參照,您可以使用下列結構化內容欄位定義表格的結構定義:
description
:資料表內容和用途的摘要synonyms
:可用來指稱資料表的替代字詞清單tags
:與資料表相關聯的關鍵字或標記清單
下列範例說明如何透過直接 HTTP 要求和 Python SDK,以結構化內容的形式提供這些屬性。
HTTP
在直接 HTTP 要求中,您會在相關表格參照的 schema
物件中提供這些表格層級屬性。如需如何建構完整要求酬載的完整範例,請參閱「連結至 BigQuery 資料」。
"tableReferences": [
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "orders",
"schema": {
"description": "Data for orders in The Look, a fictitious ecommerce store.",
"synonyms": ["sales"],
"tags": ["sale", "order", "sales_order"]
}
},
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "users",
"schema": {
"description": "Data for users in The Look, a fictitious ecommerce store.",
"synonyms": ["customers"],
"tags": ["user", "customer", "buyer"]
}
}
]
Python SDK
使用 Python SDK 時,您可以在 BigQueryTableReference
物件的 schema
屬性中定義這些資料表層級屬性。以下範例說明如何建立資料表參照物件,為 orders
和 users
資料表提供背景資訊。如需瞭解如何建構及使用資料表參照物件的完整範例,請參閱「連結至 BigQuery 資料」。
# Define context for the 'orders' table
bigquery_table_reference_1 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_1.project_id = "bigquery-public-data"
bigquery_table_reference_1.dataset_id = "thelook_ecommerce"
bigquery_table_reference_1.table_id = "orders"
bigquery_table_reference_1.schema = geminidataanalytics.Schema()
bigquery_table_reference_1.schema.description = "Data for orders in The Look, a fictitious ecommerce store."
bigquery_table_reference_1.schema.synonyms = ["sales"]
bigquery_table_reference_1.schema.tags = ["sale", "order", "sales_order"]
# Define context for the 'users' table
bigquery_table_reference_2 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_2.project_id = "bigquery-public-data"
bigquery_table_reference_2.dataset_id = "thelook_ecommerce"
bigquery_table_reference_2.table_id = "users"
bigquery_table_reference_2.schema = geminidataanalytics.Schema()
bigquery_table_reference_2.schema.description = "Data for users in The Look, a fictitious ecommerce store."
bigquery_table_reference_2.schema.synonyms = ["customers"]
bigquery_table_reference_2.schema.tags = ["user", "customer", "buyer"]
資料欄層級的結構化脈絡
fields
鍵會巢狀內嵌在表格參照的 schema
物件中,並採用 field
物件清單來描述個別資料欄。並非所有欄位都需要額外背景資訊,但如果提供常用欄位的詳細資料,有助於提升服務專員的成效。
針對每個 field
物件,您可以使用下列結構化內容欄位定義資料欄的基本屬性:
description
:簡要說明資料欄的內容和用途synonyms
:可用來指稱資料欄的替代字詞清單tags
:與資料欄相關聯的關鍵字或標記清單
以下範例說明如何透過直接 HTTP 要求和 Python SDK,為 orders
表格中的 status
欄位和 users
表格中的 first_name
欄位,以結構化內容的形式提供這些屬性。
HTTP
在直接 HTTP 要求中,您可以在表格參照的 schema
物件中提供 fields
物件清單,定義這些資料欄層級的屬性。
"tableReferences": [
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "orders",
"schema": {
"fields": [{
"name": "status",
"description": "The current status of the order.",
}]
}
},
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "users",
"schema": {
"fields": [{
"name": "first_name",
"description": "The first name of the user.",
"tags": "person",
}]
}
}
]
Python SDK
使用 Python SDK 時,您可以將 Field
物件清單指派給資料表 schema
屬性的 fields
屬性,定義這些資料欄層級屬性。
# Define column context for the 'orders' table
bigquery_table_reference_1.schema.fields = [
geminidataanalytics.Field(
name="status",
description="The current status of the order.",
)
]
# Define column context for the 'users' table
bigquery_table_reference_2.schema.fields = [
geminidataanalytics.Field(
name="first_name",
description="The first name of the user.",
tags=["person"],
)
]
查詢範例
example_queries
鍵會採用 example_query
物件清單,定義自然語言查詢,協助代理程式針對常見或重要問題提供更準確且相關的回覆。提供自然語言問題和對應的 SQL 查詢給代理程式,有助於引導代理程式提供品質更高、更一致的結果。
您可以為每個 example_query
物件提供下列欄位,定義自然語言問題和對應的 SQL 查詢:
natural_language_question
:使用者可能會提出的自然語言問題sql_query
:對應自然語言問題的 SQL 查詢
下列範例說明如何透過直接 HTTP 要求和 Python SDK,為 orders
資料表提供查詢範例。
HTTP
直接發出 HTTP 要求時,請在 example_queries
欄位中提供 example_query
物件清單。每個物件都必須包含 naturalLanguageQuestion
鍵和對應的 sqlQuery
鍵。
"example_queries": [
{
"naturalLanguageQuestion": "How many orders are there?",
"sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`"
},
{
"naturalLanguageQuestion": "How many orders were shipped?",
"sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'"
}
]
Python SDK
使用 Python SDK 時,您可以提供 ExampleQuery
物件清單。為每個物件提供 natural_language_question
和 sql_query
參數的值。
example_queries = [
geminidataanalytics.ExampleQuery(
natural_language_question="How many orders are there?",
sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`",
),
geminidataanalytics.ExampleQuery(
natural_language_question="How many orders were shipped?",
sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'",
)
]
在系統指令中定義其他脈絡
您可以使用 system_instruction
參數,為結構化內容欄位不支援的內容提供補充指引。提供這類額外指引,有助於專員進一步瞭解資料和用途的脈絡。
系統指令包含一系列重要元件和物件,可向資料代理程式提供資料來源的詳細資料,以及代理程式在回答問題時的角色指引。您可以在 system_instruction
參數中以 YAML 格式的字串,向資料代理程式提供系統指令。
下列範本顯示字串的建議 YAML 結構,您可以提供給 BigQuery 資料來源的 system_instruction
參數,包括可用的鍵和預期的資料類型。這個範本提供建議的結構,以及定義系統指令的重要元件,但不包含所有可能的系統指令格式。
- system_instruction: str # A description of the expected behavior of the agent. For example: You are a sales agent.
- tables: # A list of tables to describe for the agent.
- table: # Details about a single table that is relevant for the agent.
- name: str # The name of the table.
- fields: # Details about columns (fields) within the table.
- field: # Details about a single column within the current table.
- name: str # The name of the column.
- aggregations: list[str] # Commonly used or default aggregations for the column.
- relationships: # A list of join relationships between tables.
- relationship: # Details about a single join relationship.
- name: str # The name of this join relationship.
- description: str # A description of the relationship.
- relationship_type: str # The join relationship type: one-to-one, one-to-many, many-to-one, or many-to-many.
- join_type: str # The join type: inner, outer, left, right, or full.
- left_table: str # The name of the left table in the join.
- right_table: str # The name of the right table in the join.
- relationship_columns: # A list of columns that are used for the join.
- left_column: str # The join column from the left table.
- right_column: str # The join column from the right table.
- glossaries: # A list of definitions for glossary business terms, jargon, and abbreviations.
- glossary: # The definition for a single glossary item.
- term: str # The term, phrase, or abbreviation to define.
- description: str # A description or definition of the term.
- synonyms: list[str] # Alternative terms for the glossary entry.
- additional_descriptions: # A list of any other general instructions or content.
- text: str # Any additional general instructions or context not covered elsewhere.
以下各節提供系統指令主要元件的範例:
system_instruction
使用 system_instruction
鍵定義代理程式的角色和員工角色。這項初始指令會為 API 回覆設定語氣和風格,並協助代理程式瞭解核心用途。
舉例來說,您可以為虛構的電子商務商店定義代理程式,如下所示:
- system_instruction: >-
You are an expert sales analyst for a fictitious ecommerce store. You will answer questions about sales, orders, and customer data. Your responses should be concise and data-driven.
tables
定義資料表的基本屬性 (例如說明和同義字) 時,您可以使用結構化內容,也可以在系統指令中使用 tables
鍵提供補充業務邏輯。如果是 BigQuery 資料來源,則包括使用 fields
鍵,為特定資料欄定義預設 aggregations
。
下列 YAML 程式碼區塊範例說明如何在系統指令中使用 tables
鍵,巢狀結構化提供表格 bigquery-public-data.thelook_ecommerce.orders
補充指引的欄位:
- tables:
- table:
- name: bigquery-public-data.thelook_ecommerce.orders
- fields:
- field:
- name: num_of_items
- aggregations: 'sum, avg'
relationships
系統指令中的 relationships
鍵包含資料表之間的彙整關係清單。定義聯結關係後,代理程式就能瞭解如何在回答問題時聯結多個資料表的資料。
舉例來說,您可以定義 bigquery-public-data.thelook_ecommerce.orders
資料表和 bigquery-public-data.thelook_ecommerce.users
資料表之間的 orders_to_user
關係,如下所示:
- relationships:
- relationship:
- name: orders_to_user
- description: >-
Connects customer order data to user information with the user_id and id fields to allow an aggregated view of sales by customer demographics.
- relationship_type: many-to-one
- join_type: left
- left_table: bigquery-public-data.thelook_ecommerce.orders
- right_table: bigquery-public-data.thelook_ecommerce.users
- relationship_columns:
- left_column: user_id
- right_column: id
glossaries
系統指令中的 glossaries
鍵會列出與資料和用途相關的業務用語、術語和縮寫定義。提供字彙表定義,有助於服務專員準確解讀並回答使用特定業務用語的問題。
舉例來說,您可以根據特定業務情境,定義常見業務狀態和「OMPF」等字詞,如下所示:
- glossaries:
- glossary:
- term: complete
- description: Represents an order status where the order has been completed.
- synonyms: 'finish, done, fulfilled'
- glossary:
- term: shipped
- description: Represents an order status where the order has been shipped to the customer.
- glossary:
- term: returned
- description: Represents an order status where the customer has returned the order.
- glossary:
- term: OMPF
- description: Order Management and Product Fulfillment
additional_descriptions
使用 additional_descriptions
鍵提供任何不適合其他結構化脈絡或系統指令欄位的一般指示或脈絡。在系統指令中提供額外說明,有助於專員進一步瞭解資料和用途的脈絡。
舉例來說,您可以使用 additional_descriptions
鍵提供機構組織的相關資訊,如下所示:
- additional_descriptions:
- text: All the sales data pertains to The Look, a fictitious ecommerce store.
- text: 'Orders can be of three categories: food, clothes, and electronics.'
示例:銷售專員的撰寫背景資訊
以下是虛構的銷售分析師代理程式範例,說明如何結合結構化背景資訊和系統指令,提供撰寫的背景資訊。
範例:結構化背景資訊
您可以提供結構化內容,詳細說明資料表、資料欄和查詢範例,引導代理程式,如下列 HTTP 和 Python SDK 範例所示。
HTTP
以下範例說明如何在 HTTP 要求中定義結構化內容:
{
"bq": {
"tableReferences": [
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "orders",
"schema": {
"description": "Data for orders in The Look, a fictitious ecommerce store.",
"synonyms": ["sales"],
"tags": ["sale", "order", "sales_order"],
"fields": [
{
"name": "status",
"description": "The current status of the order."
},
{
"name": "num_of_items",
"description": "The number of items in the order."
}
]
}
},
{
"projectId": "bigquery-public-data",
"datasetId": "thelook_ecommerce",
"tableId": "users",
"schema": {
"description": "Data for users in The Look, a fictitious ecommerce store.",
"synonyms": ["customers"],
"tags": ["user", "customer", "buyer"],
"fields": [
{
"name": "first_name",
"description": "The first name of the user.",
"tags": ["person"]
},
{
"name": "last_name",
"description": "The last name of the user.",
"tags": ["person"]
},
{
"name": "age_group",
"description": "The age demographic group of the user."
},
{
"name": "email",
"description": "The email address of the user.",
"tags": ["contact"]
}
]
}
}
]
},
"example_queries": [
{
"naturalLanguageQuestion": "How many orders are there?",
"sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`"
},
{
"naturalLanguageQuestion": "How many orders were shipped?",
"sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'"
},
{
"naturalLanguageQuestion": "How many unique customers are there?",
"sqlQuery": "SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users`"
},
{
"naturalLanguageQuestion": "How many users in the 25-34 age group have a cymbalgroup email address?",
"sqlQuery": "SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users` WHERE users.age_group = '25-34' AND users.email LIKE '%@cymbalgroup.com'"
}
]
}
Python SDK
以下範例說明如何使用 Python SDK 定義結構化環境:
# Define context for the 'orders' table
bigquery_table_reference_1 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_1.project_id = "bigquery-public-data"
bigquery_table_reference_1.dataset_id = "thelook_ecommerce"
bigquery_table_reference_1.table_id = "orders"
bigquery_table_reference_1.schema = geminidataanalytics.Schema()
bigquery_table_reference_1.schema.description = "Data for orders in The Look, a fictitious ecommerce store."
bigquery_table_reference_1.schema.synonyms = ["sales"]
bigquery_table_reference_1.schema.tags = ["sale", "order", "sales_order"]
bigquery_table_reference_1.schema.fields = [
geminidataanalytics.Field(
name="status",
description="The current status of the order.",
),
geminidataanalytics.Field(
name="num_of_items",
description="The number of items in the order."
)
]
# Define context for the 'users' table
bigquery_table_reference_2 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_2.project_id = "bigquery-public-data"
bigquery_table_reference_2.dataset_id = "thelook_ecommerce"
bigquery_table_reference_2.table_id = "users"
bigquery_table_reference_2.schema = geminidataanalytics.Schema()
bigquery_table_reference_2.schema.description = "Data for users in The Look, a fictitious ecommerce store."
bigquery_table_reference_2.schema.synonyms = ["customers"]
bigquery_table_reference_2.schema.tags = ["user", "customer", "buyer"]
bigquery_table_reference_2.schema.fields = [
geminidataanalytics.Field(
name="first_name",
description="The first name of the user.",
tags=["person"],
),
geminidataanalytics.Field(
name="last_name",
description="The last name of the user.",
tags=["person"],
),
geminidataanalytics.Field(
name="age_group",
description="The age demographic group of the user.",
),
geminidataanalytics.Field(
name="email",
description="The email address of the user.",
tags=["contact"],
)
]
# Define example queries
example_queries = [
geminidataanalytics.ExampleQuery(
natural_language_question="How many orders are there?",
sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`",
),
geminidataanalytics.ExampleQuery(
natural_language_question="How many orders were shipped?",
sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'",
),
geminidataanalytics.ExampleQuery(
natural_language_question="How many unique customers are there?",
sql_query="SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users`",
),
geminidataanalytics.ExampleQuery(
natural_language_question="How many users in the 25-34 age group have a cymbalgroup email address?",
sql_query="SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users` WHERE users.age_group = '25-34' AND users.email LIKE '%@cymbalgroup.com'",
)
]
範例:系統指示
下列系統指令會定義代理程式的角色,並提供結構化欄位不支援的指引,例如關係定義、詞彙表字詞、額外說明和補充orders
表格詳細資料,藉此補充結構化背景資訊。在本範例中,由於 users
表格已透過結構化內容完整定義,因此不需要在系統指令中重新定義。
- system_instruction: >-
You are an expert sales analyst for a fictitious ecommerce store. You will answer questions about sales, orders, and customer data. Your responses should be concise and data-driven.
- tables:
- table:
- name: bigquery-public-data.thelook_ecommerce.orders
- fields:
- field:
- name: num_of_items
- aggregations: 'sum, avg'
- relationships:
- relationship:
- name: orders_to_user
- description: >-
Connects customer order data to user information with the user_id and id fields.
- relationship_type: many-to-one
- join_type: left
- left_table: bigquery-public-data.thelook_ecommerce.orders
- right_table: bigquery-public-data.thelook_ecommerce.users
- relationship_columns:
- left_column: user_id
- right_column: id
- glossaries:
- glossary:
- term: complete
- description: Represents an order status where the order has been completed.
- synonyms: 'finish, done, fulfilled'
- glossary:
- term: OMPF
- description: Order Management and Product Fulfillment
- additional_descriptions:
- text: All the sales data pertains to The Look, a fictitious ecommerce store.