Define data agent context for BigQuery data sources

Authored context is guidance that data agent owners can provide to shape the behavior of a data agent and to refine the API's responses. Effective authored context gives your Conversational Analytics API data agents the background they need to answer questions about your data sources.

This page describes how to provide authored context for BigQuery data sources. For BigQuery data sources, you can provide authored context through a combination of structured context and system instructions. Whenever possible, provide context through structured context fields. You can then use the system_instruction parameter for supplemental guidance that the structured fields don't cover.

Although both structured context fields and system instructions are optional, providing robust context enables the agent to give more accurate and relevant responses.

After you define the structured fields and system instructions that make up your authored context, you can provide that context to the API in one of the following calls:

Creating a persistent data agent: Include the authored context within the published_context object in the request body to configure agent behavior that persists across multiple conversations. For more information, see Create a data agent (HTTP) or Set up context for stateful or stateless chat (Python SDK).
Sending a stateless request: Provide the authored context within the inline_context object in a chat request to define the agent's behavior for that specific API call. For more information, see Create a stateless multi-turn conversation (HTTP) or Send a stateless chat request with inline context (Python SDK).
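
As a rough sketch of the two options above, the following Python helper places the same authored context under either published_context or inline_context. The helper name wrap_context and the shape of the surrounding dictionary are hypothetical; only the two key names come from this page, and a real request body contains other fields that are described in the linked guides.

```python
from typing import Any


def wrap_context(context: dict[str, Any], persistent: bool) -> dict[str, Any]:
    """Place authored context under the key named in the text above.

    Hypothetical helper: only the key names published_context and
    inline_context come from this page. The rest of a real request
    body is intentionally omitted.
    """
    key = "published_context" if persistent else "inline_context"
    return {key: context}


# Illustrative context; the system_instruction value is a YAML-formatted string.
authored_context = {
    "system_instruction": "- system_instruction: You are a sales agent.",
}

# Fragment for creating a persistent data agent.
agent_fragment = wrap_context(authored_context, persistent=True)

# Fragment for a single stateless chat request.
chat_fragment = wrap_context(authored_context, persistent=False)
```

Keeping the context in one variable and wrapping it at the last step makes it easier to test a context with stateless requests before publishing it on a persistent agent.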

Define structured context fields

This section describes how to provide context to a data agent by using structured context fields. You can provide the following information to an agent as structured context:

Table-level structured context, including a description, synonyms, and tags for a table
Column-level structured context, including a description, synonyms, tags, and sample values for a table's columns
Example queries, which pair natural language questions with their corresponding SQL queries to guide the agent

Table-level structured context

Use the tableReferences key to provide an agent with details about the specific tables that are available for answering questions. For each table reference, you can use the following structured context fields to define a table's schema:

description: A summary of the table's content and purpose
synonyms: A list of alternative terms that can be used to refer to the table
tags: A list of keywords or tags that are associated with the table

The following examples show how to provide these properties as structured context within direct HTTP requests and with the Python SDK.

HTTP

In a direct HTTP request, you provide these table-level properties within the schema object for the relevant table reference. For a full example of how to structure the complete request payload, see Connect to BigQuery data.

"tableReferences": [
  {
    "projectId": "bigquery-public-data",
    "datasetId": "thelook_ecommerce",
    "tableId": "orders",
    "schema": {
        "description": "Data for orders in The Look, a fictitious ecommerce store.",
        "synonyms": ["sales"],
        "tags": ["sale", "order", "sales_order"]
    }
  },
  {
    "projectId": "bigquery-public-data",
    "datasetId": "thelook_ecommerce",
    "tableId": "users",
    "schema": {
        "description": "Data for users in The Look, a fictitious ecommerce store.",
        "synonyms": ["customers"],
        "tags": ["user", "customer", "buyer"]
    }
  }
]

Python SDK

When you use the Python SDK, you can define these table-level properties on the schema property of a BigQueryTableReference object. The following example shows how to create table reference objects that provide context for the orders and users tables. For a full example of how to build and use table reference objects, see Connect to BigQuery data.

from google.cloud import geminidataanalytics

# Define context for the 'orders' table
bigquery_table_reference_1 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_1.project_id = "bigquery-public-data"
bigquery_table_reference_1.dataset_id = "thelook_ecommerce"
bigquery_table_reference_1.table_id = "orders"

bigquery_table_reference_1.schema = geminidataanalytics.Schema()
bigquery_table_reference_1.schema.description = "Data for orders in The Look, a fictitious ecommerce store."
bigquery_table_reference_1.schema.synonyms = ["sales"]
bigquery_table_reference_1.schema.tags = ["sale", "order", "sales_order"]

# Define context for the 'users' table
bigquery_table_reference_2 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_2.project_id = "bigquery-public-data"
bigquery_table_reference_2.dataset_id = "thelook_ecommerce"
bigquery_table_reference_2.table_id = "users"

bigquery_table_reference_2.schema = geminidataanalytics.Schema()
bigquery_table_reference_2.schema.description = "Data for users in The Look, a fictitious ecommerce store."
bigquery_table_reference_2.schema.synonyms = ["customers"]
bigquery_table_reference_2.schema.tags = ["user", "customer", "buyer"]

Column-level structured context

The fields key, which is nested within the schema object of a table reference, takes a list of field objects that describe individual columns. Not all fields need additional context. For commonly used fields, however, including additional details can help improve the agent's performance.

For each field object, you can use the following structured context fields to define a column's fundamental properties:

description: A brief description of the column's content and purpose
synonyms: A list of alternative terms that can be used to refer to the column
tags: A list of keywords or tags that are associated with the column

The following examples show how to provide these properties as structured context for the status field within the orders table and for the first_name field within the users table, with direct HTTP requests and with the Python SDK.

HTTP

In a direct HTTP request, you can define these column-level properties by providing a list of field objects under the fields key within the schema object of a table reference.

"tableReferences": [
  {
    "projectId": "bigquery-public-data",
    "datasetId": "thelook_ecommerce",
    "tableId": "orders",
    "schema": {
      "fields": [{
          "name": "status",
          "description": "The current status of the order."
      }]
    }
  },
  {
    "projectId": "bigquery-public-data",
    "datasetId": "thelook_ecommerce",
    "tableId": "users",
    "schema": {
      "fields": [{
          "name": "first_name",
          "description": "The first name of the user.",
          "tags": ["person"]
      }]
    }
  }
]
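
Because synonyms and tags are lists, and JSON doesn't allow trailing commas, a quick local check can catch a malformed payload before you send it. The following is a minimal sketch that uses only the Python standard library. The check_table_references helper is hypothetical and isn't part of the API, and the sample payload deliberately contains a tags value that is a string instead of a list.

```python
import json


def check_table_references(payload: str) -> list[str]:
    """Return a list of problems found in a JSON array of table references."""
    problems = []
    try:
        table_references = json.loads(payload)
    except json.JSONDecodeError as err:
        # Trailing commas and other syntax errors are reported here.
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]

    for table in table_references:
        schema = table.get("schema", {})
        table_id = table.get("tableId", "?")
        # Check the table-level entry and every column-level entry.
        entries = [(table_id, schema)]
        for field in schema.get("fields", []):
            entries.append((f"{table_id}.{field.get('name', '?')}", field))
        for label, entry in entries:
            for key in ("synonyms", "tags"):
                if key in entry and not isinstance(entry[key], list):
                    problems.append(f"{label}: {key} should be a list")
    return problems


payload = """
[
  {
    "projectId": "bigquery-public-data",
    "datasetId": "thelook_ecommerce",
    "tableId": "users",
    "schema": {
      "fields": [
        {"name": "first_name", "description": "The first name of the user.", "tags": "person"}
      ]
    }
  }
]
"""

problems = check_table_references(payload)
print(problems)  # ['users.first_name: tags should be a list']
```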

Python SDK

When you use the Python SDK, you can define these column-level properties by assigning a list of Field objects to the fields property of a table's schema property.

# Define column context for the 'orders' table
bigquery_table_reference_1.schema.fields = [
    geminidataanalytics.Field(
        name="status",
        description="The current status of the order.",
    )
]

# Define column context for the 'users' table
bigquery_table_reference_2.schema.fields = [
    geminidataanalytics.Field(
        name="first_name",
        description="The first name of the user.",
        tags=["person"],
    )
]

Example queries

The example_queries key takes a list of example_query objects that define natural language queries, which help the agent provide more accurate and relevant responses to common or important questions. By providing the agent with both a natural language question and its corresponding SQL query, you can guide the agent to provide higher quality and more consistent results.

For each example_query object, you can provide the following fields to define a natural language question and its corresponding SQL query:

natural_language_question: The natural language question that a user might ask
sql_query: The SQL query that corresponds to the natural language question

The following examples show how to provide example queries for the orders table with both direct HTTP requests and the Python SDK.

HTTP

In a direct HTTP request, provide a list of example_query objects in the example_queries field. Each object must contain a naturalLanguageQuestion key and a corresponding sqlQuery key.

"example_queries": [
  {
    "naturalLanguageQuestion": "How many orders are there?",
    "sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`"
  },
  {
    "naturalLanguageQuestion": "How many orders were shipped?",
    "sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'"
  }
]

Python SDK

When you use the Python SDK, you can provide a list of ExampleQuery objects. For each object, provide values for the natural_language_question and sql_query parameters.

from google.cloud import geminidataanalytics

example_queries = [
    geminidataanalytics.ExampleQuery(
        natural_language_question="How many orders are there?",
        sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`",
    ),
    geminidataanalytics.ExampleQuery(
        natural_language_question="How many orders were shipped?",
        sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'",
    )
]
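
If you maintain question and SQL pairs in one place, you can generate the HTTP-style list from them instead of writing each object by hand. The following is a minimal sketch that uses plain dictionaries and the standard library. The to_example_queries helper is hypothetical; the key names are the ones shown in the HTTP example.

```python
import json

# (question, SQL) pairs taken from the examples on this page.
pairs = [
    (
        "How many orders are there?",
        "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`",
    ),
    (
        "How many orders were shipped?",
        "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` "
        "WHERE status = 'shipped'",
    ),
]


def to_example_queries(pairs):
    """Convert (question, sql) tuples into a list of example query objects."""
    return [
        {"naturalLanguageQuestion": question, "sqlQuery": sql}
        for question, sql in pairs
    ]


example_queries = to_example_queries(pairs)

# Serialize the list the same way that it appears in the HTTP example.
print(json.dumps({"example_queries": example_queries}, indent=2))
```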

Define additional context in system instructions

You can use the system_instruction parameter to provide supplemental guidance for context that isn't supported by structured context fields. By providing this additional guidance, you can help the agent better understand the context of your data and use case.

System instructions consist of a series of key components and objects that provide the data agent with details about the data source and guidance about the agent's role when answering questions. You can provide system instructions to the data agent in the system_instruction parameter as a YAML-formatted string.

The following template shows a suggested YAML structure for the string that you can provide to the system_instruction parameter for a BigQuery data source, including available keys and expected data types. Although this template provides a suggested structure with important components for defining system instructions, it doesn't include all possible system instruction formats.

- system_instruction: str # A description of the expected behavior of the agent. For example: You are a sales agent.
- tables: # A list of tables to describe for the agent.
  - table: # Details about a single table that is relevant for the agent.
    - name: str # The name of the table.
    - fields: # Details about columns (fields) within the table.
      - field: # Details about a single column within the current table.
        - name: str # The name of the column.
        - aggregations: list[str] # Commonly used or default aggregations for the column.
- relationships: # A list of join relationships between tables.
  - relationship: # Details about a single join relationship.
    - name: str # The name of this join relationship.
    - description: str # A description of the relationship.
    - relationship_type: str # The join relationship type: one-to-one, one-to-many, many-to-one, or many-to-many.
    - join_type: str # The join type: inner, outer, left, right, or full.
    - left_table: str # The name of the left table in the join.
    - right_table: str # The name of the right table in the join.
    - relationship_columns: # A list of columns that are used for the join.
      - left_column: str # The join column from the left table.
      - right_column: str # The join column from the right table.
- glossaries: # A list of definitions for glossary business terms, jargon, and abbreviations.
  - glossary: # The definition for a single glossary item.
    - term: str # The term, phrase, or abbreviation to define.
    - description: str # A description or definition of the term.
    - synonyms: list[str] # Alternative terms for the glossary entry.
- additional_descriptions: # A list of any other general instructions or content.
  - text: str # Any additional general instructions or context not covered elsewhere.
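
Because the system_instruction parameter takes a string, one way to keep the YAML readable in Python source is a dedented triple-quoted literal. The following is a minimal sketch that uses only the standard library; the variable name is illustrative, and the content is a shortened version of the examples on this page.

```python
import textwrap

# textwrap.dedent strips the common leading indentation, so the YAML can be
# indented along with the surrounding Python code.
system_instruction = textwrap.dedent("""\
    - system_instruction: >-
        You are an expert sales analyst for a fictitious ecommerce store.
    - additional_descriptions:
      - text: All the sales data pertains to The Look, a fictitious ecommerce store.
    """)

print(system_instruction)
```

The relative indentation inside the literal is preserved, which matters because YAML nesting depends on it.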

The following sections contain examples of key components of system instructions:

system_instruction
tables
relationships
glossaries
additional_descriptions

`system_instruction`

Use the system_instruction key to define the agent's role and persona. This initial instruction sets the tone and style of the API's responses and helps the agent understand its core purpose.

For example, you can define an agent as a sales analyst for a fictitious ecommerce store as follows:

- system_instruction: >-
    You are an expert sales analyst for a fictitious ecommerce store. You will answer questions about sales, orders, and customer data. Your responses should be concise and data-driven.

`tables`

Although you define the fundamental properties of a table (such as its description and synonyms) as structured context, you can also use the tables key within system instructions to provide supplemental business logic. For BigQuery data sources, this includes using the fields key to define default aggregations for specific columns.

The following sample YAML code block shows how to use the tables key within your system instructions to nest fields that provide supplemental guidance for the table bigquery-public-data.thelook_ecommerce.orders:

- tables:
  - table:
    - name: bigquery-public-data.thelook_ecommerce.orders
    - fields:
      - field:
        - name: num_of_items
        - aggregations: 'sum, avg'

`relationships`

The relationships key in your system instructions contains a list of join relationships between tables. By defining join relationships, you can help the agent understand how to join data from multiple tables when answering questions.

For example, you can define an orders_to_user relationship between the bigquery-public-data.thelook_ecommerce.orders table and the bigquery-public-data.thelook_ecommerce.users table as follows:

- relationships:
  - relationship:
    - name: orders_to_user
    - description: >-
        Connects customer order data to user information with the user_id and id fields to allow an aggregated view of sales by customer demographics.
    - relationship_type: many-to-one
    - join_type: left
    - left_table: bigquery-public-data.thelook_ecommerce.orders
    - right_table: bigquery-public-data.thelook_ecommerce.users
    - relationship_columns:
      - left_column: user_id
      - right_column: id
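
To see what this relationship describes in SQL terms, the following sketch renders the corresponding JOIN clause from a Python dictionary that mirrors the YAML. The describe_join helper and the table aliases l and r are hypothetical, and the dictionary simplifies relationship_columns into a single column pair. The SQL that the agent actually generates might differ.

```python
# A Python mirror of the orders_to_user relationship defined above.
relationship = {
    "name": "orders_to_user",
    "join_type": "left",
    "left_table": "bigquery-public-data.thelook_ecommerce.orders",
    "right_table": "bigquery-public-data.thelook_ecommerce.users",
    "relationship_columns": [{"left_column": "user_id", "right_column": "id"}],
}


def describe_join(rel: dict) -> str:
    """Render the JOIN clause that a relationship definition corresponds to."""
    conditions = " AND ".join(
        f"l.{cols['left_column']} = r.{cols['right_column']}"
        for cols in rel["relationship_columns"]
    )
    return (
        f"FROM `{rel['left_table']}` AS l\n"
        f"{rel['join_type'].upper()} JOIN `{rel['right_table']}` AS r\n"
        f"  ON {conditions}"
    )


join_sql = describe_join(relationship)
print(join_sql)
```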

`glossaries`

The glossaries key in your system instructions lists definitions for business terms, jargon, and abbreviations that are relevant to your data and use case. By providing glossary definitions, you can help the agent accurately interpret and answer questions that use specific business language.

For example, you can define terms like common business statuses and "OMPF" according to your specific business context as follows:

- glossaries:
  - glossary:
    - term: complete
    - description: Represents an order status where the order has been completed.
    - synonyms: 'finish, done, fulfilled'
  - glossary:
    - term: shipped
    - description: Represents an order status where the order has been shipped to the customer.
  - glossary:
    - term: returned
    - description: Represents an order status where the customer has returned the order.
  - glossary:
    - term: OMPF
    - description: Order Management and Product Fulfillment

`additional_descriptions`

Use the additional_descriptions key to provide any general instructions or context that doesn't fit into other structured context fields or system instruction keys. By providing additional descriptions in your system instructions, you can help the agent better understand the context of your data and use case.

For example, you can use the additional_descriptions key to provide information about your organization as follows:

- additional_descriptions:
  - text: All the sales data pertains to The Look, a fictitious ecommerce store.
  - text: 'Orders can be of three categories: food, clothes, and electronics.'

Example: Authored context for a sales agent

The following example for a fictitious sales analyst agent shows how to provide authored context by using a combination of structured context and system instructions.

Example: Structured context

You can provide structured context with details about tables, columns, and example queries to guide the agent, as shown in the following HTTP and Python SDK examples.

HTTP

The following example shows how to define structured context in an HTTP request:

{
  "bq": {
    "tableReferences": [
      {
        "projectId": "bigquery-public-data",
        "datasetId": "thelook_ecommerce",
        "tableId": "orders",
        "schema": {
          "description": "Data for orders in The Look, a fictitious ecommerce store.",
          "synonyms": ["sales"],
          "tags": ["sale", "order", "sales_order"],
          "fields": [
            {
              "name": "status",
              "description": "The current status of the order."
            },
            {
              "name": "num_of_items",
              "description": "The number of items in the order."
            }
          ]
        }
      },
      {
        "projectId": "bigquery-public-data",
        "datasetId": "thelook_ecommerce",
        "tableId": "users",
        "schema": {
          "description": "Data for users in The Look, a fictitious ecommerce store.",
          "synonyms": ["customers"],
          "tags": ["user", "customer", "buyer"],
          "fields": [
            {
              "name": "first_name",
              "description": "The first name of the user.",
              "tags": ["person"]
            },
            {
              "name": "last_name",
              "description": "The last name of the user.",
              "tags": ["person"]
            },
            {
              "name": "age_group",
              "description": "The age demographic group of the user."
            },
            {
              "name": "email",
              "description": "The email address of the user.",
              "tags": ["contact"]
            }
          ]
        }
      }
    ]
  },
  "example_queries": [
    {
      "naturalLanguageQuestion": "How many orders are there?",
      "sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`"
    },
    {
      "naturalLanguageQuestion": "How many orders were shipped?",
      "sqlQuery": "SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'"
    },
    {
      "naturalLanguageQuestion": "How many unique customers are there?",
      "sqlQuery": "SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users`"
    },
    {
      "naturalLanguageQuestion": "How many users in the 25-34 age group have a cymbalgroup email address?",
      "sqlQuery": "SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users` WHERE users.age_group = '25-34' AND users.email LIKE '%@cymbalgroup.com'"
    }
  ]
}

Python SDK

The following example shows how to define structured context with the Python SDK:

from google.cloud import geminidataanalytics

# Define context for the 'orders' table
bigquery_table_reference_1 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_1.project_id = "bigquery-public-data"
bigquery_table_reference_1.dataset_id = "thelook_ecommerce"
bigquery_table_reference_1.table_id = "orders"

bigquery_table_reference_1.schema = geminidataanalytics.Schema()
bigquery_table_reference_1.schema.description = "Data for orders in The Look, a fictitious ecommerce store."
bigquery_table_reference_1.schema.synonyms = ["sales"]
bigquery_table_reference_1.schema.tags = ["sale", "order", "sales_order"]
bigquery_table_reference_1.schema.fields = [
    geminidataanalytics.Field(
        name="status",
        description="The current status of the order.",
    ),
    geminidataanalytics.Field(
        name="num_of_items",
        description="The number of items in the order."
    )
]

# Define context for the 'users' table
bigquery_table_reference_2 = geminidataanalytics.BigQueryTableReference()
bigquery_table_reference_2.project_id = "bigquery-public-data"
bigquery_table_reference_2.dataset_id = "thelook_ecommerce"
bigquery_table_reference_2.table_id = "users"

bigquery_table_reference_2.schema = geminidataanalytics.Schema()
bigquery_table_reference_2.schema.description = "Data for users in The Look, a fictitious ecommerce store."
bigquery_table_reference_2.schema.synonyms = ["customers"]
bigquery_table_reference_2.schema.tags = ["user", "customer", "buyer"]
bigquery_table_reference_2.schema.fields = [
    geminidataanalytics.Field(
        name="first_name",
        description="The first name of the user.",
        tags=["person"],
    ),
    geminidataanalytics.Field(
        name="last_name",
        description="The last name of the user.",
        tags=["person"],
    ),
    geminidataanalytics.Field(
        name="age_group",
        description="The age demographic group of the user.",
    ),
    geminidataanalytics.Field(
        name="email",
        description="The email address of the user.",
        tags=["contact"],
    )
]

# Define example queries
example_queries = [
  geminidataanalytics.ExampleQuery(
      natural_language_question="How many orders are there?",
      sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders`",
  ),
  geminidataanalytics.ExampleQuery(
      natural_language_question="How many orders were shipped?",
      sql_query="SELECT COUNT(*) FROM `bigquery-public-data.thelook_ecommerce.orders` WHERE status = 'shipped'",
  ),
  geminidataanalytics.ExampleQuery(
      natural_language_question="How many unique customers are there?",
      sql_query="SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users`",
  ),
  geminidataanalytics.ExampleQuery(
      natural_language_question="How many users in the 25-34 age group have a cymbalgroup email address?",
      sql_query="SELECT COUNT(DISTINCT id) FROM `bigquery-public-data.thelook_ecommerce.users` WHERE users.age_group = '25-34' AND users.email LIKE '%@cymbalgroup.com'",
  )
]

Example: System instructions

The following system instructions supplement the structured context by defining the agent's persona and providing guidance that isn't supported by structured fields, such as relationship definitions, glossary terms, additional descriptions, and supplemental details for the orders table. In this example, because the users table is fully defined with structured context, it doesn't need to be redefined in the system instructions.

- system_instruction: >-
    You are an expert sales analyst for a fictitious ecommerce store. You will answer questions about sales, orders, and customer data. Your responses should be concise and data-driven.
- tables:
    - table:
        - name: bigquery-public-data.thelook_ecommerce.orders
        - fields:
            - field:
                - name: num_of_items
                - aggregations: 'sum, avg'
- relationships:
    - relationship:
        - name: orders_to_user
        - description: >-
            Connects customer order data to user information with the user_id and id fields.
        - relationship_type: many-to-one
        - join_type: left
        - left_table: bigquery-public-data.thelook_ecommerce.orders
        - right_table: bigquery-public-data.thelook_ecommerce.users
        - relationship_columns:
            - left_column: user_id
            - right_column: id
- glossaries:
    - glossary:
        - term: complete
        - description: Represents an order status where the order has been completed.
        - synonyms: 'finish, done, fulfilled'
    - glossary:
        - term: OMPF
        - description: Order Management and Product Fulfillment
- additional_descriptions:
    - text: All the sales data pertains to The Look, a fictitious ecommerce store.
