本教程介绍如何使用 Google Cloud 控制台设置和使用 AlloyDB AI Natural Language API。您将了解如何配置 AlloyDB AI Natural Language API,以便您可以提出自然语言问题并接收 SQL 查询和结果。
本教程中的示例仅用于演示目的。
前提条件
启用结算功能及所需 API
在 Google Cloud 控制台中,选择一个项目。
启用创建和连接到 AlloyDB for PostgreSQL 所需的 Cloud API。
- 在确认项目步骤中,点击下一步以确认您要更改的项目的名称。
在启用 API 步骤中,点击启用以启用以下内容:
- AlloyDB API
创建并连接到数据库
- 创建集群及其主实例。
- 连接到您的实例并创建数据库。
- 启用 Vertex AI 集成。如需了解详情,请参阅与 Vertex AI 集成。
启用并安装所需扩展程序
如需安装和使用 alloydb_ai_nl extension
,您必须先使用 alloydb_ai_nl.enabled
标志启用该扩展程序。如需了解详情,请参阅配置实例的数据库标志。
如需安装 alloydb_ai_nl
扩展程序(即 AlloyDB AI 支持的 Natural Language API),请运行以下查询:
CREATE EXTENSION alloydb_ai_nl cascade;
由于在主实例上创建的 AlloyDB AI 自然语言对象会传播到只读副本和跨区域副本,因此请确保在每个 AlloyDB for PostgreSQL 实例上启用 AlloyDB AI 自然语言标志。
升级 alloydb_ai_nl 扩展程序
如果您已安装该扩展程序,请运行以下语句将其更新到最新版本:
ALTER EXTENSION alloydb_ai_nl UPDATE;
创建 nla_demo 架构及表
在以下步骤中,您将在架构中创建 nla_demo
架构和表。您可以使用合成数据填充表格。所提供的架构和数据旨在支持线上零售业务的基本运营,其潜在应用范围可扩展到客户管理、分析、营销和运营方面。
示例数据展示了如何使用 AlloyDB AI 自然语言进行开发、测试和演示,尤其是自然语言界面等功能。
运行以下查询来创建架构:
CREATE SCHEMA nla_demo;
在
nla_demo
架构中创建表。addresses
表存储客户和订单的地址信息。CREATE TABLE nla_demo.addresses ( address_id SERIAL PRIMARY KEY, street_address VARCHAR(255) NOT NULL, city VARCHAR(255) NOT NULL, country VARCHAR(255) );
通过运行以下查询,来创建
customers
表。此表存储客户信息,包括客户 ID、姓名、详细联系信息、地址参考信息、出生日期和记录创建时间。CREATE TABLE nla_demo.customers ( customer_id SERIAL PRIMARY KEY, first_name VARCHAR(255) NOT NULL, last_name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL, address_id INTEGER REFERENCES nla_demo.addresses(address_id), date_of_birth DATE, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP );
创建用于存储商品类别的
categories
表。CREATE TABLE nla_demo.categories ( category_id INTEGER PRIMARY KEY, category_name VARCHAR(255) UNIQUE NOT NULL );
创建用于存储品牌名称的
brands
表。CREATE TABLE nla_demo.brands ( brand_id INTEGER PRIMARY KEY, brand_name VARCHAR(255) NOT NULL );
创建
products
表,用于存储商品信息,例如商品 ID、名称、说明、品牌、类别关联和记录创建时间。CREATE TABLE nla_demo.products ( product_id INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, description TEXT DEFAULT 'Not available', brand_id INTEGER REFERENCES nla_demo.brands(brand_id), category_id INTEGER REFERENCES nla_demo.categories(category_id), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, price DECIMAL(10, 2), description_embedding VECTOR(768) );
创建
orders
表。此表存储有关客户订单的信息,包括客户、日期、总金额、送货地址和账单邮寄地址以及订单状态。CREATE TABLE nla_demo.orders ( order_id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES nla_demo.customers(customer_id), order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP, total_amount DECIMAL(10, 2) NOT NULL, shipping_address_id INTEGER REFERENCES nla_demo.addresses(address_id), billing_address_id INTEGER REFERENCES nla_demo.addresses(address_id), order_status VARCHAR(50) );
创建
order_items
表。此表记录订单中的各个商品,链接到订单和商品款式,并指定数量和价格。CREATE TABLE nla_demo.order_items ( order_item_id SERIAL PRIMARY KEY, order_id INTEGER REFERENCES nla_demo.orders(order_id), product_id INTEGER REFERENCES nla_demo.products(product_id), quantity INTEGER NOT NULL, price DECIMAL(10, 2) NOT NULL );
填充 nla_demo 架构中的表
通过运行以下查询,来填充
addresses
表。INSERT INTO nla_demo.addresses (street_address, city, country) VALUES ('1800 Amphibious Blvd', 'Mountain View', 'USA'), ('Avenida da Pastelaria, 1903', 'Lisbon', 'Portugal'), ('8 Rue du Nom Fictif 341', 'Paris', 'France');
填充
customers
表。INSERT INTO nla_demo.customers (first_name, last_name, email, address_id, date_of_birth) VALUES ('Alex', 'B.', 'alex.b@example.com', 1, '2003-02-20'), ('Amal', 'M.', 'amal.m@example.com', 2, '1998-11-08'), ('Dani', 'G.', 'dani.g@example.com', 3, '2002-07-25');
填充
categories
表。INSERT INTO nla_demo.categories (category_id, category_name) VALUES (1, 'Accessories'), (2, 'Apparel'), (3, 'Footwear'), (4, 'Swimwear');
填充
brands
表。INSERT INTO nla_demo.brands (brand_id, brand_name) VALUES (1, 'CymbalPrime'), (2, 'CymbalPro'), (3, 'CymbalSports');
填充
products
表。INSERT INTO nla_demo.products (product_id, brand_id, category_id, name, description, price) VALUES (1, 1, 2, 'Hoodie', 'A comfortable, casual sweatshirt with an attached hood.', 79.99), (2, 1, 3, 'Running Shoes', 'Lightweight, cushioned footwear designed for the impact of running.', 99.99), (3, 2, 4, 'Swimsuit', 'A garment designed for swimming or other water activities.', 20.00), (4, 3, 1, 'Tote Bag', 'A large, unfastened bag with two parallel handles.', 69.99), (5, 3, 3, 'CymbalShoe', 'Footwear from Cymbal, designed for your life''s rhythm.', 89.99); UPDATE nla_demo.products SET description_embedding = embedding('text-embedding-004', description);
填充
orders
表。INSERT INTO nla_demo.orders (order_id, customer_id, total_amount, shipping_address_id, billing_address_id, order_status) VALUES (1, 1, 99.99, 1, 1, 'Shipped'), (2, 1, 69.99, 1, 1, 'Delivered'), (3, 2, 20.99, 2, 2, 'Processing'), (4, 3, 79.99, 3, 3, 'Shipped');
填充
order_items
表。INSERT INTO nla_demo.order_items (order_id, product_id, quantity, price) VALUES (1, 1, 1, 79.99), (1, 3, 1, 20.00), (2, 4, 1, 69.99), (3, 3, 1, 20.00), (4, 2, 1, 79.99);
创建自然语言配置
如需使用 AlloyDB AI 自然语言,请确保配置 Vertex AI 端点。然后,您需要创建配置并注册架构。
g_alloydb_ai_nl.g_create_configuration
会创建模型。
创建自然语言配置。
SELECT alloydb_ai_nl.g_create_configuration( 'nla_demo_cfg' );
将表注册到
nla_demo_cfg
配置。SELECT alloydb_ai_nl.g_manage_configuration( operation => 'register_table_view', configuration_id_in => 'nla_demo_cfg', table_views_in=>'{nla_demo.customers, nla_demo.addresses, nla_demo.brands, nla_demo.products, nla_demo.categories, nla_demo.orders, nla_demo.order_items}' );
为表和列创建并应用上下文
为了针对自然语言问题提供准确的答案,您可以使用 AlloyDB AI Natural Language API 提供有关表、视图和列的上下文。您可以使用 AlloyDB AI Natural Language API 的自动上下文生成功能从表和列生成上下文,并应用作为 COMMENTS
附加到表、视图和列的上下文。
如需为在
nla_demo_cfg
配置中注册的表及其列生成架构上下文,请运行以下命令:SELECT alloydb_ai_nl.generate_schema_context( 'nla_demo_cfg', TRUE );
上述查询会使用上下文填充
alloydb_ai_nl.generated_schema_context_view
视图。传递TRUE
会覆盖此视图中之前运行的上下文。如需验证为
nla_demo.products
表生成的上下文,请运行以下查询:SELECT object_context FROM alloydb_ai_nl.generated_schema_context_view WHERE schema_object = 'nla_demo.products';
生成的上下文类似于以下内容:
The products table stores information about products, including their name, a brief description, the brand they belong to (referenced by brand_id), and the category they fall under (referenced by category_id). Each product has a unique identifier (product_id) and a timestamp indicating its creation time (created_at).
如需验证为列(例如
nla_demo.products.name
)生成的上下文,请运行以下命令:SELECT object_context FROM alloydb_ai_nl.generated_schema_context_view WHERE schema_object = 'nla_demo.products.name';
查询输出类似于以下内容:
The name column in the nla_demo.products table contains the specific name or title of each product. This is a short, descriptive text string that clearly identifies the product, like "Hoodie," "Tote Bag," "Running Shoes," or "Swimsuit." It helps distinguish individual products within the broader context of their brand and category. The name column specifies the exact product. This column is essential for users and systems to identify and refer to specific products within the database.
在
alloydb_ai_nl.generated_schema_context_view
视图中查看生成的上下文,并更新需要修订的上下文。SELECT alloydb_ai_nl.update_generated_relation_context( 'nla_demo.products', 'The "nla_demo.products" table stores product details such as ID, name, description, brand, category linkage, and record creation time.' ); SELECT alloydb_ai_nl.update_generated_column_context( 'nla_demo.products.name', 'The "name" column in the "nla_demo.products" table contains the specific name or title of each product.' );
应用要附加到相应对象的生成的上下文:
SELECT alloydb_ai_nl.apply_generated_relation_context( 'nla_demo.products', true ); SELECT alloydb_ai_nl.apply_generated_column_context( 'nla_demo.products.name', true );
alloydb_ai_nl.generated_schema_context_view
视图中的生成的上下文条目会应用于相应的架构对象,并且会覆盖掉注释。
构建值索引
AlloyDB AI Natural Language API 使用值关联来生成准确的 SQL 查询。值关联会将自然语言语句中的值短语与预注册的概念类型和列名称关联,从而可以丰富自然语言问题。
例如,如果 Hoodie
与 product_name
概念关联,而后者又与 nla_demo.products.name
关联,那么系统可以更准确地回答“告诉我连帽衫的价格”这一问题。
列。
如需定义
product_name
概念类型并将其与nla_demo.products.name
列关联,请运行以下查询:SELECT alloydb_ai_nl.add_concept_type( concept_type_in => 'product_name', match_function_in => 'alloydb_ai_nl.get_concept_and_value_generic_entity_name', additional_info_in => '{ "description": "Concept type for product name.", "examples": "SELECT alloydb_ai_nl.get_concept_and_value_generic_entity_name(''Hoodie'')" }'::jsonb ); SELECT alloydb_ai_nl.associate_concept_type( 'nla_demo.products.name', 'product_name', 'nla_demo_cfg' );
如需验证
product_name
概念类型是否已添加到概念类型列表中,请运行以下查询,确保product_name
包含在此查询的结果中:SELECT alloydb_ai_nl.list_concept_types();
如需验证
nla_demo.products.name
列是否与product_name
概念类型关联,请运行以下查询:SELECT * FROM alloydb_ai_nl.value_index_columns WHERE column_names = 'nla_demo.products.name';
如需定义
brand_name
概念类型并将其与nla_demo.brands.brand_name
列关联,请运行以下查询:SELECT alloydb_ai_nl.add_concept_type( concept_type_in => 'brand_name', match_function_in => 'alloydb_ai_nl.get_concept_and_value_generic_entity_name', additional_info_in => '{ "description": "Concept type for brand name.", "examples": "SELECT alloydb_ai_nl.get_concept_and_value_generic_entity_name(''CymbalPrime'')" }'::jsonb ); SELECT alloydb_ai_nl.associate_concept_type( 'nla_demo.brands.brand_name', 'brand_name', 'nla_demo_cfg' );
定义概念类型并将列与其关联后,请创建值索引。
SELECT alloydb_ai_nl.create_value_index('nla_demo_cfg'); SELECT alloydb_ai_nl.refresh_value_index('nla_demo_cfg');
自动生成概念类型关联
借助 AlloyDB AI 自然语言,您可以根据现有的概念类型自动生成关联,而无需手动将概念类型与列相关联,例如手动调用 alloydb_ai_nl.associate_concept_type
。
如需自动生成概念类型关联,请按以下步骤操作:
为
nla_demo_cfg
范围内的所有关系生成关联:SELECT alloydb_ai_nl.generate_concept_type_associations('nla_demo_cfg');
查看生成的关联。
SELECT * from alloydb_ai_nl.generated_value_index_columns_view;
结果类似于以下内容。系统会同时考虑内置概念和用户定义的概念。
-[ RECORD 1 ]---+----------------------------------------------------------- id | 1 config | nla_demo_cfg column_names | nla_demo.addresses.city concept_type | city_name additional_info | {} -[ RECORD 2 ]---+----------------------------------------------------------- id | 2 config | nla_demo_cfg column_names | nla_demo.addresses.country concept_type | country_name additional_info | {} -[ RECORD 3 ]---+----------------------------------------------------------- id | 3 config | nla_demo_cfg column_names | nla_demo.customers.first_name,nla_demo.customers.last_name concept_type | full_person_name additional_info | {} -[ RECORD 4 ]---+----------------------------------------------------------- id | 4 config | nla_demo_cfg column_names | nla_demo.brands.brand_name concept_type | brand_name additional_info | {} -[ RECORD 5 ]---+----------------------------------------------------------- id | 5 config | nla_demo_cfg column_names | nla_demo.products.name concept_type | product_name additional_info | {} .... ```
可选:更新或舍弃生成的关联。
-- Update, NULL means keeping the original value. SELECT alloydb_ai_nl.update_generated_concept_type_associations( id => 1, column_names => NULL, concept_type => 'generic_entity_name', additional_info => NULL ); -- Drop SELECT alloydb_ai_nl.drop_generated_concept_type_association(id => 1);
应用生成的关联。
SELECT alloydb_ai_nl.apply_generated_concept_type_associations('nla_demo_cfg');
如需反映这些更改,请刷新值索引。
SELECT alloydb_ai_nl.refresh_value_index('nla_demo_cfg');
定义查询模板
您可以定义模板,以提高 AlloyDB AI Natural Language API 生成的答案的质量。
如需为业务关键问题提供示例模板,并提供预期高度准确的预期问题,请运行以下查询以添加模板:
SELECT alloydb_ai_nl.add_template( nl_config_id => 'nla_demo_cfg', intent => 'List the first names and the last names of all customers who ordered Swimsuit.', sql => 'SELECT c.first_name, c.last_name FROM nla_demo.Customers c JOIN nla_demo.orders o ON c.customer_id = o.customer_id JOIN nla_demo.order_items oi ON o.order_id = oi.order_id JOIN nla_demo.products p ON oi.product_id = p.product_id AND p.name = ''Swimsuit''', sql_explanation => 'To answer this question, JOIN `nla_demo.Customers` with `nla_demo.orders` on having the same `customer_id`, and JOIN the result with nla_demo.order_items on having the same `order_id`. Then JOIN the result with `nla_demo.products` on having the same `product_id`, and filter rwos that with p.name = ''Swimsuit''. Return the `first_name` and the `last_name` of the customers with matching records.', check_intent => TRUE );
如需查看已添加的模板的列表,请查询视图
alloydb_ai_nl.template_store_view
:SELECT nl, sql, intent, psql, pintent FROM alloydb_ai_nl.template_store_view WHERE config = 'nla_demo_cfg';
系统会返回以下输出:
nl | List the first names and the last names of all customers who ordered Swimsuit. sql | SELECT c.first_name, c.last_name | FROM nla_demo.Customers c | JOIN nla_demo.orders o ON c.customer_id = o.customer_id | JOIN nla_demo.order_items oi ON o.order_id = oi.order_id | JOIN nla_demo.products p ON oi.product_id = p.product_id | AND p.name = 'Swimsuit' intent | List the first names and the last names of all customers who ordered | Swimsuit. psql | SELECT c.first_name, c.last_name | FROM nla_demo.Customers c JOIN nla_demo.orders o | ON c.customer_id = o.customer_id | JOIN nla_demo.order_items oi ON o.order_id = oi.order_id | JOIN nla_demo.products p ON oi.product_id = p.product_id | AND p.name = $1 pintent | List the first names and the last names of all customers who ordered | $1.
在此模板中,与
psql
属性对应的值是参数化 SQL 查询,而pintent
列的值是参数化 intent 语句。最近添加的模板的 ID 可能不同,具体取决于您之前添加的模板。模板用于针对问题提供高度准确的答案。如需创建使用语义搜索的模板,请运行以下示例语句:
SELECT alloydb_ai_nl.add_template( nl_config_id => 'nla_demo_cfg', intent => 'List 3 products most similar to a Swimwear.', sql => $$SELECT name FROM nla_demo.products ORDER BY description_embedding <=> embedding('text-embedding-004', 'Swimwear')::vector$$, sql_explanation => $$To answer this question, ORDER products in `nla_demo.products` , based by their distance of the descrption_embedding of the product with the embedding of 'Swimwear'.$$, check_intent => TRUE );
上述模板会向视图
alloydb_ai_nl.template_store_view
添加以下行:nl | List 3 products most similar to a Swimwear. sql | SELECT name FROM nla_demo.products | ORDER BY description_embedding <=> | embedding('text-embedding-004', 'Swimwear')::vector intent | List 3 products most similar to a Swimwear. psql | SELECT name FROM nla_demo.products | ORDER BY description_embedding <=> | embedding('text-embedding-004', $1)::vector pintent | List 3 products most similar to a $1.
定义查询 fragment
您可以定义片段,以提高 AlloyDB AI Natural Language API 生成的答案的质量。
如需为业务关键谓词提供 fragment,并提供预期高度准确的预期条件,请运行以下查询:
SELECT alloydb_ai_nl.add_fragment(
nl_config_id => 'nla_demo_cfg',
table_aliases => ARRAY['nla_demo.products AS T'],
intent => 'luxury product',
fragment => $$description LIKE '%luxury%' OR description LIKE '%premium%' OR description LIKE '%exclusive%' OR description LIKE '%high-end%' OR description LIKE '%finest%' OR description LIKE '%elite%' OR description LIKE '%deluxe%'$$);
如需查看已添加的 fragment 的列表,请查询视图 alloydb_ai_nl.fragment_store_view
:
SELECT intent, fragment, pintent
FROM alloydb_ai_nl.fragment_store_view;
系统会返回以下输出:
intent | luxury product
fragment | description LIKE '%luxury%' OR description LIKE '%premium%' OR description LIKE '%exclusive%' OR description LIKE '%high-end%' OR description LIKE '%finest%' OR description LIKE '%elite%' OR description LIKE '%deluxe%'
pintent | luxury product
通过自然语言问题生成 SQL 结果
如需使用 AlloyDB AI Natural Language API 生成 SQL 查询和结果集,请运行以下查询:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'Find the customers who purchased Tote Bag.' ) ->> 'sql';
系统会返回以下输出:
SELECT DISTINCT "c"."first_name", "c"."last_name" FROM "nla_demo"."customers" AS "c" JOIN "nla_demo"."orders" AS "o" ON "c"."customer_id" = "o"."customer_id" JOIN "nla_demo"."order_items" AS "oi" ON "o"."order_id" = "oi"."order_id" JOIN "nla_demo"."products" AS "p" ON "oi"."product_id" = "p"."product_id" WHERE "p"."name" = 'Tote Bag';
JSON 输出是使用您在定义查询模板步骤添加的模板的 SQL 查询。
如需使用 AlloyDB AI Natural Language API 生成 SQL 查询,请运行以下查询:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List the maximum price of any CymbalShoe.' ) ->> 'sql';
系统会返回以下输出:
SELECT max("price") FROM "nla_demo"."products" WHERE "name" = 'CymbalShoe'
AlloyDB AI Natural Language API 通过使用值索引来识别出
CymbalShoe
是产品的名称。运行以下查询,将CymbalShoe
替换为品牌名称 (CymbalPrime
):SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List the maximum price of any CymbalPrime.' ) ->> 'sql';
系统会生成以下输出:
SELECT max("price") FROM "nla_demo"."products" AS t1 JOIN "nla_demo"."brands" AS t2 ON t1."brand_id" = t2."brand_id" WHERE t2."brand_name" = 'CymbalPrime';
AlloyDB AI 会使用构建值索引步骤创建的值索引将
CymbalPrime
解析为brand_name
概念类型,并会使用与brand_name
关联的nla_demo.brands.brand_name
列。如需使用 AlloyDB AI Natural Language API 生成问题的结果,请运行以下查询:
SELECT alloydb_ai_nl.execute_nl_query( 'nla_demo_cfg', 'Find the last name of the customers who live in Lisbon.' );
系统会返回以下输出:
execute_nl_query -------------------------- {"last_name":"M."}
如需使用 AlloyDB AI Natural Language API 生成使用语义搜索的 SQL 语句,请运行以下查询:
SELECT alloydb_ai_nl.get_sql( 'nla_demo_cfg', 'List 2 products similar to a Tote Bag.');
系统会返回以下 SQL 语句:
SELECT name FROM nla_demo.products ORDER BY description_embedding <=> embedding( 'text-embedding-004', 'Tote Bag')::vector LIMIT 2;
获取 SQL 摘要
您可以根据数据库中存储的数据,通过自然语言问题生成结果摘要。这有助于最终用户直接提出自然语言问题来了解数据。
如需获取 SQL 摘要,请运行以下示例查询:
SELECT
alloydb_ai_nl.get_sql_summary(
nl_config_id => 'nla_demo_cfg',
nl_question => 'which brands have the largest number of products.'
);
此查询会返回一个 JSON 对象作为输出,类似于以下内容:
"answer": "The result set lists three brands: CymbalSports, CymbalPro, and CymbalPrime. Each brand is represented once, suggesting an equal distribution of products across these three brands within the dataset."