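Perform semantic analysis with managed AI functions
This tutorial shows you how to use BigQuery ML managed AI functions to perform semantic analysis on customer feedback.
Objectives
In this tutorial, you do the following:
- Create a dataset and load sentiment data into a table
- Create a Cloud resource connection
- Use the following AI functions to perform semantic analysis:
  - AI.IF: filter your data with natural language conditions
  - AI.SCORE: rate the input by sentiment
  - AI.CLASSIFY: classify the input into user-defined categories
Costs
This tutorial uses billable components of Google Cloud, including the following:
- BigQuery
- BigQuery ML
For more information about BigQuery costs, see the BigQuery pricing page.
For more information about BigQuery ML costs, see BigQuery ML pricing.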
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- If you use an existing project for this guide, verify that you have the permissions required to complete it. If you created a new project, you already have the required permissions.
- Enable the BigQuery API and the BigQuery Connection API.
  Roles required to enable APIs
  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
  For new projects, the BigQuery API is enabled automatically.
- Optional: Enable billing for the project. If you don't want to enable billing or provide a credit card, the steps in this document still work. BigQuery provides a sandbox for performing these steps. For more information, see Enable the BigQuery sandbox.
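Required roles
To get the permissions that you need to use AI functions, ask your administrator to grant you the following IAM roles on the project:
- Run query jobs and load jobs: BigQuery Job User (roles/bigquery.jobUser)
- Create a connection: BigQuery Connection Admin (roles/bigquery.connectionAdmin)
- Create a dataset, create tables, load data into tables, and query tables: BigQuery Data Editor (roles/bigquery.dataEditor)
For more information about granting roles, see Manage access to projects, folders, and organizations.
Create sample data
To create a dataset named my_dataset for this tutorial, run the following query. Replace LOCATION with the location where you want to create the dataset.
CREATE SCHEMA my_dataset OPTIONS (location = 'LOCATION');
Next, create a table named customer_feedback that contains sample customer reviews of a device:
CREATE TABLE my_dataset.customer_feedback AS (
  SELECT
    *
  FROM
    UNNEST( [STRUCT<review_id INT64, review_text STRING>
      (1, "The battery life is incredible, and the screen is gorgeous! Best phone I've ever had. Totally worth the price."),
      (2, "Customer support was a nightmare. It took three weeks for my order to arrive, and when it did, the box was damaged. Very frustrating!"),
      (3, "The product does exactly what it says on the box. No complaints, but not exciting either."),
      (4, "I'm so happy with this purchase! It arrived early and exceeded all my expectations. The quality is top-notch, although the setup was a bit tricky."),
      (5, "The price is a bit too high for what you get. The material feels cheap and I'm worried it won't last. Service was okay."),
      (6, "Absolutely furious! The item arrived broken, and getting a refund is proving impossible. I will never buy from them again."),
      (7, "This new feature for account access is confusing. I can't find where to update my profile. Please fix this bug!"),
      (8, "The shipping was delayed, but the support team was very helpful and kept me informed. The product itself is great, especially for the price.")
    ])
);
Create a connection
Create a Cloud resource connection and get the connection's service account.
Select one of the following options:
Console
- Go to the BigQuery page.
- In the Explorer pane, click Add data. The Add data dialog opens.
- In the Filter By pane, in the Data Source Type section, select Business Applications. Alternatively, in the Search for data sources field, you can enter Vertex AI.
- In the Featured data sources section, click Vertex AI.
- Click the Vertex AI Models: BigQuery Federation solution card.
- In the Connection type list, select Vertex AI remote models, remote functions, BigLake and Spanner (Cloud Resource).
- In the Connection ID field, enter a name for your connection.
- Click Create connection.
- Click Go to connection.
- In the Connection info pane, copy the service account ID for use in a later step.
bq
- In a command-line environment, create a connection:
  bq mk --connection --location=REGION --project_id=PROJECT_ID \
      --connection_type=CLOUD_RESOURCE CONNECTION_ID
  The --project_id parameter overrides the default project.
  Replace the following:
  - REGION: your connection region
  - PROJECT_ID: your Google Cloud project ID
  - CONNECTION_ID: an ID for your connection
  When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.
  Troubleshooting: If you get the following connection error, update the Google Cloud SDK:
  Flags parsing error: flag --connection_type=CLOUD_RESOURCE: value should be one of...
- Retrieve and copy the service account ID for use in a later step:
  bq show --connection PROJECT_ID.REGION.CONNECTION_ID
  The output is similar to the following:
  name                        properties
  1234.REGION.CONNECTION_ID   {"serviceAccountId": "connection-1234-9u56h9@gcp-sa-bigquery-condel.iam.gserviceaccount.com"}
Terraform
Use the google_bigquery_connection resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
The Terraform example for this tutorial creates a Cloud resource connection named my_cloud_resource_connection in the US region.
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud Shell
- Launch Cloud Shell.
- Set the default Google Cloud project where you want to apply your Terraform configurations.
  You only need to run this command once per project, and you can run it in any directory.
  export GOOGLE_CLOUD_PROJECT=PROJECT_ID
  Environment variables are overridden if you set explicit values in the Terraform configuration file.
Prepare the directory
Each Terraform configuration file must have its own directory (also called a root module).
- In Cloud Shell, create a directory and a new file within that directory. The filename must have the .tf extension, for example main.tf. In this tutorial, the file is referred to as main.tf.
  mkdir DIRECTORY && cd DIRECTORY && touch main.tf
- If you are following a tutorial, you can copy the sample code in each section or step.
  Copy the sample code into the newly created main.tf.
  Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
- Review and modify the sample parameters to apply to your environment.
- Save your changes.
- Initialize Terraform. You only need to do this once per directory.
  terraform init
  Optionally, to use the latest Google provider version, include the -upgrade option:
  terraform init -upgrade
Apply the changes
- Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:
  terraform plan
  Make corrections to the configuration as necessary.
- Apply the Terraform configuration by running the following command and entering yes at the prompt:
  terraform apply
  Wait until Terraform displays the "Apply complete!" message.
- Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
Grant permissions to the connection's service account
Grant the Vertex AI User role to the connection's service account. You must grant this role in the same project that you created or selected in the Before you begin section. Granting the role in a different project results in the error bqcx-1234567890-xxxx@gcp-sa-bigquery-condel.iam.gserviceaccount.com does not have the permission to access resource.
To grant the role, follow these steps:
- Go to the IAM & Admin page.
- Click Grant access.
- In the New principals field, enter the service account ID that you copied earlier.
- In the Select a role field, choose Vertex AI, and then select the Vertex AI User role.
- Click Save.
Classify overall sentiment
Extracting the overall sentiment expressed in text helps with use cases such as the following:
- Gauge customer satisfaction from reviews.
- Monitor brand perception on social media.
- Prioritize support tickets based on how upset users are.
The following query shows how to use the AI.CLASSIFY function to classify reviews from the customer_feedback table as positive, negative, or neutral: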
SELECT
  review_id,
  review_text,
  AI.CLASSIFY(
    review_text,
    categories => ['positive', 'negative', 'neutral'],
    connection_id => "CONNECTION_ID") AS sentiment
FROM
  my_dataset.customer_feedback;
The result is similar to the following:
+-----------+------------------------------------------+-----------+
| review_id | review_text                              | sentiment |
+-----------+------------------------------------------+-----------+
| 7         | This new feature for account access is   | negative  |
|           | confusing. I can't find where to update  |           |
|           | my profile. Please fix this bug!         |           |
+-----------+------------------------------------------+-----------+
| 4         | "I'm so happy with this purchase! It     | positive  |
|           | arrived early and exceeded all my        |           |
|           | expectations. The quality is top-notch,  |           |
|           | although the setup was a bit tricky."    |           |
+-----------+------------------------------------------+-----------+
| 2         | "Customer support was a nightmare. It    | negative  |
|           | took three weeks for my order to         |           |
|           | arrive, and when it did, the box was     |           |
|           | damaged. Very frustrating!"              |           |
+-----------+------------------------------------------+-----------+
| 1         | "The battery life is incredible, and     | positive  |
|           | the screen is gorgeous! Best phone I've  |           |
|           | ever had. Totally worth the price."      |           |
+-----------+------------------------------------------+-----------+
| 8         | "The shipping was delayed, but the       | positive  |
|           | support team was very helpful and kept   |           |
|           | me informed. The product itself is       |           |
|           | great, especially for the price."        |           |
+-----------+------------------------------------------+-----------+
| 5         | The price is a bit too high for what     | negative  |
|           | you get. The material feels cheap and    |           |
|           | I'm worried it won't last. Service was   |           |
|           | okay.                                    |           |
+-----------+------------------------------------------+-----------+
| 3         | "The product does exactly what it says   | neutral   |
|           | on the box. No complaints, but not       |           |
|           | exciting either."                        |           |
+-----------+------------------------------------------+-----------+
| 6         | "Absolutely furious! The item arrived    | negative  |
|           | broken, and getting a refund is proving  |           |
|           | impossible. I will never buy from them   |           |
|           | again."                                  |           |
+-----------+------------------------------------------+-----------+
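Analyze aspect-based sentiment
If an overall sentiment such as positive or negative isn't enough for your use case, you can analyze a specific aspect of the meaning of the text. For example, you might want to understand a user's attitude toward product quality, regardless of what they think about the price. You can even ask for a custom value to indicate that a particular aspect doesn't apply.
The following example shows how to use the AI.SCORE function to rate user sentiment from 0 to 10, based on how favorable each review in the customer_feedback table is toward price, customer service, and quality. If a review doesn't mention an aspect, the prompt asks for the custom value -1 so that you can filter it out later.
SELECT
  review_id,
  review_text,
  AI.SCORE(
    ("Score 0.0 to 10 on positive sentiment about PRICE for review: ", review_text,
    " If price is not mentioned, return -1.0"),
    connection_id => "CONNECTION_ID") AS price_score,
  AI.SCORE(
    ("Score 0.0 to 10 on positive sentiment about CUSTOMER SERVICE for review: ", review_text,
    " If customer service is not mentioned, return -1.0"),
    connection_id => "CONNECTION_ID") AS service_score,
  AI.SCORE(
    ("Score 0.0 to 10 on positive sentiment about QUALITY for review: ", review_text,
    " If quality is not mentioned, return -1.0"),
    connection_id => "CONNECTION_ID") AS quality_score
FROM
  my_dataset.customer_feedback
LIMIT 3;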
The result is similar to the following:
+-----------+------------------------------------------+--------------+---------------+---------------+
| review_id | review_text                              | price_score  | service_score | quality_score |
+-----------+------------------------------------------+--------------+---------------+---------------+
| 4         | "I'm so happy with this purchase! It     | -1.0         | -1.0          | 9.5           |
|           | arrived early and exceeded all my        |              |               |               |
|           | expectations. The quality is top-notch,  |              |               |               |
|           | although the setup was a bit tricky."    |              |               |               |
+-----------+------------------------------------------+--------------+---------------+---------------+
| 8         | "The shipping was delayed, but the       | 9.0          | 8.5           | 9.0           |
|           | support team was very helpful and kept   |              |               |               |
|           | me informed. The product itself is       |              |               |               |
|           | great, especially for the price."        |              |               |               |
+-----------+------------------------------------------+--------------+---------------+---------------+
| 6         | "Absolutely furious! The item arrived    | -1.0         | 1.0           | 0.0           |
|           | broken, and getting a refund is proving  |              |               |               |
|           | impossible. I will never buy from them   |              |               |               |
|           | again."                                  |              |               |               |
+-----------+------------------------------------------+--------------+---------------+---------------+
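Because -1 marks an aspect that a review doesn't mention, it has to be excluded before you aggregate the scores. The following is a minimal Python sketch of that filtering step, assuming the query result has already been exported as a list of dictionaries. The helper name average_aspect_scores and the rule that any negative score means "not mentioned" are illustrative assumptions, not part of BigQuery.

```python
from statistics import mean

# Column names produced by the aspect-scoring query.
ASPECTS = ("price_score", "service_score", "quality_score")


def average_aspect_scores(rows, aspects=ASPECTS):
    """Average each aspect over the rows that mention it.

    Assumption: any negative score is the "not mentioned" sentinel (-1.0)
    requested in the prompts, so it is skipped rather than averaged.
    """
    averages = {}
    for aspect in aspects:
        mentioned = [row[aspect] for row in rows if row[aspect] >= 0]
        # None signals that no review mentioned this aspect at all.
        averages[aspect] = mean(mentioned) if mentioned else None
    return averages


# Scores copied from the sample result for reviews 4, 8, and 6.
rows = [
    {"review_id": 4, "price_score": -1.0, "service_score": -1.0, "quality_score": 9.5},
    {"review_id": 8, "price_score": 9.0, "service_score": 8.5, "quality_score": 9.0},
    {"review_id": 6, "price_score": -1.0, "service_score": 1.0, "quality_score": 0.0},
]

print(average_aspect_scores(rows))
```

Averaging the sentinel together with real scores would pull every aspect toward zero, which is why the sketch drops those rows per aspect instead of dropping whole reviews.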
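Detect emotions
In addition to positive or negative sentiment, you can classify text by specific emotions that you choose. This is useful when you want to better understand user reactions or flag feedback with strong emotions for review.
SELECT
  review_id,
  review_text,
  AI.CLASSIFY(
    review_text,
    categories => ['joy', 'anger', 'sadness', 'surprise', 'fear', 'disgust', 'neutral', 'other'],
    connection_id => "CONNECTION_ID"
  ) AS emotion
FROM
  my_dataset.customer_feedback;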
The result is similar to the following:
+-----------+------------------------------------------+---------+
| review_id | review_text                              | emotion |
+-----------+------------------------------------------+---------+
| 2         | "Customer support was a nightmare. It    | anger   |
|           | took three weeks for my order to         |         |
|           | arrive, and when it did, the box was     |         |
|           | damaged. Very frustrating!"              |         |
+-----------+------------------------------------------+---------+
| 7         | This new feature for account access is   | anger   |
|           | confusing. I can't find where to update  |         |
|           | my profile. Please fix this bug!         |         |
+-----------+------------------------------------------+---------+
| 4         | "I'm so happy with this purchase! It     | joy     |
|           | arrived early and exceeded all my        |         |
|           | expectations. The quality is top-notch,  |         |
|           | although the setup was a bit tricky."    |         |
+-----------+------------------------------------------+---------+
| 1         | "The battery life is incredible, and     | joy     |
|           | the screen is gorgeous! Best phone I've  |         |
|           | ever had. Totally worth the price."      |         |
+-----------+------------------------------------------+---------+
| 8         | "The shipping was delayed, but the       | joy     |
|           | support team was very helpful and kept   |         |
|           | me informed. The product itself is       |         |
|           | great, especially for the price."        |         |
+-----------+------------------------------------------+---------+
| 5         | The price is a bit too high for what     | sadness |
|           | you get. The material feels cheap and    |         |
|           | I'm worried it won't last. Service was   |         |
|           | okay.                                    |         |
+-----------+------------------------------------------+---------+
| 3         | "The product does exactly what it says   | neutral |
|           | on the box. No complaints, but not       |         |
|           | exciting either."                        |         |
+-----------+------------------------------------------+---------+
| 6         | "Absolutely furious! The item arrived    | anger   |
|           | broken, and getting a refund is proving  |         |
|           | impossible. I will never buy from them   |         |
|           | again."                                  |         |
+-----------+------------------------------------------+---------+
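One way to act on these labels is to pull out the reviews whose emotion warrants a manual look. The following Python sketch assumes the result rows are available as dictionaries. The ESCALATE set and the reviews_to_escalate helper are hypothetical choices made for illustration; which emotions count as "strong" depends on your own review process.

```python
# Hypothetical set of emotions that should trigger a manual review.
ESCALATE = frozenset({"anger", "fear", "disgust"})


def reviews_to_escalate(rows, emotions=ESCALATE):
    """Return the IDs of reviews labeled with one of the given emotions."""
    return [row["review_id"] for row in rows if row["emotion"] in emotions]


# Labels copied from the sample result, in the same row order.
rows = [
    {"review_id": 2, "emotion": "anger"},
    {"review_id": 7, "emotion": "anger"},
    {"review_id": 4, "emotion": "joy"},
    {"review_id": 1, "emotion": "joy"},
    {"review_id": 8, "emotion": "joy"},
    {"review_id": 5, "emotion": "sadness"},
    {"review_id": 3, "emotion": "neutral"},
    {"review_id": 6, "emotion": "anger"},
]

print(reviews_to_escalate(rows))  # prints [2, 7, 6]
```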
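Classify reviews by topic
You can use the AI.CLASSIFY function to group reviews into predefined topics. For example, you can do the following:
- Discover common themes in customer feedback.
- Organize documents by topic.
- Route support tickets by topic.
The following example shows how to classify customer feedback into types such as billing issue or account access, and then count the number of reviews in each category:
SELECT
  AI.CLASSIFY(
    review_text,
    categories => ['Billing Issue', 'Account Access',
                   'Product Bug', 'Feature Request',
                   'Shipping Delay', 'Other'],
    connection_id => "CONNECTION_ID") AS topic,
  COUNT(*) AS number_of_reviews,
FROM
  my_dataset.customer_feedback
GROUP BY topic
ORDER BY number_of_reviews DESC;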
The result is similar to the following:
+----------------+-------------------+
| topic          | number_of_reviews |
+----------------+-------------------+
| Other          | 5                 |
| Shipping Delay | 2                 |
| Product Bug    | 1                 |
+----------------+-------------------+
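In this sample result, most reviews land in the catch-all Other category, which can be a hint that the category list doesn't match the data well. The following Python sketch turns the counts into shares so that this is easy to spot. The category_shares helper and the 50% threshold are assumptions made for illustration, not a recommendation from BigQuery.

```python
def category_shares(counts):
    """Convert per-topic counts into fractions of all classified reviews."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {topic: n / total for topic, n in counts.items()}


# Counts copied from the sample result.
counts = {"Other": 5, "Shipping Delay": 2, "Product Bug": 1}
shares = category_shares(counts)

for topic, share in shares.items():
    print(f"{topic}: {share:.1%}")

# Hypothetical rule of thumb: revisit the categories when the fallback dominates.
if shares.get("Other", 0) > 0.5:
    print("Most reviews fell into 'Other'; consider refining the category list.")
```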
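Identify semantically similar reviews
You can use the AI.SCORE function to assess how semantically similar two pieces of text are by asking it to rate their similarity in meaning. This can help you with tasks such as the following:
- Find duplicate or near-duplicate entries.
- Group similar feedback together.
- Power semantic search applications.
The following query finds reviews that discuss difficulty setting up the product:
SELECT
  review_id,
  review_text,
  AI.SCORE(
    (
      """How similar is the review to the concept of 'difficulty in setting up the product'?
      A higher score indicates more similarity. Review: """,
      review_text),
    connection_id => "CONNECTION_ID") AS setup_difficulty
FROM my_dataset.customer_feedback
ORDER BY setup_difficulty DESC
LIMIT 2;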
The result is similar to the following:
+-----------+------------------------------------------+------------------+
| review_id | review_text                              | setup_difficulty |
+-----------+------------------------------------------+------------------+
| 4         | "I'm so happy with this purchase! It     | 3                |
|           | arrived early and exceeded all my        |                  |
|           | expectations. The quality is top-notch,  |                  |
|           | although the setup was a bit tricky."    |                  |
+-----------+------------------------------------------+------------------+
| 7         | This new feature for account access is   | 1                |
|           | confusing. I can't find where to update  |                  |
|           | my profile. Please fix this bug!         |                  |
+-----------+------------------------------------------+------------------+
You can also use the AI.IF function to find reviews that relate to a piece of text:
SELECT
  review_id,
  review_text
FROM my_dataset.customer_feedback
WHERE
  AI.IF(
    (
      "Does this review discuss difficulty setting up the product? Review: ",
      review_text),
    connection_id => "CONNECTION_ID");
Combine functions
It can be helpful to combine these functions in a single query. For example, the following query first filters for reviews with negative sentiment, and then classifies them by type of dissatisfaction:
SELECT
  review_id,
  review_text,
  AI.CLASSIFY(
    review_text,
    categories => [
      'Poor Quality', 'Bad Customer Service', 'High Price', 'Other Negative'],
    connection_id => "CONNECTION_ID") AS negative_topic
FROM my_dataset.customer_feedback
WHERE
  AI.IF(
    ("Does this review express a negative sentiment? Review: ", review_text),
    connection_id => "CONNECTION_ID");
Create reusable prompt UDFs
To keep your queries readable, you can reuse prompt logic by creating user-defined functions. The following query creates a function that detects negative sentiment by calling AI.IF with a custom prompt. It then calls that function to filter for negative reviews.
CREATE OR REPLACE FUNCTION my_dataset.is_negative_sentiment(review_text STRING)
RETURNS BOOL
AS (
  AI.IF(
    ("Does this review express a negative sentiment? Review: ", review_text),
    connection_id => "CONNECTION_ID")
);
SELECT
  review_id,
  review_text
FROM my_dataset.customer_feedback
WHERE my_dataset.is_negative_sentiment(review_text);
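If you maintain several prompt UDFs of this shape, you might generate the statements from a small table of prompts instead of writing each one by hand. The following Python sketch only builds the SQL text, following the same AI.IF pattern as the function above. The helpers sql_string and build_prompt_udf, and the second function name mentions_setup_difficulty, are hypothetical. The sketch escapes the prompt as a string literal but doesn't validate the dataset or function identifiers, so pass only names that you trust.

```python
def sql_string(text: str) -> str:
    """Return text as a double-quoted GoogleSQL string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def build_prompt_udf(dataset: str, name: str, question: str, connection_id: str) -> str:
    """Build a CREATE FUNCTION statement that wraps AI.IF with a fixed prompt.

    Assumption: dataset and name are trusted identifiers; only the prompt
    and the connection ID are escaped as string literals.
    """
    prompt = sql_string(f"{question} Review: ")
    return (
        f"CREATE OR REPLACE FUNCTION {dataset}.{name}(review_text STRING)\n"
        "RETURNS BOOL\n"
        "AS (\n"
        "  AI.IF(\n"
        f"    ({prompt}, review_text),\n"
        f"    connection_id => {sql_string(connection_id)})\n"
        ");"
    )


# Hypothetical prompt table: UDF name -> yes/no question about the review.
PROMPTS = {
    "is_negative_sentiment": "Does this review express a negative sentiment?",
    "mentions_setup_difficulty": "Does this review discuss difficulty setting up the product?",
}

for udf_name, question in PROMPTS.items():
    print(build_prompt_udf("my_dataset", udf_name, question, "CONNECTION_ID"))
    print()
```

Keeping the prompts in one place makes it easier to review wording changes, at the cost of one more step between the prompt and the query that uses it.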
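Clean up
To avoid incurring charges, you can either delete the project that contains the resources that you created, or keep the project and delete the individual resources.
Delete the project
To delete the project, do the following:
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Delete the dataset
To delete the dataset and all resources that it contains, including all tables and functions, run the following query:
DROP SCHEMA my_dataset CASCADE;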