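Learn how to get started with the Gen AI Evaluation Service by using the Google Cloud console.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.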
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Make sure that you have the following role or roles on the project: Storage Admin
  Check for the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
  - For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
  Grant the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Click Grant access.
  - In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
  - Click Select a role, then search for the role.
  - To grant additional roles, click Add another role and add each additional role.
  - Click Save.
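The role check described in the prerequisites can also be approximated in code. The following Python sketch is only an illustration: it assumes you have already exported the project's IAM policy as a dictionary in the standard `bindings` form (a list of role and members entries). The function name `has_role`, the sample policy, and the member strings are hypothetical. Group membership is not resolved, so a role granted through a group matches only if you pass the group's own member string.

```python
STORAGE_ADMIN = "roles/storage.admin"  # role ID for the Storage Admin role


def has_role(policy: dict, member: str, role: str = STORAGE_ADMIN) -> bool:
    """Return True if `member` is listed in a binding for `role`."""
    for binding in policy.get("bindings", []):
        if binding.get("role") == role and member in binding.get("members", []):
            return True
    return False


# Hypothetical policy, for illustration only.
sample_policy = {
    "bindings": [
        {"role": "roles/storage.admin", "members": ["user:alice@example.com"]},
        {"role": "roles/viewer", "members": ["user:bob@example.com"]},
    ]
}

print(has_role(sample_policy, "user:alice@example.com"))
print(has_role(sample_policy, "user:bob@example.com"))
```

A direct membership test like this mirrors the manual scan of the Principal and Role columns. It does not replace the console check when roles are inherited from a folder or organization.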
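Evaluate a model
To evaluate a model, do the following:
- In the Google Cloud console, go to the Gen AI Evaluation page.
- Click New evaluation to open the evaluation page.
- Select a source to load a dataset for evaluation:
  - To upload a local CSV or JSONL file, select Upload file. The dataset must contain prompts, or records to use in a prompt template, and can optionally include model responses. The dataset can contain at most 200 rows.
  - To generate prompts from a prompt template, select Generate data. When you create the dataset, the Gen AI Evaluation Service generates and fills in the variables defined in the prompt template. For more information about writing prompt templates, see Use prompt templates.
    - In the Prompt template field, enter a prompt template that contains variables.
    - To add a description for each variable or to specify the number of samples to generate, expand Define variables and sample size.
    - Click Generate dataset to generate the prompts.
- Generate and evaluate responses based on the prompts:
  - In the Evaluation candidates section, click Add evaluation candidate or, if a candidate already exists, click Edit to define the prompts and responses to evaluate. For example, you can specify prompts or responses from the uploaded file or from the generated data.
  - To compare multiple candidates, click Add comparison candidate.
  - In the Metrics section, add at least one metric to score the quality of the candidate responses. For more information about metric types, see the Evaluation metrics section on the Gen AI Evaluation Service overview page.
    - For some adaptive rubrics, you can control the rubrics generated for each prompt by expanding Advanced and providing custom instructions (for example, Evaluate the dataset on cultural sensitivity).
  - In the Name and storage configuration section, specify a name for the evaluation and a Cloud Storage bucket in which to store the evaluation results.
- Click Evaluate.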
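If you prepare the upload file yourself, it can help to fill a prompt template and check the row limit before you open the console. The following Python sketch is a minimal illustration under stated assumptions: the `{variable}` placeholder syntax, the `prompt` field name, the helper names `render_prompts` and `write_jsonl`, and the sample template and records are all hypothetical and are not taken from the service's documentation. Only the JSONL format and the 200-row limit come from the steps above.

```python
import json
import tempfile
from pathlib import Path

MAX_ROWS = 200  # row limit stated for uploaded datasets


def render_prompts(template: str, records: list[dict]) -> list[dict]:
    """Fill the template's {variable} placeholders from each record."""
    return [{"prompt": template.format(**record)} for record in records]


def write_jsonl(rows: list[dict], path: Path) -> None:
    """Write one JSON object per line, enforcing the row limit."""
    if not rows:
        raise ValueError("dataset is empty")
    if len(rows) > MAX_ROWS:
        raise ValueError(f"dataset has {len(rows)} rows; the limit is {MAX_ROWS}")
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


# Hypothetical template and records, for illustration only.
template = "Summarize the following text in {language}: {text}"
records = [
    {"language": "English", "text": "The quick brown fox."},
    {"language": "French", "text": "A short article."},
]

rows = render_prompts(template, records)
out_path = Path(tempfile.gettempdir()) / "eval_dataset.jsonl"
write_jsonl(rows, out_path)
print(out_path.read_text(encoding="utf-8"))
```

Checking the limit before writing means an oversized dataset fails locally rather than at upload time. To include model responses, you could add a second field to each row, with the field name again being an assumption.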
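View evaluation results
To view the evaluation results, do the following:
- In the Google Cloud console, go to the Gen AI Evaluation page.
- Click the name of the evaluation.
For each prompt in the evaluation dataset, the page shows the responses together with their evaluation results.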
Evaluate partner models
You can use the Gen AI Evaluation Service to evaluate the following partner models:
- Anthropic
- Llama
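Partner models are supported through Vertex AI Model Garden. You must enable a partner model in Model Garden before you can select it for evaluation. To evaluate a partner model, select it from the model selection menu during evaluation setup.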
Pricing
The price of evaluating third-party models depends on any charges incurred for model inference in Vertex AI Model Garden. See the pricing page for Generative AI on Vertex AI.