This page shows you how to evaluate your generative AI models and applications across a range of use cases using the GenAI Client in the Vertex AI SDK.
Before you begin
Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Verify that billing is enabled for your Google Cloud project.
Install the Vertex AI SDK for Python:

    !pip install google-cloud-aiplatform[evaluation]

Set up your credentials. If you are running this tutorial in Colaboratory, run the following:

    from google.colab import auth
    auth.authenticate_user()

For other environments, see Authenticate to Vertex AI.

Prepare your dataset as a Pandas DataFrame:

    import pandas as pd

    eval_df = pd.DataFrame({
        "prompt": [
            "Explain software 'technical debt' using a concise analogy of planting a garden.",
            "Write a Python function to find the nth Fibonacci number using recursion with memoization, but without using any imports.",
            "Write a four-line poem about a lonely robot, where every line must be a question and the word 'and' cannot be used.",
            "A drawer has 10 red socks and 10 blue socks. In complete darkness, what is the minimum number of socks you must pull out to guarantee you have a matching pair?",
            "An AI discovers a cure for a major disease, but the cure is based on private data it analyzed without consent. Should the cure be released? Justify your answer."
        ]
    })

Generate model responses using run_inference(). Here, client is your GenAI Client instance from the Vertex AI SDK:

    eval_dataset = client.evals.run_inference(
        model="gemini-2.5-flash",
        src=eval_df,
    )

Visualize your inference results by calling .show() on the EvaluationDataset object to inspect the model outputs alongside your original prompts and references:

    eval_dataset.show()

Evaluate the model responses using the default adaptive rubric-based GENERAL_QUALITY metric:

    eval_result = client.evals.evaluate(dataset=eval_dataset)

Visualize your evaluation results by calling .show() on the EvaluationResult object to display summary metrics and detailed results:

    eval_result.show()
Generate responses
Generate model responses for your dataset using run_inference(), as in the run_inference() step under Before you begin.
The following image shows the evaluation dataset with the prompts and their corresponding generated responses:

Run the evaluation
Run evaluate() to evaluate the model responses, as in the evaluate() step under Before you begin.
The following image shows the evaluation report, which displays summary metrics and detailed results for each prompt-response pair.
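The two calls in this tutorial can be chained into a single helper. The sketch below is illustrative only: run_quickstart is a hypothetical name, not part of the SDK. It takes the client as a parameter and uses the same call shapes that appear on this page, run_inference(model=..., src=...) and evaluate(dataset=...).

```python
def run_quickstart(client, eval_df, model="gemini-2.5-flash"):
    """Generate responses for eval_df, then evaluate them.

    Hypothetical helper: `client` is expected to expose the
    `evals.run_inference` and `evals.evaluate` calls used on this page.
    """
    # Step 1: generate a model response for every prompt in the dataset.
    eval_dataset = client.evals.run_inference(model=model, src=eval_df)
    # Step 2: score the responses with the default metric.
    eval_result = client.evals.evaluate(dataset=eval_dataset)
    return eval_dataset, eval_result
```

Calling .show() on each returned object then displays the dataset and the evaluation report described above.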

Clean up
No Vertex AI resources are created during this tutorial.