Evaluate generative AI agents using the GenAI client in the Agent Platform SDK

After you build and evaluate your generative AI model, you might use it to build an agent, such as a chatbot. Gen AI evaluations let you measure how well your agent completes tasks and achieves goals for your use case.

This page shows you how to create and deploy a basic agent, and then use Gen AI evaluations to evaluate it:

Develop an agent: Define an agent with basic tool functions.
Deploy an agent: Deploy the agent to the Agent Platform runtime.
Run agent inference: Define an evaluation dataset and run agent inference to generate responses.
Create an evaluation run: Create an evaluation run to perform the evaluation.
View evaluation results: View the evaluation results through the evaluation run.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Install the Agent Platform SDK for Python:

%pip install google-cloud-aiplatform[adk,agent_engines]
%pip install --upgrade --force-reinstall -q google-cloud-aiplatform[evaluation]

Set up your credentials. If you are running this tutorial in Colaboratory, run the following:
```
from google.colab import auth
auth.authenticate_user()
```
For other environments, see Authenticate to Agent Platform.

Initialize the GenAI client in the Agent Platform SDK:

import vertexai
from vertexai import Client
from google.genai import types as genai_types

GCS_DEST = "gs://BUCKET_NAME/output-path"
vertexai.init(
    project=PROJECT_ID,
    location=LOCATION,
)

client = Client(
    project=PROJECT_ID,
    location=LOCATION,
    http_options=genai_types.HttpOptions(api_version="v1beta1"),
)

Replace the following:

BUCKET_NAME: The name of your Cloud Storage bucket. To learn more about creating buckets, see Create a bucket.
PROJECT_ID: Your project ID.
LOCATION: Your selected region.
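
One way to avoid hardcoding these placeholders is to read them from environment variables. The following is a minimal sketch and not part of the SDK: the `load_settings` helper, the `EVAL_*` variable names, and the `us-central1` default are all assumptions chosen for illustration.

```python
import os


def load_settings(env=None):
    """Build the tutorial's settings from environment variables.

    The EVAL_* names are hypothetical; rename them to fit your setup.
    """
    env = os.environ if env is None else env
    project_id = env.get("EVAL_PROJECT_ID", "")
    bucket_name = env.get("EVAL_BUCKET_NAME", "")
    # Assumed default region; replace it with the region you selected.
    location = env.get("EVAL_LOCATION", "us-central1")
    if not project_id or not bucket_name:
        raise ValueError("Set EVAL_PROJECT_ID and EVAL_BUCKET_NAME first.")
    return {
        "project_id": project_id,
        "location": location,
        # Same shape as GCS_DEST in the initialization code.
        "gcs_dest": f"gs://{bucket_name}/output-path",
    }


# Example with an explicit mapping instead of the real environment.
settings = load_settings(
    {"EVAL_PROJECT_ID": "my-project", "EVAL_BUCKET_NAME": "my-bucket"}
)
print(settings["gcs_dest"])  # gs://my-bucket/output-path
```

The returned values can then stand in for PROJECT_ID, LOCATION, and GCS_DEST when you initialize the client.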

Develop an agent

Develop an Agent Development Kit (ADK) agent by defining the model, instruction, and set of tools. To learn more about developing an agent, see Develop an Agent Development Kit agent.

from google.adk import Agent

# Define Agent Tools
def search_products(query: str):
    """Searches for products based on a query."""
    # Mock response for demonstration
    if "headphones" in query.lower():
        return {"products": [{"name": "Wireless Headphones", "id": "B08H8H8H8H"}]}
    else:
        return {"products": []}

def get_product_details(product_id: str):
    """Gets the details for a given product ID."""
    if product_id == "B08H8H8H8H":
        return {"details": "Noise-cancelling, 20-hour battery life."}
    else:
        return {"error": "Product not found."}

def add_to_cart(product_id: str, quantity: int):
    """Adds a specified quantity of a product to the cart."""
    return {"status": f"Added {quantity} of {product_id} to cart."}

# Define Agent
my_agent = Agent(
    model="gemini-2.5-flash",
    name='ecommerce_agent',
    instruction='You are an ecommerce expert',
    tools=[search_products, get_product_details, add_to_cart],
)
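
Because the tools are plain Python functions, they can be exercised directly before any deployment. The sketch below repeats the mock tool definitions so that it runs on its own, without the ADK, and calls them in the order an agent might for a "find, inspect, add to cart" request. The call sequence is an illustrative assumption, not a record of what the agent actually does.

```python
def search_products(query: str):
    """Searches for products based on a query (mock, as defined above)."""
    if "headphones" in query.lower():
        return {"products": [{"name": "Wireless Headphones", "id": "B08H8H8H8H"}]}
    return {"products": []}


def get_product_details(product_id: str):
    """Gets the details for a given product ID (mock, as defined above)."""
    if product_id == "B08H8H8H8H":
        return {"details": "Noise-cancelling, 20-hour battery life."}
    return {"error": "Product not found."}


def add_to_cart(product_id: str, quantity: int):
    """Adds a specified quantity of a product to the cart (mock)."""
    return {"status": f"Added {quantity} of {product_id} to cart."}


# Chain the tools by hand: search, read details, then add to the cart.
found = search_products("noise-cancelling headphones")
product_id = found["products"][0]["id"]
details = get_product_details(product_id)
cart = add_to_cart(product_id, 1)
print(details)  # {'details': 'Noise-cancelling, 20-hour battery life.'}
print(cart)     # {'status': 'Added 1 of B08H8H8H8H to cart.'}

# The mock catalog only matches "headphones"; other searches return nothing.
empty = search_products("wireless earbuds")
print(empty)    # {'products': []}
```

Since the mock catalog contains only headphones, requests about other products return empty search results. This is worth keeping in mind when you read the tool-use scores later.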

Deploy the agent

Deploy your agent to the Agent Platform runtime. This can take up to 10 minutes. Then retrieve the resource name from the deployed agent.

def deploy_adk_agent(root_agent):
  """Deploy agent to agent engine.
  Args:
    root_agent: The ADK agent to deploy.
  """
  app = vertexai.agent_engines.AdkApp(
      agent=root_agent,
  )
  remote_app = client.agent_engines.create(
      agent=app,
      config={
          "staging_bucket": "gs://BUCKET_NAME",
          "requirements": ['google-cloud-aiplatform[adk,agent_engines]'],
          "env_vars": {"GOOGLE_CLOUD_AGENT_ENGINE_ENABLE_TELEMETRY": "true"}
      }
  )
  return remote_app

agent_engine = deploy_adk_agent(my_agent)
agent_engine_resource_name = agent_engine.api_resource.name
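
Google Cloud resource names are generally slash-separated paths of collection and ID pairs. As a sketch, the hypothetical helper below splits such a name into its parts, which can help when you log the deployment or check that the agent landed in the expected project and region. The example value, including the `reasoningEngines` collection segment, is an assumption for illustration; compare it with the name that your own deployment returns.

```python
def parse_resource_name(name: str) -> dict:
    """Split 'collection/id/collection/id/...' into a {collection: id} dict."""
    parts = name.strip("/").split("/")
    # A well-formed name has an even number of non-empty segments.
    if len(parts) < 2 or len(parts) % 2 != 0 or not all(parts):
        raise ValueError(f"Unexpected resource name: {name!r}")
    return dict(zip(parts[0::2], parts[1::2]))


# Hypothetical resource name; yours will differ.
example_name = "projects/my-project/locations/us-central1/reasoningEngines/1234567890"
parsed = parse_resource_name(example_name)
print(parsed["projects"], parsed["locations"])  # my-project us-central1
```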

To view the list of agents deployed to Agent Platform, see Manage deployed agents.

Generate responses

Generate model responses for your dataset using run_inference():

Prepare your dataset as a pandas DataFrame. The prompts should be specific to your agent. Session inputs are required for traces. For more information, see Session: Tracking individual conversations.

import pandas as pd
from vertexai import types

session_inputs = types.evals.SessionInput(
    user_id="user_123",
    state={},
)
agent_prompts = [
    "Search for 'noise-cancelling headphones'.",
    "Show me the details for product 'B08H8H8H8H'.",
    "Add one pair of 'B08H8H8H8H' to my shopping cart.",
    "Find 'wireless earbuds' and then add the first result to my cart.",
    "I need a new laptop for work, can you find one with at least 16GB of RAM?",
]
agent_dataset = pd.DataFrame({
    "prompt": agent_prompts,
    "session_inputs": [session_inputs] * len(agent_prompts),
})
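
Before you send the dataset to the agent, it can help to sanity-check the prompt list. The following is a minimal sketch that uses only the standard library. `check_prompts` is a hypothetical helper, not part of the SDK, and the rules it applies (no empty prompts, no case-insensitive duplicates) are assumptions about what makes a reasonable evaluation set.

```python
def check_prompts(prompts):
    """Return a list of human-readable problems found in the prompt list."""
    problems = []
    seen = set()
    for index, prompt in enumerate(prompts):
        text = prompt.strip()
        if not text:
            problems.append(f"prompt {index} is empty")
        elif text.lower() in seen:
            problems.append(f"prompt {index} duplicates an earlier prompt")
        seen.add(text.lower())
    return problems


agent_prompts = [
    "Search for 'noise-cancelling headphones'.",
    "Show me the details for product 'B08H8H8H8H'.",
    "Add one pair of 'B08H8H8H8H' to my shopping cart.",
    "Find 'wireless earbuds' and then add the first result to my cart.",
    "I need a new laptop for work, can you find one with at least 16GB of RAM?",
]

problems = check_prompts(agent_prompts)
print(problems or "no problems found")  # no problems found
```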

Generate the agent's responses by calling run_inference() with the deployed agent's resource name:

agent_dataset_with_inference = client.evals.run_inference(
    agent=agent_engine_resource_name,
    src=agent_dataset,
)

View your inference results by calling .show() on the EvaluationDataset object to inspect the model's outputs alongside your original prompts and references:
```
agent_dataset_with_inference.show()
```

Run the agent evaluation

Run create_evaluation_run() to evaluate the agent responses.

Retrieve agent_info using the built-in helper function:

agent_info = types.evals.AgentInfo.load_from_agent(
    my_agent,
    agent_engine_resource_name
)

Evaluate the model responses using agent-specific adaptive rubric-based metrics (FINAL_RESPONSE_QUALITY, TOOL_USE_QUALITY, and HALLUCINATION), together with the SAFETY metric:

evaluation_run = client.evals.create_evaluation_run(
    dataset=agent_dataset_with_inference,
    agent_info=agent_info,
    metrics=[
        types.RubricMetric.FINAL_RESPONSE_QUALITY,
        types.RubricMetric.TOOL_USE_QUALITY,
        types.RubricMetric.HALLUCINATION,
        types.RubricMetric.SAFETY,
    ],
    dest=GCS_DEST,
)

View the agent evaluation results

You can view the evaluation results using the Agent Platform SDK.

Retrieve the evaluation run and call .show() to display summary metrics and detailed results:

evaluation_run = client.evals.get_evaluation_run(
    name=evaluation_run.name,
    include_evaluation_items=True
)

evaluation_run.show()

The detailed results also include traces that show the agent's interactions. For more information on traces, see Trace an agent.

What's next

Try the following agent evaluation notebooks: