This document describes how to view content security insights from Model Armor for supported AI agents.
Model Armor screens the requests and responses for security risks, such as indirect prompt injection attacks, sensitive data leakage, and the generation or serving of harmful content. For more information, see Model Armor.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- Enable the Model Armor API.
  Roles required to enable APIs
  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- Configure Model Armor on one or more gateways in your project.
- To monitor agents that communicate with a Google Cloud MCP server, configure Model Armor with MCP servers.
- Set up tracing for your agent.
Required role
To get the permissions that you need to monitor content security violations, ask your administrator to grant you the Monitoring Viewer (roles/monitoring.viewer) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to monitor content security violations. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to monitor content security violations:
- monitoring.monitoredResourceDescriptors.list
- monitoring.metricDescriptors.list
You might also be able to get these permissions with custom roles or other predefined roles.
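If you rely on a custom role instead of Monitoring Viewer, it can help to confirm that the role covers both required permissions. The following is a minimal sketch of that comparison in Python. The function name `missing_permissions` and the idea of passing in a list of granted permissions are assumptions made for illustration; the sketch does not call any Google Cloud API, so you would need to supply the list of granted permissions yourself.

```python
# Permissions listed in the "Required permissions" section of this document.
REQUIRED_PERMISSIONS = frozenset({
    "monitoring.monitoredResourceDescriptors.list",
    "monitoring.metricDescriptors.list",
})


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted.

    `granted` is any iterable of permission strings (hypothetical input,
    for example the permissions copied from a custom role definition).
    """
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Example custom role that includes only one of the two permissions.
    custom_role = ["monitoring.metricDescriptors.list"]
    print(missing_permissions(custom_role))
```

An empty result means the supplied permissions cover everything this page lists as required.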
Supported agents
The Security tab is populated with Model Armor insights for the following agents only:
- Agents deployed in Agent Runtime and governed by Agent Gateway.
- Agents deployed in Agent Runtime and communicating with a Google Cloud MCP server.
View the Model Armor insights for an AI agent
To view the content security insights for supported agents, follow these steps:
- In the Google Cloud console, go to Agent Registry.
- Select your project.
- Click the name of the agent.
- Click the Security tab.
View the number of flagged or blocked interactions
On the Security tab, view the number of interactions, including flagged and blocked interactions. The Security tab displays the following metrics:
- Total interactions: The total number of prompts and responses that are analyzed by Model Armor.
- Interactions flagged: The number of interactions that violated a configured policy in Model Armor templates.
- Interactions blocked: The number of interactions blocked if you configured Model Armor in the INSPECT_AND_BLOCK mode.
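The three counts are easier to compare across agents when expressed as rates. The following is a minimal sketch, assuming you copy the values from the Security tab by hand. The `InteractionCounts` class and its field names are hypothetical and are not part of any Model Armor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionCounts:
    """Counts read from the Security tab (hypothetical container)."""

    total: int    # Total interactions
    flagged: int  # Interactions flagged
    blocked: int  # Interactions blocked

    def _rate(self, count: int) -> float:
        # Avoid division by zero when no interactions were analyzed.
        return count / self.total if self.total else 0.0

    @property
    def flagged_rate(self) -> float:
        """Fraction of analyzed interactions that were flagged."""
        return self._rate(self.flagged)

    @property
    def blocked_rate(self) -> float:
        """Fraction of analyzed interactions that were blocked."""
        return self._rate(self.blocked)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    counts = InteractionCounts(total=200, flagged=10, blocked=4)
    print(f"flagged: {counts.flagged_rate:.1%}, blocked: {counts.blocked_rate:.1%}")
```

Both rates are computed against total interactions, so the sketch makes no assumption about how the flagged and blocked counts relate to each other.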
Monitor content security violations
In the Violations over time chart, monitor the number of detected violations over time.
The violations detected are categorized into the following areas:
- All detectors: The total number of violations detected by all detectors, including prompt injections and jailbreaks, malicious URLs, responsible AI, and sensitive data.
- Responsible AI: Content violations detected by safety filters, such as harassment and hate speech. For a complete list of responsible AI categories, see Responsible AI safety filter.
- Sensitive data: Content violations involving the presence of sensitive information types or custom information types that you define. For more information, see Sensitive Data Protection.
For more information about these detectors, see Model Armor filters.
Download violations data to a PNG or CSV file
To download violations data to a PNG or CSV file, follow these steps:
- In the Violations over time view on the Security tab, select the period for which you want to download data.
- Click More chart options > Download.
- Click Download PNG or Download CSV to download the data in your preferred format.
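After you download the CSV file, you can summarize it with a short script. The following is a minimal sketch that sums each numeric column of the export. The column layout is an assumption: this document does not describe the CSV schema, so the `time` column name and the sample rows below are invented for illustration. Check the header row of your own file before relying on it.

```python
import csv
import io


def sum_series(csv_text, time_column="time"):
    """Sum every numeric column in a chart CSV export.

    `time_column` is the assumed name of the timestamp column, which is
    skipped. Empty or non-numeric cells are ignored.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, value in row.items():
            # Skip the timestamp column, overflow fields, and blanks.
            if name is None or name == time_column or value in (None, ""):
                continue
            try:
                totals[name] = totals.get(name, 0.0) + float(value)
            except (TypeError, ValueError):
                continue  # Cell is not a number; leave it out of the sum.
    return totals


if __name__ == "__main__":
    # Invented sample data; a real export may use different column names.
    sample = (
        "time,Responsible AI,Sensitive data\n"
        "2025-01-01T00:00:00Z,3,1\n"
        "2025-01-01T01:00:00Z,2,\n"
    )
    print(sum_series(sample))
```

To run the sketch against a downloaded file, read the file into a string first and pass that string to `sum_series`.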