Agent Platform Computer Use sandboxes provide a secure, isolated browser environment that your agents can interact with. These sandboxes allow agents to automate tasks that mimic human interactions (such as clicking, navigating sites, and taking screenshots).
How it works
When you create a Computer Use sandbox, Gemini Enterprise Agent Platform provisions a containerized environment that runs a web browser agent. You can control the browser in two ways:
- API requests: Send commands to the sandbox to perform actions like navigating to a URL, clicking on elements, or typing text.
- Browser control: Connect to the browser by using a standard Chrome DevTools Protocol (CDP) connection, letting you use browser automation tools (such as Playwright) to automate the browser.
Considerations
During Preview, Agent Platform Computer Use Sandbox latency is optimized for low traffic volumes. Higher traffic volumes might temporarily encounter elevated latency.
Control the browser using API
You can send API requests to the sandbox to perform common browser actions. The sandbox handles the execution of these actions within its isolated environment.
Supported actions include:
- Navigating to a URL.
- Clicking at specific coordinates.
- Typing text into fields.
- Taking screenshots.
For an example of how to send commands, see the Computer Use quickstart.
Control the browser using a CDP connection
For more advanced automation, you can connect to the sandbox browser over a Chrome DevTools Protocol (CDP) connection. This method lets you use standard browser automation tools, such as Playwright, to interact with the webpage.
To connect Playwright to the sandbox:
- Generate the WebSocket URL and required headers for your sandbox by using
the Python SDK
generate_browser_ws_headersmethod.
service_account_email = "SERVICE_ACCOUNT_EMAIL"
ws_url, ws_headers = client.agent_engines.sandboxes.generate_browser_ws_headers(
sandbox_environment=sandbox,
service_account_email=service_account_email,
)
- Use Playwright's
connect_over_cdpmethod to establish a connection.
Use the generated WebSocket URL and headers to connect over CDP using Playwright:
import asyncio
from playwright.async_api import async_playwright
import nest_asyncio
nest_asyncio.apply()
async def connect_over_cdp(ws_url, ws_headers):
async with async_playwright() as p:
try:
browser = await p.chromium.connect_over_cdp(
endpoint_url=ws_url,
headers=ws_headers
)
print("Successfully connected to browser over CDP.")
# You can now interact with the browser
page = browser.contexts[0].pages[0]
await page.goto("https://www.example.com")
print(f"Page title: {await page.title()}")
await browser.close()
print("Browser connection closed.")
except Exception as e:
print(f"An error occurred: {e}")
# Run CDP connection
asyncio.run(connect_over_cdp(ws_url, ws_headers))
Live streaming view
Computer Use sandboxes support a live streaming view (VNC), letting you to visually monitor the agent's actions in real-time. You can debug and observe the agent's behavior.
What's next
- Computer Use quickstart
- Explore Snapshots for sandbox lifecycle management.