Enable Agent Sandbox on GKE

This document explains how to enable the Agent Sandbox feature in a Google Kubernetes Engine (GKE) cluster. It also explains how to create a sandboxed environment on the cluster to safely execute untrusted code.

Agent Sandbox provides a secure and isolated environment for executing untrusted code, such as code generated by large language models (LLMs). Running this type of code directly in a cluster poses security risks, because untrusted code could potentially access or interfere with other apps or the underlying cluster node itself.

Agent Sandbox mitigates these risks by providing strong process, storage, and network isolation for the code it runs. This isolation is achieved using gVisor, a technology that creates a secure barrier between the application and the cluster node's operating system. Other sandboxing technologies, such as Kata Containers, can be used instead; however, the example in this document uses only gVisor.

This document provides instructions for running Agent Sandbox on either a GKE Autopilot cluster or a GKE Standard cluster.

Costs

Following the steps in this document incurs charges on your Google Cloud account. Costs begin when you create a GKE cluster. These costs include per-cluster charges for GKE, as outlined on the Pricing page, and charges for running Compute Engine VMs.

To avoid unnecessary charges, disable GKE or delete the project after you finish the steps in this document.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

  2. Verify that billing is enabled for your Google Cloud project.

  3. Enable the Artifact Registry and Google Kubernetes Engine APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

  4. In the Google Cloud console, activate Cloud Shell.

  5. Ensure that your cluster is running GKE version 1.35.2-gke.1210000 or later.

Define environment variables

To simplify the commands that you run in this document, set the following environment variables in Cloud Shell:

export PROJECT_ID=$(gcloud config get project)
export CLUSTER_NAME="agent-sandbox-cluster"
export REGION="us-central1"
export CLUSTER_VERSION="1.35.2-gke.1210000"
export NODE_POOL_NAME="agent-sandbox-pool"
export MACHINE_TYPE="e2-standard-2"

Here's an explanation of these environment variables:

  • PROJECT_ID: the ID of your current Google Cloud project. Defining this variable helps ensure that all resources, like your GKE cluster, are created in the correct project.
  • CLUSTER_NAME: the name of your GKE cluster—for example, agent-sandbox-cluster.
  • REGION: the Google Cloud region where your GKE cluster will be created—for example, us-central1.
  • CLUSTER_VERSION: the version of GKE your cluster will run. The Agent Sandbox feature requires version 1.35.2-gke.1210000 or later.
  • NODE_POOL_NAME: the name of the node pool that will run sandboxed workloads—for example, agent-sandbox-pool. This variable is only required if you are creating a GKE Standard cluster.
  • MACHINE_TYPE: the machine type of the nodes in your node pool—for example, e2-standard-2. For details about different machine series and choosing between different options, see the Machine families resource and comparison guide. This variable is only required if you are creating a GKE Standard cluster.
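
Because later commands fail in confusing ways if a variable is empty or the requested version is too old, it can help to check both before you create anything. The following is a minimal sketch rather than part of the official procedure. The helper names require_vars and version_at_least are hypothetical, and the version comparison assumes that GNU sort with the -V (version sort) option is available in your shell.

```shell
# Minimum GKE version that the Agent Sandbox feature requires (from this document).
MIN_VERSION="1.35.2-gke.1210000"

# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders version strings numerically.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: succeeds only if every named variable is set and non-empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (run after the export commands):
#   require_vars PROJECT_ID CLUSTER_NAME REGION CLUSTER_VERSION || echo "Set the missing variables first"
#   version_at_least "${CLUSTER_VERSION}" "${MIN_VERSION}" || echo "CLUSTER_VERSION is older than ${MIN_VERSION}"
```

NODE_POOL_NAME and MACHINE_TYPE are left out of the example check because they are only required for GKE Standard clusters.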

Enable Agent Sandbox

You can enable the Agent Sandbox feature when you create a new cluster or when you update an existing cluster.

Enable Agent Sandbox when creating a new GKE Autopilot cluster

To create a new GKE Autopilot cluster with Agent Sandbox enabled, include the --enable-agent-sandbox flag:

gcloud beta container clusters create-auto ${CLUSTER_NAME} \
    --region=${REGION} \
    --cluster-version=${CLUSTER_VERSION} \
    --enable-agent-sandbox

Enable Agent Sandbox when creating a new GKE Standard cluster

To create a new GKE Standard cluster with Agent Sandbox enabled, you must create the cluster, add a node pool with gVisor enabled, and then enable the Agent Sandbox feature:

  1. Create the cluster:

    gcloud beta container clusters create ${CLUSTER_NAME} \
        --region=${REGION} \
        --cluster-version=${CLUSTER_VERSION}
    
  2. Create a separate node pool with gVisor enabled:

    gcloud container node-pools create ${NODE_POOL_NAME} \
        --cluster=${CLUSTER_NAME} \
        --machine-type=${MACHINE_TYPE} \
        --region=${REGION} \
        --image-type=cos_containerd \
        --sandbox=type=gvisor
    
  3. Update the cluster to enable the Agent Sandbox feature:

    gcloud beta container clusters update ${CLUSTER_NAME} \
        --region=${REGION} \
        --enable-agent-sandbox
    

Enable Agent Sandbox when updating an existing GKE cluster

To enable Agent Sandbox on an existing cluster, the cluster must be running version 1.35.2-gke.1210000 or later. Complete the following steps:

  1. Agent Sandbox relies on gVisor. If you are using a GKE Standard cluster that doesn't have a gVisor-enabled node pool, create one first:

    gcloud container node-pools create ${NODE_POOL_NAME} \
        --cluster=${CLUSTER_NAME} \
        --machine-type=${MACHINE_TYPE} \
        --region=${REGION} \
        --image-type=cos_containerd \
        --sandbox=type=gvisor
    
  2. Update the cluster to enable the Agent Sandbox feature:

    gcloud beta container clusters update ${CLUSTER_NAME} \
        --region=${REGION} \
        --enable-agent-sandbox
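
Before you create a new node pool on an existing Standard cluster, it may be worth checking whether a gVisor-enabled node pool is already present. The following sketch shows one possible way to do that. The helper name gvisor_node_pools is hypothetical, and the field path config.sandboxConfig.type with the value GVISOR is an assumption about how the node pool list output represents the sandbox setting; confirm it against the output of gcloud container node-pools describe for your own cluster.

```shell
# Hypothetical helper: print the names of node pools that report a gVisor sandbox.
# Arguments: cluster name, region.
# ASSUMPTION: the sandbox setting is exposed as config.sandboxConfig.type=GVISOR.
gvisor_node_pools() {
  gcloud container node-pools list \
    --cluster="$1" \
    --region="$2" \
    --format="value(name,config.sandboxConfig.type)" |
    awk 'toupper($2) == "GVISOR" { print $1 }'
}

# Example usage:
#   if [ -z "$(gvisor_node_pools "${CLUSTER_NAME}" "${REGION}")" ]; then
#     echo "No gVisor-enabled node pool found; create one first."
#   fi
```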
    

Verify the configuration

You can verify whether the Agent Sandbox feature is enabled by inspecting the cluster description:

gcloud beta container clusters describe ${CLUSTER_NAME} \
    --region=${REGION} \
    --format="value(addonsConfig.agentSandboxConfig.enabled)"

If the feature is successfully enabled, the command returns True.
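
If you want to use this check in a script, you can wrap the command in a small function that compares the output with True. This is a minimal sketch; the function name agent_sandbox_enabled is hypothetical, and the function only interprets the value that the describe command above returns.

```shell
# Hypothetical helper: succeeds if the cluster reports Agent Sandbox as enabled.
# Arguments: cluster name, region.
agent_sandbox_enabled() {
  enabled="$(gcloud beta container clusters describe "$1" \
    --region="$2" \
    --format="value(addonsConfig.agentSandboxConfig.enabled)")"
  [ "$enabled" = "True" ]
}

# Example usage:
#   if agent_sandbox_enabled "${CLUSTER_NAME}" "${REGION}"; then
#     echo "Agent Sandbox is enabled"
#   else
#     echo "Agent Sandbox is not enabled"
#   fi
```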

Deploy a sandboxed environment

We recommend deploying a sandboxed environment by defining a SandboxTemplate and keeping pre-warmed instances ready using a SandboxWarmPool. You can then request an instance from this warm pool using a SandboxClaim. Alternatively, you can create a Sandbox directly, but this approach doesn't support warm pools.

SandboxTemplate, SandboxWarmPool, SandboxClaim, and Sandbox are Kubernetes custom resources.

The SandboxTemplate acts as a reusable blueprint. The SandboxWarmPool helps ensure that a specified number of pre-warmed Pods are always running and ready to be claimed. Using this custom resource minimizes startup latency.

To deploy a sandboxed environment by creating the SandboxTemplate and SandboxWarmPool, complete the following steps:

  1. In Cloud Shell, create a file named sandbox-template.yaml with the following content:

    apiVersion: extensions.agents.x-k8s.io/v1alpha1
    kind: SandboxTemplate
    metadata:
      name: python-runtime-template
      namespace: default
    spec:
      podTemplate:
        metadata:
          labels:
            sandbox-type: python-runtime
        spec:
          runtimeClassName: gvisor # Enforce gVisor sandbox
          containers:
          - name: runtime
            image: registry.k8s.io/agent-sandbox/python-runtime-sandbox:v0.1.0
            ports:
            - containerPort: 8888
            resources:
              requests:
                cpu: "250m"
                memory: "512Mi"
              limits:
                cpu: "500m"
                memory: "1Gi"
          restartPolicy: OnFailure
    
  2. Apply the SandboxTemplate manifest:

    kubectl apply -f sandbox-template.yaml
    
  3. Create a file named sandbox-warmpool.yaml with the following content:

    apiVersion: extensions.agents.x-k8s.io/v1alpha1
    kind: SandboxWarmPool
    metadata:
      name: python-runtime-warmpool
      namespace: default
      labels:
        app: python-runtime-warmpool
    spec:
      replicas: 2
      sandboxTemplateRef:
        # This must match the name of the SandboxTemplate.
        name: python-runtime-template
    
  4. Apply the SandboxWarmPool manifest:

    kubectl apply -f sandbox-warmpool.yaml
    

Create a SandboxClaim

The SandboxClaim requests a sandbox from the template. Because you created a warm pool, the created Sandbox adopts a running Pod from the pool instead of starting a fresh Pod.

To request a sandbox from the template by creating a SandboxClaim, complete the following steps:

  1. Create a file named sandbox-claim.yaml with the following content:

    apiVersion: extensions.agents.x-k8s.io/v1alpha1
    kind: SandboxClaim
    metadata:
      name: sandbox-claim
      namespace: default
    spec:
      sandboxTemplateRef:
        # This must match the name of the SandboxTemplate.
        name: python-runtime-template
    
  2. Apply the SandboxClaim manifest:

    kubectl apply -f sandbox-claim.yaml
    
  3. Verify that the sandbox, claim, and warm pool are ready:

    kubectl get sandboxwarmpool,sandboxclaim,sandbox,pod
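
The resources might take a short time to appear after you apply the manifests, so a script may need to poll instead of checking once. The following sketch shows one way to do that with a generic retry loop. The helper names retry_until and any_sandbox_exists are hypothetical, and the check only confirms that at least one Sandbox resource exists in the namespace, not that it is fully ready.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds or attempts run out.
# Arguments: maximum attempts, delay in seconds between attempts, then the command.
retry_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical check: succeeds if at least one Sandbox exists in the namespace.
# Argument: namespace (defaults to "default").
any_sandbox_exists() {
  [ -n "$(kubectl get sandbox --namespace "${1:-default}" -o name 2>/dev/null)" ]
}

# Example usage: check up to 12 times, 5 seconds apart.
#   retry_until 12 5 any_sandbox_exists default || echo "No Sandbox found yet"
```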
    

Alternative: Create a Sandbox directly

If you don't need the fast startup times provided by warm pools, you can deploy a Sandbox directly without using templates.

To deploy a sandboxed environment by creating a Sandbox directly, complete the following steps:

  1. Create a file named sandbox.yaml with the following content:

    apiVersion: agents.x-k8s.io/v1alpha1
    kind: Sandbox
    metadata:
      name: sandbox-example-2
    spec:
      replicas: 1
      podTemplate:
        metadata:
          labels:
            sandbox: sandbox-example
        spec:
          runtimeClassName: gvisor
          restartPolicy: Always
          containers:
          - name: my-container
            image: busybox
            command: ["/bin/sh", "-c"]
            args: ["sleep 3600000; echo 'Container finished successfully'; exit 0"]
    
  2. Apply the Sandbox manifest:

    kubectl apply -f sandbox.yaml
    
  3. Verify that the sandbox is running:

    kubectl get sandbox
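
To confirm that the Pod behind the Sandbox uses the gVisor runtime that the manifest requests, you can read the Pod's runtimeClassName field. This is a minimal sketch; the function name pod_uses_gvisor is hypothetical, and you need to supply the Pod name yourself, for example from the output of kubectl get pod.

```shell
# Hypothetical helper: succeeds if the Pod's spec requests the gvisor runtime class.
# Arguments: Pod name, namespace (defaults to "default").
pod_uses_gvisor() {
  runtime_class="$(kubectl get pod "$1" --namespace "${2:-default}" \
    -o jsonpath='{.spec.runtimeClassName}')"
  [ "$runtime_class" = "gvisor" ]
}

# Example usage (POD_NAME is a placeholder for the Pod name in your cluster):
#   pod_uses_gvisor "POD_NAME" default && echo "Pod runs with gVisor"
```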
    

Disable Agent Sandbox

To disable the Agent Sandbox feature, use the gcloud beta container clusters update command with the --no-enable-agent-sandbox flag:

gcloud beta container clusters update ${CLUSTER_NAME} \
    --region=${REGION} \
    --no-enable-agent-sandbox

Clean up resources

To avoid incurring charges to your Google Cloud account, delete the GKE cluster that you created:

gcloud container clusters delete ${CLUSTER_NAME} \
    --region=${REGION} \
    --quiet

What's next