Sandbox

Agents are most effective when they can perform real actions, such as executing shell commands, running code, and reading and writing files, rather than being constrained to text-only reasoning. But giving an agent unrestricted access to your host machine, especially the ability to run model-generated code, is a serious security risk.

A Sandbox gives the agent an execution environment for these operations while keeping the agent’s core process (model calls, hooks, state) separate from it. With a sandbox, the agent can run shell commands, write and execute code, and access a filesystem without compromising the host or its own runtime.

There are two ways to think about sandboxed agents: running the entire agent inside a sandbox (a deployment concern, not an SDK feature), or giving the agent a sandbox to use for execution while the agent itself stays in your trusted infrastructure.

Strands implements the second pattern. The agent process runs on your host and handles model calls, tools, hooks, and state. The sandbox is a pluggable backend that receives only execution operations via a standard interface.

flowchart LR
    subgraph runtime["Agent runtime (your infrastructure)"]
        L[Agent loop]
        M[Model]
        T["sandbox_bash /<br/>sandbox_file_editor"]
        L --> M --> T
    end
    subgraph sb["Sandbox (execution environment you provision)"]
        E[Command / file operation]
    end
    T -->|"execute, read_file, write_file"| E
    E -->|"stdout, stderr, exit code"| T

All Sandbox implementations share the same abstract interface:

Method (Python)           Method (TypeScript)     Description
execute_streaming         executeStreaming        Run a shell command, stream output
execute_code_streaming    executeCodeStreaming    Run code via an interpreter, stream output
read_file                 readFile                Read a file as bytes
write_file                writeFile               Write bytes to a file
remove_file               removeFile              Delete a file
list_files                listFiles               List directory contents

The execution methods accept optional parameters: timeout (seconds, throws SandboxTimeoutError when exceeded), cwd (working directory override), env (environment variables), and in TypeScript, signal (AbortSignal, throws SandboxAbortError).
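Because a timeout and an abort surface as different errors, a caller will usually want to tell them apart, for example to retry with a longer timeout but stop on an abort. The following is a minimal sketch of that branching. The two error classes are local stand-ins defined here so the example runs without the SDK, and `classifyFailure` is a hypothetical helper. With the real SDK, an `instanceof` check against its error classes would be the more direct test, assuming they are exported.

```typescript
// Local stand-ins (assumption): they only mirror the error names mentioned above.
class SandboxTimeoutError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'SandboxTimeoutError'
  }
}

class SandboxAbortError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'SandboxAbortError'
  }
}

type FailureKind = 'timeout' | 'aborted' | 'other'

// Hypothetical helper: map a caught value to a coarse failure kind by error name.
function classifyFailure(err: unknown): FailureKind {
  if (err instanceof Error) {
    if (err.name === 'SandboxTimeoutError') return 'timeout'
    if (err.name === 'SandboxAbortError') return 'aborted'
  }
  return 'other'
}

const timeoutKind = classifyFailure(new SandboxTimeoutError('command exceeded its timeout'))
const abortKind = classifyFailure(new SandboxAbortError('caller aborted the command'))
const otherKind = classifyFailure(new Error('unrelated failure'))
```

In a real call site the classification would sit in the `catch` block around the execution call, with the retry or cleanup policy chosen per kind.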

Pass a sandbox to an agent through the sandbox parameter, and the agent’s commands and file operations will execute inside it.

import { Agent } from '@strands-agents/sdk'
import { DockerSandbox } from '@strands-agents/sdk/sandbox/docker'

const agent = new Agent({
  sandbox: new DockerSandbox({ container: 'my-container-id' }),
})

// The agent's sandbox_bash and sandbox_file_editor tools execute inside the container
await agent.invoke('List all files inside the current directory')

In TypeScript, pass sandbox: false to opt out explicitly and keep that intent stable if the default changes:

// Explicit opt-out: no sandbox, run on host
const agent = new Agent({ sandbox: false })

See Available Sandboxes for Docker and SSH configuration options, programmatic access, and streaming output.

When a sandbox is configured, the agent automatically registers two tools so the model can operate in the sandboxed environment without additional setup:

  • sandbox_bash — Executes shell commands. Each call runs in a fresh shell; state such as variables and the working directory does not persist across calls.
  • sandbox_file_editor — Views, creates, and edits files using absolute paths. Supports view (with line ranges), create, string replace, and insert operations.
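Since each sandbox_bash call starts a fresh shell, a `cd` or `export` issued in one call has no effect on the next. Steps that depend on each other need to travel in a single command. A minimal sketch of that idea follows; `chainSteps` is a hypothetical helper written for illustration, not part of the SDK.

```typescript
// Hypothetical helper: join dependent steps with && so they run in one shell
// and the chain stops at the first failing step.
function chainSteps(...steps: string[]): string {
  return steps
    .map((step) => step.trim())
    .filter((step) => step.length > 0)
    .join(' && ')
}

// Sent as two separate calls, the `cd` would be lost before `npm test` runs.
// Sent as one command, the working directory and variable carry through.
const chainedCommand = chainSteps('cd /workspace/app', 'export NODE_ENV=test', 'npm test')
// chainedCommand is: cd /workspace/app && export NODE_ENV=test && npm test
```

When the only state needed is a working directory or environment variables, the cwd and env options on the execution methods are likely the cleaner choice.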

If a tool with the same name is already registered on the agent, the sandbox-vended version is skipped. This lets you override a vended tool with a stricter variant:

import { Agent } from '@strands-agents/sdk'
import { DockerSandbox } from '@strands-agents/sdk/sandbox/docker'
import { makeBash } from '@strands-agents/sdk/vended-tools/bash'

const sandbox = new DockerSandbox({ container: 'agent-workspace' })
const lockedBash = makeBash(sandbox, {
  name: 'sandbox_bash',
  description: 'Run read-only shell commands. Do not modify files.',
})

// The agent keeps lockedBash; the sandbox's own sandbox_bash is skipped
const agent = new Agent({ sandbox, tools: [lockedBash] })
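The skip-on-name-collision rule can be modeled in a few lines. This is a sketch of the behavior as described above, not the SDK's implementation; `NamedTool` and `resolveTools` are hypothetical names used only for illustration.

```typescript
// Hypothetical minimal tool shape: only the fields the rule depends on.
interface NamedTool {
  name: string
  description: string
}

// Keep every user-registered tool, then add each vended tool whose name is still free.
function resolveTools(userTools: NamedTool[], vendedTools: NamedTool[]): NamedTool[] {
  const resolved = userTools.slice()
  for (const vended of vendedTools) {
    const nameTaken = resolved.some((tool) => tool.name === vended.name)
    if (!nameTaken) resolved.push(vended)
  }
  return resolved
}

const vendedTools: NamedTool[] = [
  { name: 'sandbox_bash', description: 'Run shell commands' },
  { name: 'sandbox_file_editor', description: 'View and edit files' },
]
const userTools: NamedTool[] = [
  { name: 'sandbox_bash', description: 'Run read-only shell commands. Do not modify files.' },
]

// The user's sandbox_bash wins; sandbox_file_editor is still added.
const resolvedTools = resolveTools(userTools, vendedTools)
```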

Custom sandbox implementations can also override get_tools() (Python) or getTools() (TypeScript) to vend their own tools entirely.

You can add your own tools alongside the auto-vended ones by creating a tool that reads the sandbox from its context and passing it in the agent’s tools array:

import { Agent, tool } from '@strands-agents/sdk'
import { DockerSandbox } from '@strands-agents/sdk/sandbox/docker'
import { z } from 'zod'

const lint = tool({
  name: 'lint',
  description: 'Lint a file and return structured errors',
  inputSchema: z.object({
    path: z.string().describe('File path to lint'),
  }),
  callback: async (input, context) => {
    const result = await context!.agent.sandbox.execute(
      `eslint --format json ${input.path}`
    )
    const issues = JSON.parse(result.stdout)
    return issues.flatMap((f: any) => f.messages)
  },
})

const agent = new Agent({
  sandbox: new DockerSandbox({ container: 'my-dev-env' }),
  tools: [lint],
})
// Agent now has: sandbox_bash, sandbox_file_editor (vended) + lint (yours)

Your custom tools coexist with the auto-vended ones. The sandbox routes all execution to the same environment regardless of which tool initiated it.
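One detail worth considering in a tool like lint: the path comes from the model and is interpolated into a shell command, so a path containing spaces or shell metacharacters would change what runs. The sandbox limits the damage, but quoting the argument keeps the command's meaning intact. A minimal sketch, assuming the sandbox runs a POSIX shell; `shellQuote` and `buildLintCommand` are hypothetical helpers, not SDK functions.

```typescript
// Hypothetical helper: wrap a value in single quotes for a POSIX shell,
// escaping each embedded single quote as '\''.
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'"
}

// Build the lint command with the path passed as one literal argument.
function buildLintCommand(path: string): string {
  return 'eslint --format json ' + shellQuote(path)
}

// A path with a space and a stray `;` stays a single argument.
const lintCommand = buildLintCommand('src/my file; rm -rf x.ts')
// lintCommand is: eslint --format json 'src/my file; rm -rf x.ts'
```

Inside the tool callback, the result of `buildLintCommand(input.path)` would take the place of the interpolated template string.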

The following vended plugins route their file I/O through the agent’s sandbox when one is configured:

  • Agent Skills — Skill files loaded from filesystem paths are read through the agent’s sandbox. Skills stored inside a container or on a remote host are accessible without copying them to the host. URL and inline skill sources are sandbox-independent.
  • Context Offloader — When using FileStorage as the storage backend, offloaded artifacts are written to and read from the sandbox’s filesystem rather than the host. The plugin binds to the agent’s sandbox during initialization; no explicit wiring is needed.