Building a Custom Sandbox

Looking for something beyond the built-in implementations? Build a custom sandbox when your execution environment is not a local Docker container or an SSH host: a microVM, a cloud code-execution API, or a managed runtime. The agent loop, model, and vended tools all stay the same; you only implement the methods that run commands and access files in your backend.

PosixShellSandbox is a base class that reduces the implementation burden to a single method. If you can implement executeStreaming / execute_streaming (run a shell command via your backend and stream the output), you get everything else for free:

  • Code execution via base64-encoded heredoc piped to the interpreter
  • File read/write via base64 encoding over the shell
  • Directory listing via ls

Both DockerSandbox and SshSandbox extend PosixShellSandbox.
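
To make the base64 technique in the list above concrete, here is a minimal sketch of how a file write and a file read can each be expressed as a single shell command. It illustrates the general idea only and is not the SDK's actual implementation: the helper names (shellQuote, buildWriteCommand, buildReadCommand, decodeReadOutput) are hypothetical, and it assumes the target environment has a base64 utility that accepts -d.

```typescript
import { Buffer } from 'node:buffer'

// Hypothetical helper: quote a string so the shell treats it as one literal word.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'"
}

// Hypothetical helper: build a command that writes `content` to `path`.
// Base64 keeps newlines, quotes, and other special characters away from the shell parser.
function buildWriteCommand(path: string, content: string): string {
  const encoded = Buffer.from(content, 'utf8').toString('base64')
  return `printf '%s' ${shellQuote(encoded)} | base64 -d > ${shellQuote(path)}`
}

// Hypothetical helper: build a command that prints the file as base64 on stdout.
function buildReadCommand(path: string): string {
  return `base64 < ${shellQuote(path)}`
}

// Decode the stdout of the read command back into bytes.
// Some base64 implementations wrap their output, so whitespace is stripped first.
function decodeReadOutput(stdout: string): Buffer {
  return Buffer.from(stdout.replace(/\s+/g, ''), 'base64')
}
```

A sandbox built this way only ever needs to run commands: the strings returned here would be passed to executeStreaming, and the collected stdout of the read command would be passed to decodeReadOutput.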

import { spawn } from 'node:child_process'
import { PosixShellSandbox } from '@strands-agents/sdk/sandbox'
import type { ExecuteOptions, StreamChunk, ExecutionResult } from '@strands-agents/sdk/sandbox'

class FirecrackerSandbox extends PosixShellSandbox {
  constructor(private readonly vmId: string) {
    super()
  }

  async *executeStreaming(
    command: string,
    options?: ExecuteOptions
  ): AsyncGenerator<StreamChunk | ExecutionResult, void, undefined> {
    const proc = spawn('fc-exec', [this.vmId, 'sh', '-c', command])

    // Register the exit listener before reading output so the 'close' event cannot be missed.
    const exited = new Promise<number>((resolve) =>
      proc.on('close', (code) => resolve(code ?? 0))
    )

    // Drain stderr in the background. Reading it only after stdout ends can stall
    // the child once the stderr pipe buffer fills up.
    const stderrChunks: string[] = []
    proc.stderr.on('data', (data: Buffer) => stderrChunks.push(data.toString()))

    let stdout = ''
    for await (const data of proc.stdout) {
      const text = data.toString()
      stdout += text
      yield { type: 'streamChunk', data: text, streamType: 'stdout' }
    }

    // 'close' fires after the process has ended and its stdio streams have closed,
    // so every stderr chunk has been collected by this point.
    const exitCode = await exited

    let stderr = ''
    for (const text of stderrChunks) {
      stderr += text
      yield { type: 'streamChunk', data: text, streamType: 'stderr' }
    }

    yield { type: 'executionResult', exitCode, stdout, stderr, outputFiles: [] }
  }
}

To give your sandbox the same sandbox_bash and sandbox_file_editor tools the built-in sandboxes provide, override getTools() / get_tools() and return tools bound to it:

import type { Tool } from '@strands-agents/sdk'
import { makeBash } from '@strands-agents/sdk/vended-tools/bash'
import { makeFileEditor } from '@strands-agents/sdk/vended-tools/file-editor'

// Inside your sandbox class:
override getTools(): Tool[] {
  return [
    makeFileEditor(this, { name: 'sandbox_file_editor' }),
    makeBash(this, { name: 'sandbox_bash' }),
  ]
}

For environments where you have native API access (no shell), extend Sandbox directly and implement all six abstract methods: executeStreaming / execute_streaming, executeCodeStreaming / execute_code_streaming, readFile / read_file, writeFile / write_file, removeFile / remove_file, and listFiles / list_files.

Prefer the shell base whenever your backend can run sh -c. Reach for the raw interface only when shaping every operation as a shell command would be a worse fit than calling your backend’s native API.

A custom sandbox is a boundary only when the environment behind it is isolated. The interface routes operations; it does not confine them. Whatever the agent can reach through your execute_streaming implementation (the method that runs commands in your environment), it can reach.

A container running as root with the host filesystem mounted is not a boundary, even though it uses the same Sandbox interface as a locked-down container. The security comes from the environment you provision, not from the interface itself. Scope the environment to the least privilege the task needs, and treat that configuration as the actual control.
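
As one illustration of treating the environment configuration as the control, the sketch below builds arguments for a `docker run` invocation that avoids the failure mode described above: it runs as a non-root user, mounts nothing from the host, and drops capabilities and network access. The helper name lockedDownDockerArgs, its option names, and the default limits are hypothetical, and this is not how DockerSandbox is configured; the right set of restrictions depends on what the task actually needs.

```typescript
// Hypothetical options for the sketch; none of these names come from the SDK.
interface LockedDownOptions {
  image: string
  workdir?: string // writable scratch directory, backed by tmpfs
  memory?: string // memory limit in Docker's format, e.g. '512m'
  pidsLimit?: number // maximum number of processes in the container
}

// Build arguments for `docker run` with a deliberately small set of privileges.
function lockedDownDockerArgs(opts: LockedDownOptions): string[] {
  const workdir = opts.workdir ?? '/workspace'
  return [
    'run',
    '--rm',
    '--user', '1000:1000', // do not run as root
    '--read-only', // root filesystem is read-only
    '--tmpfs', workdir, // the only writable location, and it is not on the host
    '--workdir', workdir,
    '--network', 'none', // no network access
    '--cap-drop', 'ALL', // drop all Linux capabilities
    '--security-opt', 'no-new-privileges',
    '--memory', opts.memory ?? '512m',
    '--pids-limit', String(opts.pidsLimit ?? 128),
    opts.image, // the caller appends the command to run after the image
  ]
}
```

Note that no host volume appears in the list: anything the agent writes lands in the tmpfs scratch directory and disappears with the container. Loosen individual settings (for example, network access) only when the task requires them.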