Robots Sim

Strands Robots Sim is a Python library for controlling robots in simulated environments with natural language through Strands Agents. It lets you develop and test robot control strategies without physical hardware, using the same policy abstraction as Strands Robots.

The library provides two execution modes as Strands agent tools: SimEnv for full episode execution where the agent specifies a task and the policy runs to completion, and SteppedSimEnv for iterative control where the agent observes camera feedback after each batch of steps and adapts its instructions accordingly. This enables a dual-system pattern where the agent handles high-level reasoning and planning while a VLA policy handles low-level motor control.

Getting started

Installation

pip install strands-robots-sim

# For simulation environment dependencies (e.g. Libero); quote the extra so shells like zsh don't expand the brackets
pip install "strands-robots-sim[sim]"

Basic usage

from strands import Agent
from strands_robots_sim import SimEnv, gr00t_inference

sim_env = SimEnv(
    tool_name="my_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
)

agent = Agent(tools=[sim_env, gr00t_inference])

# Start inference service
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=8000,
    data_config="examples.Libero.custom_data_config:LiberoDataConfig",
)

# Run a task
agent("Run the task 'pick up the red block' for 5 episodes with video recording")

How it works

graph TD
    A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
    B --> C[SimEnv / SteppedSimEnv]
    C --> D[Policy Provider]
    C --> G[Simulation Environment]
    D --> F[Action Chunk]
    F --> G
    G -.->|Observation| C
    G -.->|Visual Feedback + State<br/>SteppedSimEnv only| B

    classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef agent fill:#0969da,stroke:#044289,color:#fff
    classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
    classDef simulation fill:#bf8700,stroke:#875e00,color:#fff

    class A input
    class B,C agent
    class D,F policy
    class G simulation

The agent receives a natural language instruction and routes it to a simulation tool. The tool coordinates with a policy provider to generate action chunks, which are executed in the simulation environment. Observations flow back for the next inference cycle. In SteppedSimEnv mode, camera images and state are also returned to the agent so it can reason about progress and adapt.
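The inference cycle described above can be sketched as a simple loop. Note that MockPolicy and MockEnv below are illustrative stubs invented for this sketch, not classes from the library; the real policy and environment interfaces may differ.

```python
# Minimal sketch of the inference cycle: the policy turns an observation plus
# an instruction into a chunk of actions, the environment executes them, and
# the resulting observation feeds the next inference call.
# MockPolicy and MockEnv are stand-in stubs, not library classes.

class MockPolicy:
    def get_action_chunk(self, observation, instruction):
        # A real VLA policy would run inference here; we return 3 no-op actions.
        return [[0.0] * 7 for _ in range(3)]

class MockEnv:
    def __init__(self):
        self.steps = 0

    def step(self, action):
        self.steps += 1
        observation = {"image": None, "state": [0.0] * 7}
        done = self.steps >= 6  # pretend the task succeeds after 6 steps
        return observation, done

def run_episode(policy, env, instruction, max_steps=200):
    observation = {"image": None, "state": [0.0] * 7}
    steps, done = 0, False
    while not done and steps < max_steps:
        # One inference cycle: request a chunk, then execute it step by step.
        for action in policy.get_action_chunk(observation, instruction):
            observation, done = env.step(action)
            steps += 1
            if done or steps >= max_steps:
                break
    return steps

steps = run_episode(MockPolicy(), MockEnv(), "pick up the red block")
print(steps)  # 6
```

In SimEnv mode this whole loop runs inside the tool; in SteppedSimEnv mode the outer `while` effectively lives in the agent, which sees the observation between chunks.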

Architecture

flowchart TB
    subgraph Agent["🤖 Strands Agent"]
        NL[Natural Language Input]
        Tools[Tool Registry]
    end

    subgraph SimTool["🦾 Simulation Tool"]
        direction TB
        SE[SimEnv:<br/>Full Episode Execution]
        SSE[SteppedSimEnv:<br/>Iterative Control]
        TM[Task Manager]
        AS[Async Executor]
    end

    subgraph Policy["🧠 Policy Layer"]
        direction TB
        PA[Policy Abstraction]
        GP[GR00T Policy]
        MP[Mock Policy]
        CP[Custom Policy]
    end

    subgraph SimLayer["🔧 Simulation Layer"]
        direction TB
        ENV[Environment Abstraction]
        SUITES[Task Suites]
        CAM[Camera Interfaces]
        STATE[State Management]
    end

    NL --> Tools
    Tools --> SE
    Tools --> SSE
    SE --> TM
    SSE --> TM
    TM --> AS
    AS --> PA
    PA --> GP
    PA --> MP
    PA --> CP
    AS --> ENV
    ENV --> SUITES
    ENV --> CAM
    ENV --> STATE

    classDef agentStyle fill:#0969da,stroke:#044289,color:#fff
    classDef toolStyle fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef policyStyle fill:#8250df,stroke:#5a32a3,color:#fff
    classDef simStyle fill:#d73a49,stroke:#a72b3a,color:#fff

    class NL,Tools agentStyle
    class SE,SSE,TM,AS toolStyle
    class PA,GP,MP,CP policyStyle
    class ENV,SUITES,CAM,STATE simStyle

Execution modes

SimEnv - full episode execution

The agent specifies a task once and the policy runs the full episode autonomously. This is the simpler mode, suited for benchmarking and well-defined tasks.

from strands_robots_sim import SimEnv

sim_env = SimEnv(
    tool_name="my_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
)

agent = Agent(tools=[sim_env, gr00t_inference])

# Blocking execution
agent.tool.my_sim(
    action="execute",
    instruction="pick up the red block",
    policy_port=8000,
    max_episodes=5,
    max_steps_per_episode=200,
    record_video=True,
)

# Or async execution with status monitoring
agent.tool.my_sim(
    action="start",
    instruction="stack the blocks",
    policy_port=8000,
    max_episodes=10,
)
agent.tool.my_sim(action="status")
agent.tool.my_sim(action="stop")

SteppedSimEnv - iterative agent control

The agent acts as a planner, executing a limited number of steps per call and receiving camera images and state back. It can then reason about progress, decompose complex tasks into subtasks, and adapt instructions based on what it observes.

from strands_robots_sim import SteppedSimEnv

stepped_sim = SteppedSimEnv(
    tool_name="my_stepped_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
    steps_per_call=10,
    max_steps_per_episode=500,
)

agent = Agent(tools=[stepped_sim, gr00t_inference])

# Reset to a specific task
agent.tool.my_stepped_sim(
    action="reset_episode",
    task_name="KITCHEN_SCENE1_put_the_black_bowl_on_top_of_the_cabinet",
)

# Execute steps - returns camera images, state, reward, done status
agent.tool.my_stepped_sim(
    action="execute_steps",
    instruction="move gripper toward the bowl",
    policy_port=8000,
    num_steps=10,
)

# Agent observes the result and decides what to do next
agent.tool.my_stepped_sim(action="get_state")

In practice, you hand the full loop to the agent with a planning prompt. The agent decomposes a complex task like "pick up the block and place it in the drawer" into subtasks (locate block, grasp, lift, move to drawer, place), executes each with execute_steps, observes camera feedback, and adapts if something goes wrong.
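The same decompose-execute-observe pattern can be hand-rolled without an agent, which makes the control flow explicit. Here `execute_steps` is a stub standing in for the `my_stepped_sim` tool call; its return shape mirrors the camera/state/reward/done result described above but is an assumption for illustration.

```python
# Hand-rolled version of the plan-execute-observe loop the agent runs.
# `execute_steps` is a stub standing in for the my_stepped_sim tool call;
# a real call would advance the simulator and return camera images too.

def execute_steps(instruction, num_steps=10):
    # Simulate success once the final subtask is issued.
    reward = 1.0 if instruction == "place block in drawer" else 0.0
    return {"reward": reward, "done": reward >= 1.0, "state": [0.0] * 7}

subtasks = [
    "locate block",
    "grasp block",
    "lift block",
    "move to drawer",
    "place block in drawer",
]

for subtask in subtasks:
    result = execute_steps(subtask, num_steps=10)
    # The agent would inspect the returned camera images here and retry or
    # re-plan the current subtask if the observation suggests it failed.
    if result["done"]:
        break

print(result["reward"])  # 1.0
```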

Comparing the modes

Feature         SimEnv                     SteppedSimEnv
Control flow    One-shot execution         Step-by-step iteration
Agent feedback  Final reward only          Camera images + state per batch
Use case        Known tasks, benchmarking  Complex tasks requiring adaptation
Error recovery  None                       Agent can retry with different instructions

Dual-system architecture

The framework implements a pattern inspired by System 1 / System 2 thinking. The Strands Agent serves as the deliberate planner (System 2) - it reasons about goals, decomposes tasks, and adapts strategy based on observations. The VLA policy serves as the fast executor (System 1) - it maps visual observations and language instructions to motor actions with low latency.

In SimEnv mode, System 2 fires once to specify the task and System 1 handles the rest. In SteppedSimEnv mode, the two systems collaborate iteratively: System 2 observes, plans, and issues instructions every N steps while System 1 executes the low-level control between each planning cycle.

Policy and environment abstraction

The library uses the same Policy abstract class as Strands Robots. It ships with GR00T and mock providers, and you can add custom VLA models by subclassing Policy.

from strands_robots_sim import create_policy

policy = create_policy(provider="groot", data_config="libero", host="localhost", port=8000)
policy = create_policy(provider="mock")
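A custom provider is added by subclassing Policy. The abstract interface below is a local stand-in written for this sketch (the real base class ships with Strands Robots and its method names may differ); the point is the pattern of overriding a single inference method.

```python
# Sketch of a custom policy provider. The Policy base class here is a local
# stand-in for illustration; the library's real abstract class may expose a
# different method signature.
from abc import ABC, abstractmethod

class Policy(ABC):
    @abstractmethod
    def get_action(self, observation: dict, instruction: str) -> list[float]:
        """Map an observation and instruction to a motor action."""

class ConstantPolicy(Policy):
    # Hypothetical custom policy that always emits the same action,
    # useful as a baseline or for wiring tests.
    def __init__(self, action):
        self.action = action

    def get_action(self, observation, instruction):
        return self.action

policy = ConstantPolicy([0.0] * 7)
action = policy.get_action({"image": None}, "pick up the red block")
print(action)  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```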

Simulation environments are similarly abstracted through a SimulationEnvironment base class. The library ships with a Libero integration, and the factory supports adding new backends:

from strands_robots_sim.envs import create_simulation_environment

env = create_simulation_environment(env_type="libero", task_suite="libero_10")
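A new backend follows the same shape: subclass the environment base class and construct it through the factory. The interface below is an assumption made for this sketch; consult the actual SimulationEnvironment source for the real abstract methods.

```python
# Sketch of a custom simulation backend. SimulationEnvironment here is a
# local stand-in; the library's real base class and factory registration
# mechanism may differ.
from abc import ABC, abstractmethod

class SimulationEnvironment(ABC):
    @abstractmethod
    def reset(self, task_name: str) -> dict: ...

    @abstractmethod
    def step(self, action) -> tuple[dict, float, bool]: ...

class ToyEnvironment(SimulationEnvironment):
    def reset(self, task_name):
        self.task = task_name
        return {"state": [0.0] * 7}

    def step(self, action):
        # Return (observation, reward, done) for one simulator tick.
        return {"state": [0.0] * 7}, 0.0, False

env = ToyEnvironment()
obs = env.reset("pick up the red block")
print(len(obs["state"]))  # 7
```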

Supported task suites

The current Libero integration includes:

Suite           Tasks  Description
libero_spatial  10     Spatial reasoning tasks
libero_object   10     Object-centric tasks
libero_goal     10     Goal-conditioned manipulation
libero_10       10     Standard benchmark
libero_90       90     Extended benchmark for comprehensive evaluation

Complete example

This example shows the stepped execution mode where the agent plans and adapts:

from strands import Agent
from strands_robots_sim import SteppedSimEnv, gr00t_inference

stepped_sim = SteppedSimEnv(
    tool_name="my_stepped_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
    steps_per_call=10,
    max_steps_per_episode=500,
)

agent = Agent(tools=[stepped_sim, gr00t_inference])

agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=8000,
    data_config="examples.Libero.custom_data_config:LiberoDataConfig",
)

agent("""
Task: open the top drawer

You are a robot task planner. Decompose this task into subtasks and execute
them step-by-step using the my_stepped_sim tool.

1. Reset the episode with action="reset_episode"
2. For each subtask, call action="execute_steps" with the subtask as instruction
3. Observe camera images and state after each batch
4. Adapt your approach based on what you see
5. Continue until reward reaches 1.0 or the episode ends
""")

agent.tool.gr00t_inference(action="stop", port=8000)