# Robots
Strands Robots is a Python library for controlling physical robots with natural language. It provides a policy abstraction layer for vision-language-action (VLA) models and a hardware abstraction layer for robot control, letting you tell a robot what to do without programming it.
The library provides a set of Strands Agents tools that handle several components of the robotics stack - from camera capture and servo calibration to policy inference and real-time control loops. An agent equipped with these tools can interpret instructions like "pick up the red block" and translate them into coordinated motor actions.
## Getting started

### Installation

```bash
pip install strands-robots
```

### Basic usage
```python
from strands import Agent
from strands_robots import Robot, gr00t_inference

robot = Robot(
    tool_name="my_arm",
    robot="so101_follower",
    cameras={
        "front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
        "wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)

agent = Agent(tools=[robot, gr00t_inference])

# Start the inference service
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=5555,
    data_config="so100_dualcam",
)

# Control the robot with natural language
agent("Use my_arm to pick up the red block using GR00T policy on port 5555")
```
The `Robot` class is a Strands `AgentTool` that the agent can invoke directly. When the agent decides to use the robot, it calls the tool with an instruction and policy port, and the tool handles the entire observation-inference-action loop internally.
## How it works
The system chains together three layers: a Strands Agent that interprets natural language, a policy provider that maps camera observations and instructions to action chunks, and a hardware abstraction layer that sends those actions to physical actuators.
```mermaid
graph LR
    A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
    B --> C[Robot class]
    C --> D[Policy Provider]
    C --> E[Hardware Abstraction]
    D --> F[Action Chunk]
    F --> E
    E --> G[Robot Hardware]

    classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef agent fill:#0969da,stroke:#044289,color:#fff
    classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
    classDef hardware fill:#bf8700,stroke:#875e00,color:#fff

    class A input
    class B,C agent
    class D,F policy
    class E,G hardware
```
Each control cycle, the Robot class captures observations (camera frames and joint states), sends them to the policy for inference, receives an action chunk, and executes those actions on the hardware.
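This cycle can be sketched in plain Python. The `MockHardware` and `MockPolicy` classes below are stand-ins invented for illustration, not the library's actual interfaces; only the capture → infer → execute shape matches the text:

```python
import time

class MockHardware:
    """Stand-in for the hardware abstraction layer."""
    def __init__(self):
        self.executed = []

    def get_observation(self):
        # Real hardware would return camera frames and joint states.
        return {"front": "frame", "wrist": "frame", "joints": [0.0] * 6}

    def send_action(self, action):
        self.executed.append(action)

class MockPolicy:
    """Stand-in policy: maps (observation, instruction) to an action chunk."""
    def get_actions(self, observation, instruction):
        # A VLA model would run inference here; we return a fixed 8-action chunk.
        return [[0.1] * 6 for _ in range(8)]

def control_loop(hardware, policy, instruction, cycles=3, control_frequency=50):
    period = 1.0 / control_frequency
    for _ in range(cycles):
        obs = hardware.get_observation()              # capture observations
        chunk = policy.get_actions(obs, instruction)  # one inference call
        for action in chunk:                          # execute the whole chunk
            hardware.send_action(action)
            time.sleep(period)                        # hold the control frequency

hw = MockHardware()
control_loop(hw, MockPolicy(), "pick up the red block")
print(len(hw.executed))  # 3 cycles x 8 actions per chunk = 24
```

Executing a whole chunk per inference call is what lets a slow model drive a fast control loop.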
### Architecture
```mermaid
flowchart TB
    subgraph Agent["🤖 Strands Agent"]
        NL[Natural Language Input]
        Tools[Tool Registry]
    end
    subgraph RobotTool["🦾 Robot Class"]
        direction TB
        RT[Robot Class]
        TM[Task Manager]
        AS[Async Executor]
    end
    subgraph Policy["🧠 Policy Layer"]
        direction TB
        PA[Policy Abstraction]
        GP[GR00T Policy]
        MP[Mock Policy]
        CP[Custom Policy]
    end
    subgraph Inference["⚡ Inference Service"]
        direction TB
        DC[Docker Container]
        ZMQ[ZMQ Server :5555]
        TRT[TensorRT Engine]
    end
    subgraph Hardware["🔧 Hardware Layer"]
        direction TB
        LR[LeRobot]
        CAM[Cameras]
        SERVO[Feetech Servos]
    end

    NL --> Tools
    Tools --> RT
    RT --> TM
    TM --> AS
    AS --> PA
    PA --> GP
    PA --> MP
    PA --> CP
    GP --> ZMQ
    ZMQ --> TRT
    TRT --> DC
    AS --> LR
    LR --> CAM
    LR --> SERVO

    classDef agentStyle fill:#0969da,stroke:#044289,color:#fff
    classDef robotStyle fill:#2ea44f,stroke:#1b7735,color:#fff
    classDef policyStyle fill:#8250df,stroke:#5a32a3,color:#fff
    classDef infraStyle fill:#bf8700,stroke:#875e00,color:#fff
    classDef hwStyle fill:#d73a49,stroke:#a72b3a,color:#fff

    class NL,Tools agentStyle
    class RT,TM,AS robotStyle
    class PA,GP,MP,CP policyStyle
    class DC,ZMQ,TRT infraStyle
    class LR,CAM,SERVO hwStyle
```
### Control flow
```mermaid
sequenceDiagram
    participant User
    participant Agent as Strands Agent
    participant Robot as Robot Class
    participant Policy as Policy Provider
    participant HW as Hardware

    User->>Agent: "Pick up the red block"
    Agent->>Robot: execute(instruction, policy_port)
    loop Control Loop
        Robot->>HW: get_observation()
        HW-->>Robot: {cameras, joint_states}
        Robot->>Policy: get_actions(obs, instruction)
        Policy-->>Robot: action_chunk
        loop Action Horizon
            Robot->>HW: send_action(action)
            Note over Robot,HW: sleep
        end
    end
    Robot-->>Agent: Task completed
    Agent-->>User: "Picked up red block"
```
## Core concepts

### Robot class
The `Robot` class wraps a robot and exposes it as a Strands agent tool with four actions:
| Action | Behavior | Use case |
|---|---|---|
| `execute` | Blocks until the task completes or times out | Single-step tasks |
| `start` | Returns immediately, runs task in background | Long-running tasks |
| `status` | Reports current task progress | Monitoring async tasks |
| `stop` | Interrupts a running task | Emergency stop |
```python
# Blocking - agent waits for completion
agent("Use my_arm to pick up the red block using GR00T policy on port 5555")

# Async - agent can check status or do other work
agent("Start my_arm waving using GR00T on port 5555, then check status")

# Stop
agent("Stop my_arm immediately")
```
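The blocking-versus-background distinction can be illustrated with a small task-manager sketch. The class and method names here are hypothetical, not the library's internals; only the execute/start/status/stop semantics follow the table above:

```python
import threading
import time

class TaskManager:
    """Minimal sketch of execute/start/status/stop semantics."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = None
        self.state = "idle"

    def _run(self, duration):
        self.state = "running"
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline and not self._stop.is_set():
            time.sleep(0.01)  # one control cycle would go here
        self.state = "stopped" if self._stop.is_set() else "completed"

    def execute(self, duration):
        # Blocking: run to completion in the caller's thread.
        self._run(duration)

    def start(self, duration):
        # Async: return immediately, run in a background thread.
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, args=(duration,))
        self._thread.start()

    def status(self):
        return self.state

    def stop(self):
        # Interrupt the running task and wait for it to unwind.
        self._stop.set()
        if self._thread:
            self._thread.join()

tm = TaskManager()
tm.start(duration=5.0)   # returns immediately
time.sleep(0.05)         # give the background thread a moment
print(tm.status())       # "running"
tm.stop()                # interrupt, like an emergency stop
print(tm.status())       # "stopped"
```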
Constructor parameters:
| Parameter | Type | Description |
|---|---|---|
| `tool_name` | `str` | Name the agent uses to reference this robot |
| `robot` | `str`, `RobotConfig`, or `Robot` | Robot type string (e.g. `"so101_follower"`), a config object, or a pre-built robot instance |
| `cameras` | `dict` | Camera configuration mapping names to settings |
| `port` | `str` | Serial port for the robot (e.g. `"/dev/ttyACM0"`) |
| `data_config` | `str` | Policy data configuration name |
| `control_frequency` | `float` | Control loop frequency in Hz (default: 50) |
| `action_horizon` | `int` | Number of actions to execute per inference step (default: 8) |
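The last two parameters together determine how often the policy runs. With the defaults, each inference call covers 8 actions at 20 ms apiece, so inference happens roughly every 160 ms:

```python
control_frequency = 50.0  # Hz: one action every 1/50 s (default)
action_horizon = 8        # actions executed per inference call (default)

action_period = 1.0 / control_frequency             # 0.02 s between actions
inference_period = action_horizon / control_frequency  # 0.16 s between inference calls
inference_rate = control_frequency / action_horizon    # 6.25 inference calls per second

print(action_period, inference_period, inference_rate)
```

Raising `action_horizon` lowers the inference rate (useful for slow models) at the cost of reacting to new observations less often.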
### Policy abstraction
Policies are the bridge between observations and actions. The library defines a `Policy` abstract class that any VLA model can implement:
```python
from strands_robots import Policy, create_policy

# GR00T policy (ships with the library)
policy = create_policy(
    provider="groot",
    data_config="so100_dualcam",
    host="localhost",
    port=5555,
)

# Mock policy (for testing without hardware)
policy = create_policy(provider="mock")
```
The `create_policy` factory ships with `"groot"` and `"mock"` providers. You can integrate additional VLA models by subclassing `Policy` and implementing `get_actions()` and `set_robot_state_keys()`.
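A subclass might look like the sketch below. The `Policy` base class here is a self-contained stand-in built from the two methods named above (the library's real signatures may differ), and `RandomPolicy` is a hypothetical example:

```python
from abc import ABC, abstractmethod
import random

class Policy(ABC):
    """Stand-in for the library's abstract base class (interface assumed)."""

    @abstractmethod
    def get_actions(self, observation, instruction):
        """Map an observation dict and instruction string to an action chunk."""

    @abstractmethod
    def set_robot_state_keys(self, keys):
        """Tell the policy which observation keys hold joint states."""

class RandomPolicy(Policy):
    """Hypothetical custom policy: small random actions for each joint."""

    def __init__(self, num_joints=6, horizon=8):
        self.num_joints = num_joints
        self.horizon = horizon
        self.state_keys = []

    def set_robot_state_keys(self, keys):
        self.state_keys = list(keys)

    def get_actions(self, observation, instruction):
        # A real VLA model would condition on images and the instruction here.
        return [
            [random.uniform(-0.05, 0.05) for _ in range(self.num_joints)]
            for _ in range(self.horizon)
        ]

policy = RandomPolicy()
policy.set_robot_state_keys(["shoulder", "elbow", "wrist"])
chunk = policy.get_actions({"front": None}, "wave")
print(len(chunk), len(chunk[0]))  # 8 6
```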
### Inference management

The `gr00t_inference` tool manages policy inference services running in Docker containers.
```python
# Start with TensorRT acceleration
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=5555,
    data_config="so100_dualcam",
    use_tensorrt=True,
)

# Check status
agent.tool.gr00t_inference(action="status", port=5555)

# Stop
agent.tool.gr00t_inference(action="stop", port=5555)
```
Available actions: `start`, `stop`, `status`, `list`, `restart`, and `find_containers`.
## Additional tools
Beyond the core robot and inference tools, the library includes several utilities that the agent can use for setup, calibration, and data collection.
### Camera tool
Camera management supporting OpenCV and RealSense cameras.
```python
from strands_robots import lerobot_camera

agent = Agent(tools=[lerobot_camera])

agent("Discover all connected cameras")
agent("Capture images from front and wrist cameras")
agent("Record 30 seconds of video from the front camera")
```
Actions: `discover`, `capture`, `capture_batch`, `record`, `preview`, `test`.
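On Linux, the core of a `discover`-style action is enumerating V4L2 device nodes. This sketch is not the tool's implementation (it also ignores that one physical camera can expose several `/dev/video*` nodes), but it shows where the `index_or_path` values in the camera config come from:

```python
import glob

def discover_video_devices():
    """List V4L2 device nodes; returns an empty list on systems without them."""
    return sorted(glob.glob("/dev/video*"))

print(discover_video_devices())  # e.g. ['/dev/video0', '/dev/video2']
```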
### Teleoperation tool
Record demonstrations for imitation learning using a leader-follower setup.
```python
from strands_robots import lerobot_teleoperate

agent.tool.lerobot_teleoperate(
    action="start",
    robot_type="so101_follower",
    robot_port="/dev/ttyACM0",
    teleop_type="so101_leader",
    teleop_port="/dev/ttyACM1",
    dataset_repo_id="my_user/cube_picking",
    dataset_single_task="Pick up the red cube",
    dataset_num_episodes=50,
)
```
Actions: `start`, `stop`, `list`, `replay`.
### Pose tool
Store, retrieve, and execute named robot poses for repeatable positioning.
```python
from strands_robots import pose_tool

agent = Agent(tools=[robot, pose_tool])

agent("Save the current position as 'home'")
agent("Go to the home pose")
agent("Move the gripper to 50%")
```
Actions: `store_pose`, `load_pose`, `list_poses`, `move_motor`, `incremental_move`, `reset_to_home`.
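Conceptually, a named pose is just a mapping from joint names to positions. The sketch below shows the store/load idea; the JSON file location and structure are assumptions, not the tool's actual storage format:

```python
import json
from pathlib import Path

POSE_FILE = Path("poses.json")  # hypothetical storage location

def store_pose(name, joint_positions):
    """Persist a named pose, merging with any previously stored poses."""
    poses = json.loads(POSE_FILE.read_text()) if POSE_FILE.exists() else {}
    poses[name] = joint_positions
    POSE_FILE.write_text(json.dumps(poses, indent=2))

def load_pose(name):
    """Retrieve a previously stored pose by name."""
    return json.loads(POSE_FILE.read_text())[name]

store_pose("home", {"shoulder": 0.0, "elbow": 90.0, "gripper": 50.0})
print(load_pose("home")["elbow"])  # 90.0
```

Executing a loaded pose then reduces to commanding each listed joint to its stored position.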
### Serial tool
Low-level serial communication for servos and custom protocols.
Actions: `list_ports`, `feetech_position`, `feetech_ping`, `send`, `monitor`.
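Feetech SCS/STS-series servos speak a Dynamixel-1.0-style serial protocol, so something like `feetech_position` ultimately builds a READ packet on the wire. The sketch below assumes that packet format and a present-position register at address `0x38` (common for STS3215-class servos); verify both against your servo's memory table before relying on it:

```python
def checksum(payload):
    """Dynamixel-1.0-style checksum: bitwise NOT of the byte sum, low 8 bits."""
    return (~sum(payload)) & 0xFF

def read_position_packet(servo_id, address=0x38, length=2):
    """Build a READ packet (instruction 0x02) for `length` bytes at `address`.
    The register address is an assumption -- check the servo datasheet."""
    # Length field = params (address, read length) + 2.
    body = [servo_id, 4, 0x02, address, length]
    return bytes([0xFF, 0xFF] + body + [checksum(body)])

pkt = read_position_packet(1)
print(pkt.hex())  # ffff0104023802be
```

The `send` action would write such a packet to the serial port and read back the servo's status packet.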
## Complete example
```python
from strands import Agent
from strands_robots import Robot, gr00t_inference, lerobot_camera, pose_tool

robot = Robot(
    tool_name="orange_arm",
    robot="so101_follower",
    cameras={
        "wrist": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 15},
        "front": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 15},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)

agent = Agent(tools=[robot, gr00t_inference, lerobot_camera, pose_tool])

agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/gr00t-wave/checkpoint-300000",
    port=5555,
    data_config="so100_dualcam",
)

while True:
    user_input = input("\n> ")
    if user_input.lower() in ["exit", "quit"]:
        break
    agent(user_input)

agent.tool.gr00t_inference(action="stop", port=5555)
```
This gives you an interactive loop where you can issue natural language commands to the robot, check camera feeds, save poses, and manage inference services - all through conversation with the agent.