A hands-on, multi-session experiment connecting Large Language Models to robotic simulators through the Model Context Protocol — from first install to dual-arm coordination.
What happens when you give an AI the ability to control physical machines through natural language? We built the pipeline — from Claude Code through MCP to ROS2, Gazebo, MuJoCo, and PyBullet — and discovered both the power and the friction of connecting language models to the physical world.
The Model Context Protocol (MCP) is Anthropic's open standard for connecting AI to external tools. ROS2 is the industry standard for robot communication. The hypothesis was simple: connect them, and you get natural language robot control with zero robot code changes.
The experiment spanned multiple sessions across March 2026, starting from a bare Ubuntu 24.04 installation and ending with Claude orchestrating dual-arm block handoffs in simulation — reasoning about why it chose each action, not just executing commands.
ROS2 Humble targets Ubuntu 22.04 (Jammy); installing it on 24.04 (Noble) produced GPG key errors. Pivoted to ROS2 Jazzy.
Full desktop install on Ubuntu 24.04. All core tools working.
sudo apt install -y ros-jazzy-desktop
A glob-pattern apt install of gazebo-ros-pkgs failed, taking the turtlebot3 packages down with it. Fixed by installing each package explicitly.
Mobile robot visible in simulation. 13 ROS2 topics active, including /cmd_vel, /scan, /odom, /imu, and /joint_states.
ros2 launch turtlebot3_gazebo turtlebot3_world.launch.py
Commands were publishing successfully, but the robot had driven far out of camera view. The Gazebo session wasn't restarting cleanly; leftover background processes had to be killed with pkill.
After a clean restart with a fresh sim clock (sec: 64), spinning and driving were confirmed in Gazebo, with TwistStamped messages publishing at 20 Hz.
A FastMCP-based server from robotmcp. It translates LLM tool calls into ROS2 commands sent over the rosbridge WebSocket on port 9090.
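Under the hood, rosbridge speaks a plain JSON protocol over that WebSocket, so the MCP server only has to serialize each tool call into frames like these. A minimal sketch of the rosbridge v2 protocol using only the standard library; the topic, message type, and velocity values are illustrative:

```python
import json

# rosbridge v2 protocol: advertise a topic once, then publish to it.
# geometry_msgs/msg/TwistStamped is what the TurtleBot3 listens for on /cmd_vel.
advertise = {
    "op": "advertise",
    "topic": "/cmd_vel",
    "type": "geometry_msgs/msg/TwistStamped",
}
publish = {
    "op": "publish",
    "topic": "/cmd_vel",
    "msg": {
        "header": {"frame_id": "base_link"},
        "twist": {
            "linear": {"x": 0.2, "y": 0.0, "z": 0.0},   # drive forward at 0.2 m/s
            "angular": {"x": 0.0, "y": 0.0, "z": 0.5},  # turn at 0.5 rad/s
        },
    },
}

# A real server would send these frames over ws://localhost:9090.
frames = [json.dumps(advertise), json.dumps(publish)]
```

Everything the LLM does to the robot ultimately reduces to frames of this shape, which is why a single WebSocket port can carry the whole pipeline.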
git clone https://github.com/robotmcp/ros-mcp-server.git
Native installer on Linux. MCP server registered. First command: "What ROS topics are available?" returned all 13 topics with message types.
claude mcp add ros-mcp-server uv -- --directory ~/ros-mcp-server run server.py
"What ROS topics are available?" — a natural language question that traverses Claude Code → MCP → WebSocket → ROS2 and returns live robot data. The full pipeline, working.
Clean pip install. Physics engine running.
The ResourceManager constructor API changed in Jazzy's ros2_control, and the community mujoco_ros2_control package hadn't been updated to match, so it failed to build.
Headless MuJoCo simulation publishing to ROS2 topics. Three new topics: /mujoco/joint_states, /mujoco/joint_commands, /mujoco/end_effector_pos.
mujoco.viewer.launch() is not thread-safe with ROS2 spin. Crashed after 2 seconds.
Thread-safe passive viewer. 3-DOF robotic arm visible and controllable via Claude through same MCP pipeline.
Claude controlling TurtleBot3 in Gazebo AND MuJoCo arm simultaneously through one MCP connection.
Required Python 3.11 (Ubuntu 24.04 ships 3.12). EULA accepted. The GUI opened but kept crashing, and the SimulationApp API returned NoneType. 8GB of VRAM sat right at the minimum threshold. ROS2 bridge not functional.
Two 5-DOF arms with parallel-jaw grippers, table, 3 colored blocks (red/green/blue), handoff zone, target zone. Touch sensor XML errors fixed by removing broken references.
Pre-scripted demo completed full pick → handoff → place sequence for all 3 blocks. Reasoning log showed non-obvious discoveries about handoff positioning and block ordering.
Claude can observe scene, pick blocks, execute handoffs, place blocks — all through natural language. MuJoCo viewer open live alongside Claude Code.
The arms move but don't reach the exact block positions: the custom Jacobian-based IK isn't precise enough for reliable grasping, so commands report success while blocks aren't actually gripped.
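To see why a hand-rolled Jacobian solver is finicky, here is a damped least-squares IK sketch for a 2-link planar arm (numpy only; link lengths, damping, and target are illustrative, not the experiment's 5-DOF arms). It converges on this toy problem, but the final error depends on damping, iteration count, and target reachability, and the same sensitivity scaled up is what made grasping unreliable:

```python
import numpy as np

L1, L2 = 0.4, 0.3  # link lengths (m), illustrative

def fk(q):
    """End-effector position of a 2-link planar arm."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    """Analytic 2x2 Jacobian of fk with respect to the joint angles."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([
        [-L1 * s1 - L2 * s12, -L2 * s12],
        [ L1 * c1 + L2 * c12,  L2 * c12],
    ])

def ik_dls(target, q=np.zeros(2), damping=0.05, iters=200):
    """Damped least-squares IK: robust near singularities, but accuracy
    hinges on the damping factor, iteration budget, and a reachable target."""
    for _ in range(iters):
        err = target - fk(q)
        J = jacobian(q)
        # dq = (J^T J + lambda^2 I)^-1 J^T err
        dq = np.linalg.solve(J.T @ J + damping**2 * np.eye(2), J.T @ err)
        q = q + dq
    return q

target = np.array([0.5, 0.2])            # reachable: |target| <= L1 + L2
q_sol = ik_dls(target)
residual = np.linalg.norm(fk(q_sol) - target)
```

Every knob here (damping, step count, starting pose) trades precision against stability, which is exactly the tuning burden a built-in solver hides.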
PyBullet's calculateInverseKinematics works out of the box with pre-built Franka Panda arm URDFs. Two arms facing each other, colored blocks on table, target box zone. Claude Code picks blocks via MCP.
Arm successfully picks blocks. Handoff and box placement need timing refinement — blocks slip during transfer. The coordination logic is correct; grip/release timing is the remaining challenge.
The most striking output wasn't the robot motion — it was the reasoning. When Claude orchestrated multi-block sorting, it generated non-obvious optimization insights that a human engineer might not have considered in a first pass.
"Picking block_green second (not block_blue) because block_blue's cylindrical shape might roll and knock into block_green, toppling it. Removing the tall, unstable block_green first reduces risk of cascade failures."
"Shifting handoff to y=0.05 for the rectangular block. This gives Arm B a better approach angle — approaching from the y-axis aligns the gripper with the block's wider face (4cm) rather than the narrow face, creating a more stable grasp."
"For the cylinder, raising handoff height to z=0.57. Higher handoff reduces the chance of the cylinder slipping during transfer — gravity assists the receiving gripper's closure."
| Component | Status | Version | Key Insight |
|---|---|---|---|
| ROS2 Jazzy | working | Jazzy Jalisco | Use Jazzy for Ubuntu 24.04, not Humble |
| Gazebo + TurtleBot3 | working | Harmonic | Kill background gz processes between restarts |
| rosbridge | working | Port 9090 | Single point connecting MCP to ROS2 |
| ROS MCP Server | working | 3.0.1 | FastMCP + stdio transport to Claude Code |
| Claude Code | working | v2.1.81 | Native Linux installer, MCP client |
| MuJoCo (single arm) | working | 3.5.0 | launch_passive() for thread-safe viewer |
| MuJoCo (two-arm) | partial | 3.5.0 | Scene loads, reasoning works, IK imprecise |
| PyBullet (two-arm) | partial | Latest | Built-in IK works, handoff timing needs tuning |
| NVIDIA Isaac Sim | failed | 5.1.0 | Ubuntu 24.04 not supported, 8GB VRAM insufficient |
1. The protocol layer is the breakthrough. MCP between the LLM and the simulator means any AI can control any robot. The intelligence and the hardware are fully decoupled. Today it's Claude; tomorrow it's any model.
2. Python bridges beat C++ packages for AI-driven robotics. When mujoco_ros2_control failed to compile, a 90-line Python script achieved the same result. When MuJoCo's IK was imprecise, PyBullet's built-in IK worked in one function call. Speed of iteration > speed of execution.
3. The reasoning log is more valuable than the motion. Anand's AlphaFold analogy applies: the environment isn't the breakthrough — what the AI discovers inside it is. When Claude explains "I pick the cube first because it's the most stable base for stacking," that's the moment a demo becomes a product.
4. Ubuntu 22.04 remains the safe bet. Every compatibility issue traced back to being on 24.04. For production robotics work, stay on 22.04 until the ecosystem catches up.