Hi ROS community,
I’m working on ClawsJoy Robotics, a proposal for a standard interface between LLM-based cognitive systems and ROS/2 robots. The core idea: any LLM outputs an action, any robot implements these standard topics.
The Problem
Every robotics company maps “pick up the cup” to different ROS/2 interfaces:
-
Company A: custom service
/manipulation/pick -
Company B: action server
/grasp_object -
Company C: proprietary SDK
There’s no standard way for an LLM agent to tell a robot what to do.
The Proposal: 12 Standard Motion Primitives
navigate_to → /base/cmd_vel (geometry_msgs/Twist)
move_arm → /arm/joint_trajectory (trajectory_msgs/JointTrajectory)
move_base → /base/cmd_vel (geometry_msgs/Twist)
grasp → /gripper/command (std_msgs/Float32, 0=close 1=open)
release → /gripper/command (std_msgs/Float32)
push → /base/cmd_vel (geometry_msgs/Twist)
look_at → camera PTZ control
detect_objects → /perception/objects (vision_msgs/Detection3DArray)
get_pose → /base/odom + /arm/joint_states
speak → /speech/tts (std_msgs/String)
play_gesture → /arm/joint_trajectory (pre-defined trajectories)
wait → no topic (sleep)
emergency_stop → /safety/estop (std_msgs/Bool)
Demo Output
Natural language → task execution (mock mode):
[Result]:
Task completed in 6 steps
▸ look_at → table
▸ detect_objects → cup detected
▸ navigate_to → table
▸ move_arm → grasp position
▸ grasp → cup_01
▸ speak → “已拿到水杯” (cup fetched)
[Result]:
Task completed in 6 steps
▸ navigate_to → waypoint_1
▸ look_at → front
▸ speak → “waypoint_1 clear”
▸ navigate_to → waypoint_2
▸ look_at → left
▸ speak → “waypoint_2 clear”
Integration Checklist
To integrate a robot with ClawsJoy Robotics:
-
Arm: publish
/arm/joint_states, subscribe/arm/joint_trajectory -
Gripper: subscribe
/gripper/command(Float32, 0=close 1=open) -
Base: subscribe
/base/cmd_vel, publish/base/odom -
Camera: publish
/camera/rgb,/camera/depth -
Perception: publish
/perception/objects -
Speech: subscribe
/speech/tts -
Safety: subscribe
/safety/estop
Then set mock_mode=False and run.
Code
https://github.com/your-org/clawsjoy_robotics
12 primitives defined in motion_primitives.yaml. Full topic mapping in bridge/ros2_bridge.py. Task planner in planner/planner.py.
Questions for the community
-
Are these 12 primitives sufficient, or should we add/remove some?
-
Is
trajectory_msgs/JointTrajectorythe right interface formove_arm, or should we use ROS/2 Actions? -
Should we adopt the existing
ros2_controlinterfaces instead of defining new topics? -
Is this worth standardizing, or is every robot too different?
Looking forward to feedback.