Hi everyone,
Like many of us, I’ve been experimenting with giving LLMs control over robot hardware. However, I quickly ran into the classic problems: LLMs hallucinate actions, assume prerequisites that haven’t been met (e.g., trying to drive a humanoid before stabilizing it), and most existing integrations are just tightly coupled, hardcoded scripts.
To solve this, I built ros2_lingua — an open-source bridge that introduces a structured capability contract between ROS 2 nodes and LLMs.
Instead of letting the LLM guess what topics or actions to call, ros2_lingua forces the LLM to output a plan based only on explicitly registered capabilities, and uses a backward-chaining planner to automatically inject missing prerequisite steps.
How it works:
- Capability Advertisement: Any ROS 2 node can inherit from LinguaMixin to self-advertise its capabilities at boot. It defines its name, ROS action/service, parameters, preconditions, and postconditions.
- Backward-Chaining Planner: When a user gives a natural language instruction (e.g., “go to the table and pick up the bottle”), the Grounding Engine checks the robot’s current state against the capability schema. If the robot isn’t balanced, the planner automatically injects a stabilize_robot capability before the navigation step.
- Safe Dispatch: The DispatcherNode safely executes the validated plan over standard ROS 2 actions and services.
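To make the prerequisite-injection idea concrete, here is a minimal pure-Python sketch of backward chaining over preconditions and postconditions. It is not the actual ros2_lingua code: the `Capability` fields, the `expand_plan` function, and the `navigate_to` / `pick_object` names and fact strings are assumptions for illustration (only `stabilize_robot` comes from the description above). It also simplifies heavily: postconditions only add facts, and there is no backtracking between alternative providers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability record; field names are illustrative."""
    name: str
    preconditions: frozenset = field(default_factory=frozenset)
    postconditions: frozenset = field(default_factory=frozenset)


def expand_plan(plan, registry, state):
    """Return the plan with missing prerequisite steps injected.

    plan: capability names proposed by the LLM
    registry: dict mapping name -> Capability
    state: set of facts currently true for the robot
    """
    facts = set(state)
    result = []

    def achieve(fact, visiting):
        # Backward chaining: look for a capability whose postconditions
        # supply the missing fact, skipping ones already being expanded.
        for cap in registry.values():
            if fact in cap.postconditions and cap.name not in visiting:
                run(cap, visiting | {cap.name})
                return
        raise ValueError(f"no registered capability provides {fact!r}")

    def run(cap, visiting):
        # Satisfy every unmet precondition before scheduling the step.
        for fact in sorted(cap.preconditions):
            if fact not in facts:
                achieve(fact, visiting)
        result.append(cap.name)
        facts.update(cap.postconditions)

    for name in plan:
        if name not in registry:
            raise ValueError(f"unregistered capability: {name!r}")
        run(registry[name], frozenset({name}))
    return result


# Assumed example registry, keyed by capability name.
registry = {
    c.name: c
    for c in [
        Capability("stabilize_robot", frozenset(), frozenset({"balanced"})),
        Capability("navigate_to", frozenset({"balanced"}), frozenset({"at_target"})),
        Capability("pick_object", frozenset({"at_target"}), frozenset({"holding"})),
    ]
}

print(expand_plan(["navigate_to", "pick_object"], registry, state=set()))
# → ['stabilize_robot', 'navigate_to', 'pick_object']
```

If the starting state already contains `"balanced"`, the same call returns the plan unchanged, which is the behavior described for the humanoid example.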
Decoupled Architecture
One of my main goals was to ensure the core logic was highly testable. The project is split into two layers:
- ros2_lingua_core: A pure Python library containing the schema, registry, planner, and LLM backends (Ollama, OpenAI, Anthropic). It has zero ROS 2 dependencies, meaning the grounding engine can be unit-tested purely in Python.
- ros2_lingua: The ROS 2 interface layer containing the GroundingNode, DispatcherNode, and mixins.
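As a rough illustration of why the ROS-free core matters for testing, the sketch below grounds an instruction against a fake LLM backend using only the standard library. The `LLMBackend` protocol, `FakeBackend`, `ground`, and the capability names are hypothetical and are not the real ros2_lingua_core API; the point is only that a canned backend reply is enough to test plan validation without a robot, a ROS 2 install, or a live model.

```python
import json
from typing import Protocol


class LLMBackend(Protocol):
    """Hypothetical backend interface: prompt in, raw text out."""
    def complete(self, prompt: str) -> str: ...


class FakeBackend:
    """Test double that returns a canned reply instead of calling a model."""
    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str) -> str:
        return self.reply


def ground(instruction: str, backend: LLMBackend, registered: set) -> list:
    """Ask the backend for a JSON list of step names and reject unknown ones."""
    prompt = (
        f"Instruction: {instruction}\n"
        f"Allowed capabilities: {sorted(registered)}\n"
        "Reply with a JSON list of capability names."
    )
    steps = json.loads(backend.complete(prompt))
    if not isinstance(steps, list):
        raise ValueError("expected a JSON list of capability names")
    # Anything that is not a registered name counts as a hallucinated step.
    unknown = [s for s in steps if not isinstance(s, str) or s not in registered]
    if unknown:
        raise ValueError(f"unregistered capabilities in plan: {unknown}")
    return steps


def test_rejects_hallucinated_step():
    backend = FakeBackend('["navigate_to", "fly_to_moon"]')
    try:
        ground("go to the table", backend, {"navigate_to", "pick_object"})
    except ValueError as exc:
        assert "fly_to_moon" in str(exc)
    else:
        raise AssertionError("hallucinated step was accepted")


def test_accepts_registered_steps():
    backend = FakeBackend('["navigate_to", "pick_object"]')
    plan = ground("fetch the bottle", backend, {"navigate_to", "pick_object"})
    assert plan == ["navigate_to", "pick_object"]


# Run directly here; under pytest the two test functions are discovered by name.
test_rejects_hallucinated_step()
test_accepts_registered_steps()
```

Swapping `FakeBackend` for a real Ollama, OpenAI, or Anthropic client would only require an object with the same `complete` method, under the assumption that the backends share one interface.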
Links & Demo
You can see a demo of the engine running with a local Ollama model and a mock humanoid setup, along with the full architecture documentation here:
Documentation & Architecture: ros2_lingua Documentation
GitHub Repository: purahan/ros2_lingua (Natural language to ROS2 actions: a structured LLM grounding engine for any robot)
What’s Next & Feedback Request
The project is currently a working prototype in Python. My immediate roadmap includes taking this to a release-ready state and building a C++ bridge so native controller nodes can easily advertise their capabilities.
Since this is early development, I would love to get feedback from the community on the architecture, specifically on the schema design for the capability registry and on how best to handle preemption of complex, long-running actions within the Dispatcher.
Thanks for your time, and I’d love to hear your thoughts!