Hi all,
We’d like to start a design discussion about a lifecycle-related gap that shows up in Nav2 but seems fundamentally ROS 2 / rclcpp-level.
Context / links
- Nav2 PR where this surfaced: "nav2_ros_common: add lifecycle-managed Subscription wrapper" by Lotusymt (Pull Request #5834, ros-navigation/navigation2)
- Related Nav2 issue: "nav2_ros_common interfaces for Services, Subscription, and Action to be lifecycle enabled" (Issue #5298, ros-navigation/navigation2)
What we want (high-level)
For lifecycle nodes, it’s very useful to have all interfaces created and visible while the node is Inactive, so that tooling and other nodes can introspect what exists (topics/services/actions/etc.). But while Inactive, those interfaces should not be serviceable: they should not actually “do work” or execute callbacks until the node transitions to Active.
This is straightforward for some interface types:
- Publishers / clients can exist early and simply refuse to do work when asked (check a flag → return / no-op / error).
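To make the "check a flag" case concrete, here is a minimal sketch in plain C++ with no ROS dependency. `GatedPublisher` and its mock transport are hypothetical names we made up for illustration, not an rclcpp API; as far as we know, `rclcpp_lifecycle::LifecyclePublisher` behaves along similar lines.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for a lifecycle-gated publisher; not an rclcpp API.
class GatedPublisher {
public:
  void on_activate() { active_.store(true); }
  void on_deactivate() { active_.store(false); }

  // Returns true if the message was handed to the (mock) transport.
  bool publish(const std::string & msg) {
    if (!active_.load()) {
      return false;  // Inactive: refuse to do work, nothing is sent.
    }
    sent_.push_back(msg);
    return true;
  }

  const std::vector<std::string> & sent() const { return sent_; }

private:
  std::atomic<bool> active_{false};
  std::vector<std::string> sent_;  // Mock transport: records what was sent.
};
```

The gate works here because the caller initiates the work, so the entity can simply decline. Nothing equivalent exists on the receive side, where the executor initiates the work.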
But it’s harder for:
- Subscriptions and services, because once they’re created they can become serviceable immediately (executor dispatch / callbacks), unless there is a mechanism to say “no” before dispatch and (importantly) before consuming from the middleware queue.
The core gap
We want an interface that is:
- Discoverable (exists on the graph / matched / visible to introspection) in Inactive
- Not serviceable (no callbacks executed / no work performed) until the node is Active
The tricky part is that “not serviceable” should ideally mean more than “drop in the callback”:
- If a subscription still takes samples from the RMW queue while Inactive (even if we drop them afterward), we may permanently lose data that is never re-sent (especially for Transient Local / latched-like topics).
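The difference between the two meanings of "not serviceable" can be shown with a toy model. This is only a sketch under simplifying assumptions: `MockSubscription`, its `rmw_queue` member, and `run_scenario` are hypothetical names, and the deque stands in for whatever the middleware actually holds.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model of a subscription's middleware-side queue; not an rclcpp or rmw API.
struct MockSubscription {
  std::deque<std::string> rmw_queue;   // Samples held by the (mock) middleware.
  std::vector<std::string> processed;  // Samples the user callback actually handled.
  bool active = false;

  // Strategy 1: always take, then drop inside the callback while inactive.
  void spin_once_drop_in_callback() {
    if (rmw_queue.empty()) { return; }
    std::string msg = rmw_queue.front();
    rmw_queue.pop_front();             // The sample is consumed regardless of state.
    if (active) { processed.push_back(msg); }
  }

  // Strategy 2: gate before take, so the sample stays queued while inactive.
  void spin_once_gate_before_take() {
    if (!active || rmw_queue.empty()) { return; }
    processed.push_back(rmw_queue.front());
    rmw_queue.pop_front();
  }
};

// Delivers one latched-style sample while inactive, spins, activates, spins again.
// Returns how many samples the callback processed after activation.
inline std::size_t run_scenario(bool gate_before_take) {
  MockSubscription sub;
  sub.rmw_queue.push_back("map");      // Sent once; never re-sent.
  auto spin = [&]() {
    if (gate_before_take) { sub.spin_once_gate_before_take(); }
    else { sub.spin_once_drop_in_callback(); }
  };
  spin();                              // Node is still Inactive here.
  sub.active = true;                   // Transition to Active.
  spin();
  return sub.processed.size();
}
```

In this model `run_scenario(false)` returns 0 because the only sample was consumed and dropped before activation, while `run_scenario(true)` returns 1 because the sample was still queued when the node became Active.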
LifecycleEntity state model (concise proposal)
This is a rough state/semantics model that seems to match user expectations, plus feasibility notes.
Unconfigured (not configured yet)
- Ideal semantics: not discoverable (should not appear in `ros2 topic list` / `ros2 service list`).
- Reality today: hard to guarantee, because creating an rclcpp interface usually creates the underlying RMW entity (e.g., DDS DataReader/DataWriter), which becomes discoverable immediately.
- Implication: “unconfigured == not discoverable” may require future rclcpp↔rmw interfaces/state control, so it’s likely a longer-term enhancement.
Inactive (discoverable but not serviceable) — the missing piece
- Desired semantics: discoverable/matched on the graph, but no work is performed.
- For subscriptions, “not serviceable” should ideally include not consuming from the RMW queue while inactive (so data is preserved per QoS and can be processed after activation).
- Possible approaches:
  - rclcpp-level gating before take/dispatch: when data-ready triggers, skip `take()` and callback dispatch until Active.
  - rmw-level paused state: entity is discoverable but rmw does not deliver/allow take until Active.
Active (normal behavior)
- Take from the RMW queue and dispatch callbacks as usual.
- On activation, there may already be queued samples (QoS depth/history/transient-local), so activation may result in immediate processing of multiple callbacks.
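The activation burst described above can be sketched with a bounded queue that mimics KEEP_LAST history. This is a toy model under stated assumptions: `GatedKeepLastQueue`, `on_sample`, and `spin_some` are hypothetical names, and real middleware queueing and executor behavior are considerably more involved.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical KEEP_LAST queue with a lifecycle gate in front of take; not an rclcpp or rmw API.
class GatedKeepLastQueue {
public:
  GatedKeepLastQueue(std::size_t depth, std::function<void(int)> callback)
  : depth_(depth), callback_(std::move(callback)) {}

  // Middleware side: a new sample arrives; the oldest is evicted once depth is exceeded.
  void on_sample(int value) {
    queue_.push_back(value);
    if (queue_.size() > depth_) { queue_.pop_front(); }
  }

  void activate() { active_ = true; }
  void deactivate() { active_ = false; }

  // Executor side: take and dispatch everything that is ready, but only while Active.
  // Returns the number of callbacks dispatched.
  std::size_t spin_some() {
    std::size_t dispatched = 0;
    while (active_ && !queue_.empty()) {
      int value = queue_.front();
      queue_.pop_front();
      callback_(value);
      ++dispatched;
    }
    return dispatched;
  }

private:
  std::size_t depth_;
  std::function<void(int)> callback_;
  std::deque<int> queue_;
  bool active_ = false;
};
```

With a depth of 3 and samples 1 through 5 arriving while Inactive, `spin_some()` dispatches nothing; after `activate()`, the next `spin_some()` dispatches three callbacks in a row with the values 3, 4, 5. Callbacks therefore need to tolerate a burst of possibly stale samples right after activation.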
Looking for feedback / prior art
We have a preliminary plan around the approaches above (especially “don’t take while inactive” vs an rmw-level paused state), and would really appreciate suggestions or references to prior work.
In particular:
- Do DDS implementations (or Zenoh / rmw_zenoh) already have an internal notion of managed states for endpoints (discoverable but paused / not serviced) that could be mapped cleanly onto ROS 2 lifecycle entities?
- If not, where do folks think the cleanest abstraction should live (rclcpp vs rmw), and what pitfalls should we watch for?
Thanks in advance for any design suggestions or pointers to existing discussions/issues.