Design discussion: “discoverable not serviceable” ROS interfaces for lifecycle nodes

Lotusymt · March 2, 2026, 5:22pm

Hi all,

We’d like to start a design discussion about a lifecycle-related gap that shows up in Nav2 but seems fundamentally ROS 2 / rclcpp-level.

Context / links

Nav2 PR where this surfaced: nav2_ros_common: add lifecycle-managed Subscription wrapper by Lotusymt · Pull Request #5834 · ros-navigation/navigation2 · GitHub
Related Nav2 issue: nav2_ros_common interfaces for Services, Subscription, and Action to be lifecycle enabled · Issue #5298 · ros-navigation/navigation2 · GitHub

What we want (high-level)

For lifecycle nodes, it’s very useful to have all interfaces created and visible while the node is Inactive so tooling and other nodes can introspect what exists (topics/services/actions/etc.). But while Inactive, those interfaces should be not serviceable — i.e., they should not actually “do work” or execute callbacks until the node transitions to Active.

This is straightforward for some interface types:

Publishers / clients can exist early and simply refuse to do work when asked (check a flag → return / no-op / error).
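
As a minimal sketch of the "check a flag → no-op" pattern, here is a standard-library-only toy publisher. `GatedPublisher` and its `Sink` type are hypothetical names used for illustration and are not rclcpp types; `rclcpp_lifecycle::LifecyclePublisher` follows a similar idea for real publishers.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Toy stand-in for a lifecycle-gated publisher (hypothetical, not rclcpp).
class GatedPublisher {
public:
  using Sink = std::function<void(const std::string &)>;

  explicit GatedPublisher(Sink sink) : sink_(std::move(sink)) {
    assert(sink_);  // a callable sink is required
  }

  void activate() { active_.store(true); }
  void deactivate() { active_.store(false); }

  // Returns true if the message reached the sink, false if it was
  // refused because the publisher is not active.
  bool publish(const std::string & msg) {
    if (!active_.load()) {
      return false;  // no-op while inactive
    }
    sink_(msg);
    return true;
  }

private:
  Sink sink_;
  std::atomic<bool> active_{false};
};
```

The gate works here because the caller initiates the work, so the check can sit on the caller's own code path.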

But it’s harder for:

Subscriptions and services, because once they’re created they can become serviceable immediately (executor dispatch / callbacks), unless there is a mechanism to say “no” before dispatch and (importantly) before consuming from the middleware queue.

The core gap

We want an interface that is:

Discoverable (exists on the graph / matched / visible to introspection) in Inactive
Not serviceable (no callbacks executed / no work performed) until the node is Active

The tricky part is that “not serviceable” should ideally mean more than “drop in the callback”:

If a subscription still takes samples from the RMW queue while Inactive (even if we drop them afterward), we may permanently lose data that is not re-sent later (esp. for Transient Local / latched-like topics).
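
To make the difference concrete, the following toy model compares "take, then drop in the callback" with "do not take while Inactive". It uses only the standard library; `ReaderQueue`, `Strategy`, `spin_once`, and `run_scenario` are hypothetical names, and the queue is a simplification rather than real RMW or DDS behavior.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy model of a reader-side queue (hypothetical, not an RMW API).
struct ReaderQueue {
  std::deque<std::string> samples;

  bool has_data() const { return !samples.empty(); }

  // Taking removes the sample from the queue for good.
  std::string take() {
    assert(has_data());
    std::string s = samples.front();
    samples.pop_front();
    return s;
  }
};

enum class Strategy { TakeAndDrop, GateBeforeTake };

// One executor-style pass; returns what the user callback would have seen.
std::vector<std::string> spin_once(ReaderQueue & q, bool active, Strategy strategy) {
  std::vector<std::string> delivered;
  while (q.has_data()) {
    if (!active && strategy == Strategy::GateBeforeTake) {
      break;  // leave the samples queued for later
    }
    std::string s = q.take();
    if (active) {
      delivered.push_back(s);
    }
    // Inactive with TakeAndDrop: the sample is consumed and discarded.
  }
  return delivered;
}

// A sent-once "map" sample arrives while Inactive, then the node activates.
std::vector<std::string> run_scenario(Strategy strategy) {
  ReaderQueue q;
  q.samples.push_back("map");
  spin_once(q, /*active=*/false, strategy);
  return spin_once(q, /*active=*/true, strategy);
}
```

In this model, `run_scenario(Strategy::TakeAndDrop)` delivers nothing after activation, while `run_scenario(Strategy::GateBeforeTake)` delivers the queued map sample.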

LifecycleEntity state model (concise proposal)

This is a rough state/semantics model that seems to match user expectations, plus feasibility notes.

Unconfigured (not configured yet)

Ideal semantics: not discoverable (should not appear in ros2 topic list / ros2 service list).
Reality today: hard to guarantee, because creating an rclcpp interface usually creates the underlying RMW entity (e.g., DDS DataReader/DataWriter), which becomes discoverable immediately.
Implication: “unconfigured == not discoverable” may require future rclcpp↔rmw interfaces/state control, so it’s likely a longer-term enhancement.

Inactive (discoverable but not serviceable) — the missing piece

Desired semantics: discoverable/matched on the graph, but no work is performed.
For subscriptions, “not serviceable” should ideally include not consuming from the RMW queue while inactive (so data is preserved per QoS and can be processed after activation).
Possible approaches:
1. rclcpp-level gating before take/dispatch: when data-ready triggers, skip take() and callback dispatch until Active.
2. rmw-level paused state: entity is discoverable but rmw does not deliver/allow take until Active.

Active (normal behavior)

Take from the RMW queue and dispatch callbacks as usual.
On activation, there may already be queued samples (QoS depth/history/transient-local), so activation may result in immediate processing of multiple callbacks.
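
A rough sketch of the activation burst, assuming a KEEP_LAST-style history of fixed depth. `GatedSubscription` is a hypothetical standard-library-only toy, not an rclcpp class, and the eviction rule is a simplification of real QoS behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy KEEP_LAST history with lifecycle gating (hypothetical, not rclcpp).
class GatedSubscription {
public:
  using Callback = std::function<void(int)>;

  GatedSubscription(std::size_t depth, Callback cb)
  : depth_(depth), cb_(std::move(cb)) {
    assert(depth_ > 0);
    assert(cb_);
  }

  // Middleware side: a sample arrives; the oldest is evicted when full.
  void on_sample(int value) {
    if (queue_.size() == depth_) {
      queue_.pop_front();
    }
    queue_.push_back(value);
    if (active_) {
      drain();
    }
  }

  // Activation immediately processes whatever the history retained.
  void activate() {
    active_ = true;
    drain();
  }

  void deactivate() { active_ = false; }

  std::size_t pending() const { return queue_.size(); }

private:
  void drain() {
    while (!queue_.empty()) {
      int v = queue_.front();
      queue_.pop_front();
      cb_(v);
    }
  }

  std::size_t depth_;
  Callback cb_;
  std::deque<int> queue_;
  bool active_ = false;
};
```

With depth 3 and five samples received while Inactive, activation in this model fires three callbacks back to back, for the three most recent samples. Callbacks therefore need to tolerate a burst at activation time.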

Looking for feedback / prior art

We have a preliminary plan around the approaches above (especially “don’t take while inactive” vs an rmw-level paused state), and would really appreciate suggestions or references to prior work.

In particular:

Do DDS implementations (or Zenoh / rmw_zenoh) already have an internal notion of managed states for endpoints (discoverable but paused / not serviced) that could be mapped cleanly onto ROS 2 lifecycle entities?
If not, where do folks think the cleanest abstraction should live (rclcpp vs rmw), and what pitfalls should we watch for?

Thanks in advance for any design suggestions or pointers to existing discussions/issues.

JM_ROS · March 2, 2026, 6:01pm

I get the feeling that what you really want here is a predefined model of your node.

E.g.: you want a guarantee that your node will only provide these services / topics and only consume this or that data. Armed with model data you can do a lot of nice things, like checking whether you have unconnected consumers or computing the startup order of your nodes.
A while back someone posted a model-to-node generation approach; I think it was someone from Fraunhofer (but my memory is hazy there…)

On a more practical level: we ran into the same problem and solved it for our own lifecycle system like this:
Every publisher, subscriber, etc. is instantiated in on configure. We use special subclassed entities (publishers, subscribers and services).

If something in the business logic is publishing or calling a service while not active, an exception is thrown, as it is a bug.

All data received prior to being moved to active is discarded. Note if the connection is transient local, we buffer the last msg and dispatch directly after the move to active. Service calls are rejected if not active.

All entities are registered at the lifecycle and are automatically ‘armed’ and ‘disarmed’ during the transitions from inactive to active and back.

All of this is a high-level concept, and I don’t see it as related to the rmw layer at all. Also note that, depending on your application, you will most likely run into corner cases where you will need to break the rules and allow processing of some data in any state.
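
The scheme described in this post could look roughly like the sketch below. This is a standard-library-only toy (C++17) written from the description above, not the poster's actual code; `ManagedEntity`, `ManagedPublisher`, `ManagedSubscriber`, and `Lifecycle` are hypothetical names, and service rejection is left out for brevity.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model; none of these are rclcpp_lifecycle types.
struct ManagedEntity {
  virtual ~ManagedEntity() = default;
  virtual void arm() = 0;
  virtual void disarm() = 0;
};

class ManagedPublisher : public ManagedEntity {
public:
  void arm() override { armed_ = true; }
  void disarm() override { armed_ = false; }

  // Publishing while not active is treated as a bug in the business logic.
  void publish(const std::string & msg) {
    if (!armed_) {
      throw std::logic_error("publish() called while not active");
    }
    sent.push_back(msg);
  }

  std::vector<std::string> sent;  // stands in for the wire

private:
  bool armed_ = false;
};

class ManagedSubscriber : public ManagedEntity {
public:
  using Callback = std::function<void(const std::string &)>;

  ManagedSubscriber(bool transient_local, Callback cb)
  : transient_local_(transient_local), cb_(std::move(cb)) {
    assert(cb_);
  }

  void arm() override {
    armed_ = true;
    if (latched_) {  // dispatch the buffered latched message once
      cb_(*latched_);
      latched_.reset();
    }
  }
  void disarm() override { armed_ = false; }

  // Called by the (simulated) middleware when a message arrives.
  void on_message(const std::string & msg) {
    if (armed_) {
      cb_(msg);
    } else if (transient_local_) {
      latched_ = msg;  // keep only the last message
    }
    // Otherwise the message is discarded.
  }

private:
  bool transient_local_;
  Callback cb_;
  std::optional<std::string> latched_;
  bool armed_ = false;
};

// Entities register once and are armed/disarmed on transitions.
class Lifecycle {
public:
  void add(ManagedEntity & e) { entities_.push_back(&e); }
  void activate() { for (auto * e : entities_) { e->arm(); } }
  void deactivate() { for (auto * e : entities_) { e->disarm(); } }

private:
  std::vector<ManagedEntity *> entities_;
};
```

Note that this buffers on the application side, after the message has already been taken from the middleware, which is the main difference from the "do not take while Inactive" idea in the original post.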

peci1 · March 2, 2026, 9:35pm

Couldn’t this be technically solved by content-filtered topics? I’m not sure if it’s possible to update the content filter during runtime, though, and I’m also not sure how it behaves with latched topics…

smac · March 4, 2026, 5:29pm

Even if it could, this is a general-purpose problem that needs a solution IMO. I think what we’re asking for here is to have a version of create_subscription with a subscription option / bool to indicate that this should be discoverable but not yet accepting anything from the waitset or equivalent. I think the need for lifecycle-enabled subscriptions / services warrants this.

The reason Actions don’t need this is because Actions can simply reject a goal request and tell the client it was outright rejected. Services* and Subscriptions don’t have that ability, since they simply process data and can’t tell the client/publisher that they were inactive and to try again later, or tell ROS 2 to resend a transient local topic (e.g. a map) once ready. We end up dropping sent-once data as a result. Timers, I imagine, have the same issue.

Callback-triggering interfaces need (I think) some way of being created but kept inactive at the middleware level, so that nothing is processed, if we want lifecycle-driven interfaces that meet the Lifecycle/Managed Node Design intent of having the allocations happen in the Configure transition. That is the cleanest solution that I can see, and it creates an important capability with additional use cases. Otherwise, we would have to create all subscriptions/services in the Activate transition, which I don’t think anyone wants.

* (generically; if you add a ‘success’ bool in the response, sure, but then you’d have to add error codes galore to know if it’s a server error or an activation error, etc.)

JM_ROS · March 4, 2026, 7:21pm

I think you can implement what you want with a custom waitable.

Just don’t process the data from your inner entities as long as the node is inactive, and what you will get is:

Discoverable entities
All data after creation is buffered in the rmw layer

Is this the behavior you want to achieve? Or did I miss something?
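
A rough model of the custom-waitable idea, using only the standard library. `ToyWaitable`, `GatedWaitable`, and `spin_some` are hypothetical; the real `rclcpp::Waitable` interface is richer (wait-set registration, data taking, and so on), so treat this as an illustration of the gating logic only.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Simplified stand-in for the executor-facing waitable contract
// (hypothetical; rclcpp::Waitable has more methods than this).
struct ToyWaitable {
  virtual ~ToyWaitable() = default;
  virtual bool is_ready() const = 0;  // should the executor service this now?
  virtual void execute() = 0;         // take one item and run user code
};

class GatedWaitable : public ToyWaitable {
public:
  explicit GatedWaitable(std::function<void(int)> cb) : cb_(std::move(cb)) {
    assert(cb_);
  }

  void set_active(bool active) { active_ = active; }

  // Simulated middleware delivery into the inner entity's buffer.
  void push(int sample) { buffer_.push_back(sample); }
  std::size_t buffered() const { return buffer_.size(); }

  // While inactive, report "not ready" so nothing is taken or dispatched.
  bool is_ready() const override { return active_ && !buffer_.empty(); }

  void execute() override {
    assert(is_ready());
    int s = buffer_.front();
    buffer_.pop_front();
    cb_(s);
  }

private:
  std::function<void(int)> cb_;
  std::deque<int> buffer_;
  bool active_ = false;
};

// Toy executor pass: services the waitable until it reports not ready.
inline std::size_t spin_some(ToyWaitable & w) {
  std::size_t serviced = 0;
  while (w.is_ready()) {
    w.execute();
    ++serviced;
  }
  return serviced;
}
```

One thing a real implementation would likely need to check is how the wait set behaves when data is pending but deliberately left unserviced, for example whether it keeps waking up the executor.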
