Hi, great question.
A good sensor package should provide the most standard interface possible, to minimise overhead and regressions if the sensor is later replaced by another one (really important for product development).
I also think a simulation of the sensor in Gazebo (even a basic one) can really improve the user experience.
In addition, it should expose dynamic parameters for tuning. Sensor packages should also be tested and documented to work on the main embedded platforms (e.g. Jetson with JetPack, Raspberry Pi), including any manual steps specific to the platform.
For calibration, it depends on the sensor's complexity. For simple sensors, scripts and reference curves in PlotJuggler should be enough. For more complex ones (e.g. a stereo camera), the GUI provided by Stereolabs is a perfect example.
Thanks scastro. I did parse the YAML file rosdistro/humble/distribution.yaml (master branch of ros/rosdistro on GitHub), and yes, I did miss realsense-ros in the final filtering. Gonna update the list.
I think a "good" sensor package should include some sort of connection timeout handling!? But this is actually also a question I want to ask here.
TLDR: How do you ensure all sensors are indeed publishing data reliably and how do you handle the situation if a sensor stops working?
The problem I want to point to is not the ROS node connecting to other nodes, but rather the hardware connection (CAN, TCP, UART, SPI, ...) failing, or the sensor simply not sending data.
I see some "good" sensor packages handle a connection timeout during startup. For example, the realsense camera package will retry several times to connect to the camera, and if it still fails the node shuts down. However, all sensor packages known to me lack error handling for when the connection is lost at runtime or the sensor stops working. The behaviour I see is that the node keeps running and the topics still show up, but no data is published anymore.
The "bad" package doesn't even do the check at startup. The ROS node keeps running normally. In the best case an error message shows up in the terminal, but there is no further action; the node just waits indefinitely for the sensor's data to arrive.
So my question to the community is: How do you ensure all sensors are indeed publishing reliably and how do you handle the situation if a sensor stopped sending data?
My workaround is to create a node which subscribes to some topics and, if they time out, triggers failure actions such as restarting the driver or stopping the robot. But this feels like a quick and dirty fix. I tried rosmon, but it only shows whether the ROS node is running or not; it cannot check whether the node actually publishes messages. Could I have missed something? Should the sensor package deal with this issue, or an external monitoring node similar to rosmon?
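As a rough sketch of the workaround described above (not taken from any existing package), the timeout bookkeeping can be kept separate from ROS so it is easy to test. The class name `TopicWatchdog`, its methods, and the topic names are all hypothetical. In a real node, each subscription callback would call `notify()` and a timer would call `stale_topics()` and trigger whatever failure action is appropriate.

```python
import time
from typing import Callable, Dict, List


class TopicWatchdog:
    """Tracks when each topic last delivered a message and reports stale ones."""

    def __init__(self, timeouts: Dict[str, float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeouts = dict(timeouts)  # topic name -> allowed silence in seconds
        self._clock = clock
        start = clock()
        # Treat start-up as the "last message" so that a sensor which never
        # publishes is still flagged once its timeout has elapsed.
        self._last_seen = {topic: start for topic in self._timeouts}

    def notify(self, topic: str) -> None:
        """Call from the subscription callback of `topic`."""
        self._last_seen[topic] = self._clock()

    def stale_topics(self) -> List[str]:
        """Call periodically; returns the monitored topics that have timed out."""
        now = self._clock()
        return sorted(topic for topic, limit in self._timeouts.items()
                      if now - self._last_seen[topic] > limit)


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
watchdog = TopicWatchdog({"/camera/image_raw": 0.5, "/imu/data": 0.1},
                         clock=lambda: fake_now[0])
fake_now[0] = 0.3
watchdog.notify("/camera/image_raw")  # the camera delivered a frame at t=0.3
print(watchdog.stale_topics())        # ['/imu/data']
```

Whether this logic lives in the driver or in an external monitor is the open question of the post; the sketch works either way because it only needs "a message arrived" events.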
On the flip side, I was doing some sensor evaluation a few months back. Some of the cameras I was looking at had ROS2 packages written, but they didn't seem to be publishing them. It isn't a big thing but it did factor into the evaluation.
I really like this discussion! Hopefully my experiences can add something to it.
What makes for a "well-written" sensor package?
Above all else, documentation is key. For us at Nauticus, many of the sensors we deal with are used exclusively in the subsea industry and don't come with their own ROS packages, so we end up writing them ourselves. A few things I like to see (and yes, some just apply to ROS packages in general):
- Documentation - The entire user facing API should be documented. Every setting, connection/port, service/action/topic, methods of debugging/logging, any vendor provided software, recommended settings in various situations, whether a setting can be changed at runtime.
- Parameter limits - This kind of extends from Documentation above but every ROS parameter should make full use of the `ParameterDescriptor`, filling in the `description` and floating point or integer ranges where applicable. It pains me to see ROS nodes where an integer value is being checked in the parameter callback when it could have easily just been defined in the description. The reason this is important is the parameters shouldn't allow me to set values that are invalid for the sensor (we have many very expensive sensors that could be permanently damaged if we didn't have these checks in place). Also making use of `read_only` for things that shouldn't change at runtime, or providing checks to ensure that the user cannot change something while a sensor is not in a state where that setting can be changed. Also, depending on the use case, if the sensor itself provides some setting and that is not exposed via a parameter, that can be annoying.
- Launch setup / Standalone mode - Many sensors we use are working together, but then we also need to be able to test them independently. I believe all drivers should be composable but also provide a launch file and main to allow running on their own.
- Time source - Time source should be well defined. Whether the sensor itself is providing a timestamp because it's wired to a time source, has its own internal clock, or none at all, this can make or break how the sensor data gets used. On sensors which use PTP or whatever, the "receipt" time can absolutely be different from the "capture" time and make a big difference on how you process that data.
- C++ - I know this is a sore spot with a lot of folks but just personal preference. Most sensors are fairly performance oriented and I like seeing them being written in (or at least available in) C++. With a sensor or two it's probably not that big of a deal but once you get into 9-10 sensors all operating at the same time on robot grade hardware it becomes more important.
- Common interfaces - This is more of a recommendation for robots with many disparate sensors but having common methods of operation has helped us greatly. In other words, if sensors can have their data started/stopped/recorded in some way, make all of them use similar ROS interfaces (services/actions/topics). This way when it comes to creating user interfaces and/or autonomy it makes things way easier.
- Robust to power/errors - This should have been a close second after Documentation. A sensor driver/package should be runnable whether the sensor exists or not, and should look for the sensor until it is connected or turned on. And likewise if it's already running and the plug is pulled or power turned off, the driver should gracefully disconnect and continue looking for the sensor to return. This leads me to another one...
- Log spam - This one drives me up the wall. It's easy to log "sensor disconnected!" or "error!" to the console but for just a little bit of extra logic you can print "sensor disconnected" once and then silently wait for the reconnect. Or if there is some small error that occurs on every frame of data coming at 20 Hz, even with a throttled log you'll still end up with a giant log in the end with tons of repeated (useless) data.
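To make the last two items concrete, here is a minimal sketch of a driver loop that keeps looking for the sensor and logs only on state changes. It is an illustration under assumptions, not code from any real driver: `ReconnectingSensor`, `connect` and `read` are hypothetical names, and the sketch assumes both callables raise `OSError` when the device is unavailable.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("sensor_driver")


class ReconnectingSensor:
    """Polls a sensor, reconnects after failures, and logs only state changes."""

    def __init__(self, connect: Callable[[], object],
                 read: Callable[[object], bytes]) -> None:
        self._connect = connect  # hypothetical: opens the device, raises OSError on failure
        self._read = read        # hypothetical: reads one sample, raises OSError on failure
        self._handle: Optional[object] = None
        self._up: Optional[bool] = None  # None until the first poll decides the state

    def poll(self) -> Optional[bytes]:
        """Call at a fixed rate. Returns a sample, or None while the sensor is away."""
        try:
            if self._handle is None:
                self._handle = self._connect()
            sample = self._read(self._handle)
        except OSError as exc:
            self._handle = None         # drop the handle so the next poll reconnects
            if self._up is not False:   # report the outage once, then stay quiet
                log.warning("sensor disconnected: %s", exc)
                self._up = False
            return None
        if self._up is not True:        # report recovery once as well
            log.info("sensor connected")
            self._up = True
        return sample
```

The driver process can start before the sensor is powered, and pulling the plug produces exactly one warning no matter how long the outage lasts. A backoff between reconnect attempts would be a natural addition but is left out to keep the sketch short.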
Sensor calibration and cross-calibration is still a bit of a "black-art" in ROS.
IMHO this is something the vendors should provide as a well documented procedure. Also, unless it's required to be done every time you run the sensor, I'd hesitate for it to be included as part of a ROS driver. Calibrating a sensor frame to the as-built robot is another ballgame altogether and I'd love a turnkey solution but... "black-art" indeed.
What steps do you take to integrate simulated sensors into your robot model in Gazebo? What would you like to see sensor vendors provide?
While this would be amazing, for the sensors we use, like specialty sonars and echosounders, we rarely even expect to have ROS drivers, much less simulations. That said, we get by with some of the existing Gazebo sensors. They're really only good for testing usage of a particular sensor though, and often don't provide realistic enough data to give us a true analog. It's kind of the problem of the "perfect point cloud" you get from simulated sensors, whereas the ones in the real world can be pretty noisy or unreliable. I would love it if the detailed aspects of a particular sensor could be explored in sim (e.g. the effects of water turbidity on a specific camera sensor), but I think that's more than most people would want to provide.
How important is it for sensor vendors to provide support for Tier 2 and Tier 3 supported operating systems?
Not terribly important for us as we just roll the ol' Ubuntu.
ROS 1 supported nodelets which were included in launch file of the sensors and allowed multiple nodes to share the same process. How are you performing similar interprocess communication in ROS 2? What would your recommendations be for similar features in ROS 2?
See above comment about composable nodes.
In ROS 1, the `SubscriberStatusCallback` was a key element in the image pipeline to coordinate the sequence of nodes from layering to rectification. ROS 2 lacks this callback, requiring frequent checks to find active subscribers. How are you achieving similar results in ROS 2?
I may just be a big dunce or not understanding, but isn't the underlying DDS not actually transporting anything if there are no subscribers? Or is that wrong? We have implemented various "states" within most of our sensors to try and not do heavy calculations on data that's not being used, but those tend to be tied back to more of a higher level robot state. E.g. if you're not in some mode where you need your stereo point clouds, then don't bother using processing power to generate them. IMHO this decision shouldn't be up to the ROS driver itself. It should just be told, "hey you, generate some point clouds and publish them", and something higher level can understand the needs. This goes back to the Common interfaces item above.
How are you using lifecycle nodes in ROS 2 to monitor the sensor states?
We haven't bitten on the lifecycle nodes yet as the implementation seemed flaky at the time (maybe we need to take a second look, it's been a while), but see other comments throughout this post about using composable nodes as well as making the driver sensitive to power/connection/errors.
Robustness and reliability. Our developers run into stability problems with various drivers, and we need to invest time and effort to make them stable. Jitter in received sensor data and message drops over time both matter to a robot's uptime.
We use reliability metrics to prove out the sensors we use, and have released corresponding ROSBag data validation tools.
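For illustration only, here is one way jitter and drops could be estimated from the message timestamps of a recorded topic. This is a generic sketch and not the validation tools mentioned above; the function name `stream_metrics`, the `gap_factor` threshold, and the metric names are assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def stream_metrics(stamps: Sequence[float], expected_period: float,
                   gap_factor: float = 1.5) -> Dict[str, float]:
    """Summarise the timing quality of one topic from its timestamps (seconds)."""
    if len(stamps) < 2:
        raise ValueError("need at least two timestamps")
    # Inter-arrival times between consecutive messages.
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    # A gap well above the nominal period suggests missing messages; the number
    # of messages that should have fitted into the gap gives a rough drop count.
    dropped = sum(round(gap / expected_period) - 1
                  for gap in gaps if gap > gap_factor * expected_period)
    return {
        "mean_period": mean(gaps),
        "jitter_std": pstdev(gaps),  # spread of the inter-arrival times
        "max_gap": max(gaps),
        "estimated_drops": dropped,
    }
```

For a 10 Hz topic this would be called as `stream_metrics(stamps, expected_period=0.1)`. Using the sensor's capture timestamps rather than receipt times, where available, separates sensor-side drops from transport delays.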
Thanks
yep, thanks for the recommendation