I’m currently developing a ROS2 driver for an event-based camera. The sensor publishes data at a very high rate (1 kHz) in fairly large messages (roughly 800 kB).
Wrote the driver in ROS1: works like a charm.
Wrote it for ROS2 (Galactic on Ubuntu): terrible performance.
This is not my first bout with ROS2. I’ve gotten a bloody nose when trying to write ROS2 drivers for other sensors as well. RMW performance was terrible (back then under Eloquent) and I couldn’t figure out what was going on.
Before completely throwing in the towel on ROS2, I decided to file a proper bug report against rmw_cyclonedds.
I have also created a very small repo that demonstrates the issues I’m experiencing (the code has more details, it’s less than 100 lines).
So I’m trying to send custom messages (not unusual for a fairly exotic sensor) that contain a variable-length array of a custom type (Event.msg), which looks like this:
uint16 x
uint16 y
builtin_interfaces/Time ts
bool polarity
Each Event takes (ballpark) 2 + 2 + 8 + 1 = 13 bytes, or roughly 16 bytes once padding is counted. The sensor delivers about 50 million events per second, so if we bunch them up in packets of 50,000 (ideally we’d be using even smaller ones), that means sending 800 kB messages at 1000 Hz.
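For anyone who wants to check the arithmetic, here is a small sketch of the size and bandwidth estimate. It assumes that builtin_interfaces/Time is an int32 seconds field plus a uint32 nanoseconds field, that padding rounds each event up to a multiple of 4 bytes, and it ignores message headers and the array length prefix. The function names are made up for illustration.

```python
import struct

# Packed layout of one Event (assumption): uint16 x, uint16 y,
# int32 sec + uint32 nanosec for the timestamp, and a 1-byte bool.
PACKED_EVENT_FORMAT = "<HHiI?"


def packed_event_size() -> int:
    """Size of one event with no padding between fields."""
    return struct.calcsize(PACKED_EVENT_FORMAT)


def padded_event_size(alignment: int = 4) -> int:
    """Size of one event rounded up to the next multiple of `alignment`."""
    size = packed_event_size()
    return -(-size // alignment) * alignment


def message_bytes(events_per_message: int, bytes_per_event: int) -> int:
    """Payload size of one message, ignoring headers and length prefixes."""
    return events_per_message * bytes_per_event


def bandwidth_mb_per_s(message_size_bytes: int, rate_hz: float) -> float:
    """Bandwidth in MB/s (1 MB = 1e6 bytes) at the given message rate."""
    return message_size_bytes * rate_hz / 1e6


if __name__ == "__main__":
    events_per_second = 50_000_000
    events_per_message = 50_000
    rate_hz = events_per_second / events_per_message  # messages per second

    for label, event_size in (("packed", packed_event_size()),
                              ("padded", padded_event_size())):
        size = message_bytes(events_per_message, event_size)
        print(label, event_size, "bytes/event,", size, "bytes/message,",
              bandwidth_mb_per_s(size, rate_hz), "MB/s at", rate_hz, "Hz")
```

Under these assumptions the packed figure comes out at 650 kB per message and the padded one at 800 kB per message, which lines up with the two message sizes quoted in the measurements below.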
ROS1
- publication rate: 1000Hz. No problem: can publish messages with 50,000 events (0.650MB) at 1000Hz
- rostopic bw shows 650MB/s bandwidth
- rostopic hz shows 1000Hz
In fact, I can send up to 100,000 events per message (the next gen sensor will require 150k per message!), hit about 1.3GB/s in bandwidth, and ROS1 is still ok for both publisher and receiver.
ROS2 (with Cyclone DDS):
- publication rate: 31 Hz! Instead of 1000Hz. The call to publish() takes forever to return. This is without even any subscriber for the topic! Why so slow? Beats me.
- bandwidth as monitored with ros2 topic bw: 27MB/s, message size 0.8MB.
- ros2 topic hz: reports 3Hz. The same thing on ROS1 (rostopic hz) would report 1000Hz, just for comparison.
ROS2 (with fastrtps DDS):
- publication rate: 950Hz, instead of ROS1’s 1000Hz. The node is running at 95% CPU, so apparently just calling publish() without any subscribers attached involves significant overhead.
- bandwidth: 550 MB/s, but the moment ros2 topic bw is run, the publication rate drops to 650Hz
- ros2 topic hz: reports 3Hz, just like cyclonedds.
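To separate "publish() is slow to return" from "the tools report odd numbers", the publisher can time its own calls. Below is a minimal sketch of such a measurement loop. It is plain Python with a stand-in publish function, not rclpy or the code from my repo; `measure_publish_rate` and `fake_publish` are hypothetical names, and a real test would pass the publisher's publish method instead.

```python
import time


def measure_publish_rate(publish, payload, count=1000):
    """Call publish(payload) `count` times back to back.

    Returns (achieved rate in Hz, slowest single call in seconds).
    """
    slowest = 0.0
    start = time.perf_counter()
    for _ in range(count):
        t0 = time.perf_counter()
        publish(payload)
        slowest = max(slowest, time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    rate_hz = count / elapsed if elapsed > 0 else float("inf")
    return rate_hz, slowest


def fake_publish(payload):
    """Stand-in for a real publish call; it only touches the payload."""
    return len(payload)


if __name__ == "__main__":
    payload = bytes(800_000)  # same order of magnitude as one event message
    rate_hz, slowest = measure_publish_rate(fake_publish, payload)
    print(f"achieved rate: {rate_hz:.0f} Hz, slowest call: {slowest * 1e3:.3f} ms")
```

If the achieved rate of such a loop is far below the target rate even with no subscriber, the time is being spent inside the publish call itself rather than in the transport to a receiver.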
When I ran into similar issues a little more than a year ago I shrugged them off as “well, ROS2 may just not be quite ready for prime time, it’ll be sorted out”. Maybe I was just doing something wrong? But now it’s a year later and still I’m seeing absolutely wonky behavior when using some very basic ROS2 functionality. Apparently I’m not the only one.
For me this is a real show stopper for ROS2 adoption. I’m not saying this because I don’t like ROS, quite the opposite: the success of ROS hinges on that of ROS2, and for ROS2 to succeed, such blatant performance issues need to be sorted out urgently. Preferably in a way that the developer/user does not have to become an RMW expert and tune a dozen parameters to get it to perform. ROS1 is exemplary when it comes to this (see above).
Lastly, I’m somewhat surprised that the complaints about rmw performance weirdness are not more massive. Am I among the few who just don’t get how to write ROS2 code? Or is everybody else just too nice to bring this up?