What's the most annoying part of your ROS 2 workflow?

Been spending more time in ROS 2 lately and getting a feel for the daily grind.

For folks working on real robots, what part of going from code change to seeing it run drives you the most nuts? And have you found anything that actually makes it better?

When testing changes on the robot, I used to rsync my workspace over. With a robot running on an ARM platform (a Jetson, for example), that no longer works because the ISA is different, so the change-to-test loop is much longer. I’d like to have a solution for that.

What is also annoying is bug reports that you can’t reproduce. Luckily our customers have gotten quite good at reporting and provide rosbags with timestamps, so finding the issue is much quicker.
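
A minimal sketch of how a reported timestamp can be turned into a replay command, assuming the bag's start time is already known (for example from ros2 bag info) and that the installed ros2 bag play supports --start-offset. The function name, the 5 second lead-in, and the bag name are made up for illustration.

```python
from datetime import datetime, timezone


def playback_command(bag_path, bag_start, reported, lead_in_s=5.0):
    """Build a `ros2 bag play` command that starts shortly before `reported`.

    bag_start and reported are timezone-aware datetimes; lead_in_s is a
    hypothetical safety margin so the replay begins a bit before the bug.
    """
    offset = (reported - bag_start).total_seconds() - lead_in_s
    offset = max(offset, 0.0)  # never seek before the start of the bag
    return ["ros2", "bag", "play", bag_path, "--start-offset", f"{offset:.1f}"]


if __name__ == "__main__":
    start = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
    seen = datetime(2024, 5, 1, 12, 3, 20, tzinfo=timezone.utc)
    print(" ".join(playback_command("customer_bag", start, seen)))
```

The command is only built here, not executed, so it can be checked before running it against the real bag.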

For me the most annoying part is working on somebody else’s code. It just feels much nicer to work on your own code :slight_smile:

Haha the “working on someone else’s code” one is universal :grin:

The ARM jump sounds rough though. Curious, did you try anything to speed that loop back up?

We’ve tried Docker multi-platform builds, but those run QEMU under the hood and so are not as quick as you would hope. But it gets the job done.

For CI we’ve added some native ARM runners hosted on Hetzner.

without a doubt, compiling things.

afaik because the rclcpp headers end up including roughly 3 bajillion things, you end up with rather long compile times (actually, it might be the linking step, but I haven’t profiled it. it’s one of the two).
add onto that how colcon, cmake, and setuptools/distutils are just annoying to deal with, and the experience is rather sub-par.

I have some rather simple projects which take 30+ seconds to recompile due to one change in one (non-header) file, and that’s with clang & ccache (I tried getting a faster linker like mold to work but could not due to how my ROS environment is set up). with gcc it’d be even longer.
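
For reference, a rough sketch of the kind of invocation being described: clang, ccache as a compiler launcher, and an alternative linker passed through standard CMake variables. It assumes a workspace with sources under src, that clang, ccache and mold are on the PATH, and that the ROS environment does not override these variables (which may be exactly what goes wrong in the setup above). The function and package names are hypothetical.

```python
import shlex


def fast_build_command(packages, linker="mold", use_ccache=True):
    """Return a colcon argv that rebuilds only `packages` with clang.

    The linker is selected with -fuse-ld=<linker>; whether that works
    depends on the toolchain actually installed.
    """
    cmake_args = [
        "-DCMAKE_C_COMPILER=clang",
        "-DCMAKE_CXX_COMPILER=clang++",
        f"-DCMAKE_EXE_LINKER_FLAGS=-fuse-ld={linker}",
        f"-DCMAKE_SHARED_LINKER_FLAGS=-fuse-ld={linker}",
    ]
    if use_ccache:
        # CMake prefixes every compile command with the launcher.
        cmake_args += [
            "-DCMAKE_C_COMPILER_LAUNCHER=ccache",
            "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache",
        ]
    return [
        "colcon", "build",
        "--base-paths", "src",            # only crawl src for packages
        "--packages-select", *packages,   # skip unrelated packages
        "--cmake-args", *cmake_args,
    ]


if __name__ == "__main__":
    print(shlex.join(fast_build_command(["my_robot_driver"])))
```

Restricting --base-paths and --packages-select narrows what colcon has to look at, though how much that helps with the re-scanning cost is not something this sketch measures.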

oh, and then of course how colcon slows down the more files & nested subfolders you have because it does a slow re-scanning step every time you run it.

and, on cross-compilation: the host architecture does not need to match the target architecture! clang is able to compile for an architecture different from the one it runs on perfectly fine with zero issues. it has an option for setting the target triple (--target=<triple>).
there is zero reason we should need docker or a native arm machine just to compile for arm.
or, since ROS controls the entire build process via colcon, why isn’t there something set up for cross-compiling? you would just install the -aarch64 versions of the packages you need, build with something like colcon build --target arm64v8.2a-unknown-linux-gnu for an nvidia jetson orin nano, and then rsync the resulting install directory over to the jetson.
it might also be good if specifying a target set build-base and install-base to target/[target]/build/ and target/[target]/install/, where [target] is the normalized target triple. e.g. target/aarch64v8.2a-unknown-linux-gnu/install would be the install path for the target triple arm64v8.2a-unknown-linux-gnu (arm64 is an alias for aarch64 in llvm).
and because the entire build process is controlled by colcon, it would know all the right arguments to pass to cmake so that it can find all the libraries, headers, etc.
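
A sketch of how close one can get to this with what exists today, under some loud assumptions: colcon has no target option, so the per-target layout is emulated with its --build-base and --install-base options plus a generated CMake toolchain file. An aarch64 sysroot containing the ROS dependencies is assumed to already exist at a made-up path, and producing that sysroot is the hard part this does not solve. All function names, paths and the robot hostname are hypothetical.

```python
import shlex

# LLVM treats arm64 as an alias for aarch64; only that alias is handled here.
ARCH_ALIASES = {"arm64": "aarch64"}


def normalize_triple(triple):
    """Replace an aliased architecture prefix with its canonical name."""
    arch, _, rest = triple.partition("-")
    for alias, canonical in ARCH_ALIASES.items():
        if arch.startswith(alias):
            arch = canonical + arch[len(alias):]
            break
    return f"{arch}-{rest}" if rest else arch


def toolchain_text(triple, sysroot):
    """CMake toolchain file telling clang which target and sysroot to use."""
    processor = triple.split("-")[0]
    return "\n".join([
        "set(CMAKE_SYSTEM_NAME Linux)",
        f"set(CMAKE_SYSTEM_PROCESSOR {processor})",
        "set(CMAKE_C_COMPILER clang)",
        "set(CMAKE_CXX_COMPILER clang++)",
        f"set(CMAKE_C_COMPILER_TARGET {triple})",
        f"set(CMAKE_CXX_COMPILER_TARGET {triple})",
        f"set(CMAKE_SYSROOT {sysroot})",
        # Run host programs, but find libraries/headers only in the sysroot.
        "set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)",
        "",
    ])


def colcon_command(triple, toolchain_path):
    """colcon argv using the target/<triple>/{build,install} layout."""
    return [
        "colcon", "build",
        "--build-base", f"target/{triple}/build",
        "--install-base", f"target/{triple}/install",
        "--cmake-args", f"-DCMAKE_TOOLCHAIN_FILE={toolchain_path}",
    ]


def rsync_command(triple, remote):
    """rsync argv that mirrors the install tree to the robot."""
    return ["rsync", "-az", "--delete", f"target/{triple}/install/", remote]


if __name__ == "__main__":
    triple = normalize_triple("arm64-unknown-linux-gnu")
    print(toolchain_text(triple, "/opt/sysroots/jetson"))
    print(shlex.join(colcon_command(triple, "/tmp/aarch64.cmake")))
    print(shlex.join(rsync_command(triple, "robot@jetson:/opt/ws/install/")))
```

The example uses the plain arm64-unknown-linux-gnu triple rather than a versioned one, since whether clang accepts a given versioned spelling is not something the sketch checks. Nothing is written to disk or executed; the toolchain text and commands are only printed.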

imo, we should be using cross-compilation, not needing to do one of

  • compile on the target
  • a build server using the target architecture
  • emulating the target via QEMU (with or without docker)

all of those are dumb solutions. better options exist! why aren’t we using them?!?

Makes sense. Have you actually got the clang cross-compile working end to end, or is it more the thing you keep meaning to do? Wondering if the wall is really the deps/sysroot side.

For me, it’s the pervasive use of declarative tools: the hard-to-use and hard-to-debug launch system, the popularity of YAML engineering via plugins vs. letting users write their own main, describing behavior via XML instead of code, etc.

Sometimes you just want to use a library to build something new that doesn’t fit in a neat box of existing abstractions. While this is a general software trade-off, I think that the ROS ecosystem could stand to have more diversity of approaches instead of sometimes defending declarative as the only way to do things.