ROS 2 Cross-compilation / Multi-architecture development

Hi,

I’m in the process of looking into migrating our indoor service robot from an amd64-based system to the Jetson Orin Nano.

How are you doing development when targeting aarch64/arm64 machines?

My development machine is not the newest, but it is reasonably powerful (AMD Ryzen 9 3900X, 32 GB RAM). Still, it struggles with the officially recommended QEMU-based approach: even the vanilla osrf/ros Docker image is choppy under emulation, and building the actual image or stack, let alone running a simulated environment, is totally out of the question.

The different pathways I investigated so far are:

  • Using QEMU emulation - unusable

  • Using the target platform as the development machine - slow builds, but reasonable runtime performance

  • Cloud-building the development container - a bit pricey, and the question of building the actual stack still remains; maybe CMake cross-compilation in a native container

  • Using Apple Silicon for development - haven’t looked into it

I’m interested in your approaches to this problem. I imagine that using ARM-based systems in production robots is fairly common practice given the recent advances in this field.

The best approach for me so far has been to pack everything needed into a Dockerfile and let Docker build it for the right target platform via buildx. I write my own Dockerfiles and don’t rely on the OSRF/ROS ones; I would say that’s the best approach because you can then far more easily predict the performance of your image on the target platform. Maybe this site is a good starting point: Building Docker images for multiple operating system architectures | CircleCI
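For reference, a minimal sketch of that flow (the image and registry names are placeholders): register the QEMU binfmt handlers once on the build host, then let buildx produce a single multi-arch image for both platforms.

```bash
# One-time setup: register QEMU handlers so buildx can target foreign arches.
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Create and select a builder instance that supports multi-platform builds.
docker buildx create --name multiarch --use

# Build for both architectures and push one multi-arch manifest.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t myregistry/myrobot:latest \
  --push .
```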

Maybe also worth tagging @robwoolley and the OpenEmbedded group, because I think they have some well-established avenues for cross-architecture builds and packaging.

Thank you for your input and the relevant mention.

Let me give context for my original post, by describing my current workflow:

During development, a “development container” gets built on the more powerful workstation. Features are tested locally in simulation and/or on a real robot by pulling the development container and copying the install folder over from the workstation. This allows for fast iterations even when dealing with hardware-specific issues.
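For illustration, that copy step could be as simple as syncing the colcon install space to the robot (the hostname and workspace path below are hypothetical):

```bash
# Push the workstation's install space to the robot for a quick iteration.
rsync -az --delete install/ robot@robot.local:~/ros2_ws/install/
```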

Upon release the whole stack gets packaged together in a container.

A main concern of mine here is the friction introduced into the development process by losing the ability to iterate quickly.

We faced the exact same problem when targeting the Raspberry Pi 5 (ARM64). We also wanted to avoid QEMU and slow on-device builds, and ended up setting up an ARM EC2 instance for cross-compilation builds and deployment using Docker.

Our development environment runs on VSCode devcontainers, and the targets only pull the latest ARM-based Docker image from GHCR / ECR.

The EC2 instance is stopped by default, and only wakes up to build and push to GHCR / ECR.

Works great, reduces build time, and has a very low usage cost.
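If you ever want builds to run natively on the ARM instance without going through a separate CI pipeline, one option (sketched here with placeholder host and image names) is to register it as a remote buildx node over SSH:

```bash
# Register the running ARM EC2 instance as a native arm64 build node.
docker buildx create --name remote-arm ssh://ubuntu@my-arm-builder.example.com

# Build natively on the remote node and push straight to the registry.
docker buildx build --builder remote-arm \
  --platform linux/arm64 \
  -t ghcr.io/myorg/myrobot:latest \
  --push .
```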

The ros/meta-ros: OpenEmbedded Layers for ROS 1 and ROS 2 layer supports cross-compiling all the supported ROS releases against the OpenEmbedded releases from the Yocto Project.

This is generally done using a tool called bitbake that builds the Linux operating system and ROS in a sandbox environment designed for cross-compilation.

Many of the larger packages, like clang, rust, pinocchio, and pcl, have intensive requirements and can’t comfortably build in parallel even on a machine with 16 GB of RAM. (If needed, we can reduce the parallel jobs on a per-package basis to let them build.) There is a caching mechanism called the sstate cache, which is essentially an archived tarball indexed by a checksum of the recipe and the inputs used to build the package.

In order to build the 20 different combinations of ROS and Yocto releases, I have a Terraform script that provisions a GitLab Runner on AWS, which is able to request cheap, resource-plentiful spot instances on demand. The sstate cache is saved in an S3 bucket to avoid rebuilding unnecessarily.

If you just want to build your application, you don’t want to have to rebuild your OS image every time. In that case you can actually create an SDK with all the libraries and headers you want. This can be used on any Linux distribution. It supports building directly with the toolchain, Make, CMake, etc., and it also supports colcon. (One can also optionally include the toolchain on the target device and do development right on the embedded device, if you prefer.)
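For a rough idea of the flow (the recipe, installer, and path names below are illustrative and depend on your image and MACHINE configuration):

```bash
# Generate an SDK installer for your image (ros-image-core is one of the
# meta-ros sample images; substitute your own).
bitbake -c populate_sdk ros-image-core

# Run the generated installer on your development machine...
./tmp/deploy/sdk/*-toolchain-*.sh -d ~/sdk

# ...source its environment (sets CC, CXX, the target sysroot, and so on)...
source ~/sdk/environment-setup-*

# ...and build your application workspace with colcon as usual.
colcon build
```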

Creating devcontainers based on the SDK would be easy to do as well. As of right now, we don’t publish any pre-built images or devcontainers.

Hope that helps!

Regards,
Rob

I am developing ROS 2 based applications on x86 and deploying to NVIDIA DRIVE AGX machines. CMake (colcon) based cross-compilation works fine for an x86 PC development environment:

  1. On your PC, use an x86 Docker container based on Ubuntu (e.g. 20.04 for ROS 2 Foxy)
  2. Add multiarch support in your Ubuntu container so that you can install aarch64 debs
  3. Install aarch64 ROS 2 Foxy in the Ubuntu container:
    1. You need an aarch64 Ubuntu sysroot (I didn’t build one myself because NVIDIA officially offers a base sysroot)
    2. Put the ros2-foxy-aarch64 binaries under the sysroot (e.g. under /opt/ros/foxy)
    3. Install some aarch64 third-party debs (e.g. libspdlog-dev:arm64)
  4. Set up your CMakeLists and toolchain file so that CMake uses the cross-compile toolchain and find_package resolves the aarch64 third-party debs (see the combined sketch after this list)
  5. Use a colcon wrapper to call CMake for x86 and aarch64 (with your toolchain file)
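A combined sketch of steps 2, 4, and 5 (the ROS distro, sysroot path, and package names are illustrative; adapt them to the NVIDIA-provided sysroot layout):

```bash
# Step 2: enable multiarch so apt can install arm64 packages in the x86 container.
# (On stock Ubuntu you may also need arm64 sources pointing at ports.ubuntu.com.)
dpkg --add-architecture arm64
apt-get update
apt-get install -y g++-aarch64-linux-gnu libspdlog-dev:arm64

# Step 4: a minimal toolchain file pointing CMake at the target sysroot.
cat > /opt/aarch64-toolchain.cmake <<'EOF'
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_C_COMPILER   aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)
set(CMAKE_SYSROOT /opt/sysroot-aarch64)
set(CMAKE_FIND_ROOT_PATH "/opt/sysroot-aarch64/opt/ros/foxy")
# Search programs on the host, but headers and libraries only in the sysroot.
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)
EOF

# Step 5: have colcon pass the toolchain file to every CMake package.
colcon build --cmake-args -DCMAKE_TOOLCHAIN_FILE=/opt/aarch64-toolchain.cmake
```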

The steps above cross-compile applications against an already-built ros2-foxy-aarch64, and should give much better performance than QEMU-based methods (I’d estimate a 10x speedup in compile time).

An extra benefit is that you can run aarch64 ROS 2 nodes in your x86 Ubuntu container, because you actually have a runtime environment for aarch64 executables, and the binaries can run with QEMU (inside your x86 container).
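For example (the sysroot and package paths are hypothetical), once qemu-user-static is registered via binfmt, the cross-compiled nodes run transparently:

```bash
# Register the QEMU handlers on the host (one time).
docker run --privileged --rm multiarch/qemu-user-static --reset -p yes

# Inside the x86 container: source the arm64 ROS environment from the sysroot
# and launch a cross-compiled node; the binary executes under qemu-aarch64.
source /opt/sysroot-aarch64/opt/ros/foxy/setup.bash
./install/my_package/lib/my_package/my_node
```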

The key points are the CMake usage when cross-compiling C++ and Python, and making find_package work with the sysroot.

If you want to build ROS 2 itself using CMake-based cross-compilation, you will have to do much more dependency management. Really tough.

In this case, I suggest you:

  1. Seek public binaries in popular Docker images, like osrf or NVIDIA Isaac ROS
  2. Use a QEMU-based container to build ROS 2 natively, and distribute the binaries

Reference: How to install ros2 foxy/humble on Nvidia DRIVE OS 6.0.5 - DRIVE AGX Orin / DRIVE AGX Orin General - NVIDIA Developer Forums

In my environment, mimic-cross has been useful as a Docker-based cross-compile environment.

It does not require configuring a complex sysroot or CMake setup; you just use it as a Docker container.

However, it does not work on Ubuntu 24.04. I am trying to fix it, but I haven’t finished yet…