Saw this post recently arguing that Docker is not the right tool for robotics. It sparked a good discussion and got me thinking.
I'm curious what people are actually using in real-world deployments and for over-the-air updates. Docker, Podman, Snap, Yocto, something else entirely? And does it change between dev and production?
I don't really have any alternatives myself, besides using something like Ansible to configure your robot computer(s) without any containers/VMs.
I use Docker a lot, but not in a product setting. For developer workflows it's totally fine, though I've been moving more toward Pixi for those use cases lately. Less isolation than containers, but again, that's fine for dev.
Let me demystify Docker for you. It only takes a minute.
Do this:
🞂 unshare -rm
# Congratulations! You've just created your own container!
🞂 whoami
root
# Wait, what? Well, don't get your hopes up. You are only root in
# here, in your container
🞂 mkdir -p myroot/usr tmp1/usr/{work,merge}
🞂 mount -t overlay overlay -olowerdir=/usr/,upperdir=myroot/usr,workdir=tmp1/usr/work tmp1/usr/merge
# Creates an overlay mount over /usr. This combines everything
# from /usr with everything in myroot/usr, where the latter trumps
# the former. You can see the result in the merge:
🞂 ls tmp1/usr/merge/
bin games include lib lib32 lib64 libexec libx32 local sbin share src
# Let's play pretend!
🞂 mount --rbind tmp1/usr/merge/ /usr/
🞂 cd /usr
🞂 touch i_am_not_really_here
# Creates a file in /usr -- wait, how? We are not actually root
# on the host! Right, this is just pretending. We are actually
# creating the file in myroot/usr
🞂 ls /usr/
bin games i_am_not_really_here include lib lib32 lib64 libexec libx32 local sbin share src
# Looks real, feels real! All your programs in this container will
# think it's real!
# Let's wake up!
🞂 exit
🞂 ls /usr/
bin games include lib lib32 lib64 libexec libx32 local sbin share src
# OK, we dreamed it -- it's not actually there.
🞂 ls myroot/usr/
i_am_not_really_here
# But here it is. Not lost. Ready to be used again.
This is powerful, because you can “install” software in your container “on top of” your host. This means, e.g., that you can share your apt-installed ROS distribution with your containers and keep the container image absolutely tiny. In particular, you don't need to copy an entire Ubuntu filetree + ROS distro + whatever else. This is insanely lightweight, and no sudo is required. It's what we use for sandboxing robot capabilities in Transitive.
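To make the sharing idea concrete, here's a self-contained sketch in the same spirit as the demo above. The directory names (host_ros, cap) are made up for illustration: a throwaway directory stands in for the host's apt-installed ROS tree, and a single file stands in for the "tiny image".

```shell
# Run everything inside a fresh user+mount namespace, like the demo.
unshare -rm sh -c '
  mkdir -p host_ros cap/upper cap/work cap/merged
  echo humble > host_ros/version     # stand-in for the host ROS install
  echo node > cap/upper/my_node      # tiny capability layer: just one file
  mount -t overlay overlay \
    -olowerdir=host_ros,upperdir=cap/upper,workdir=cap/work cap/merged
  ls cap/merged                      # one merged view: host files + ours
'
```

The merged directory shows both files, even though neither layer was copied; that's the whole trick behind sharing a host install with a container.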
Thanks for letting me enlighten you. Now go forth and create your own containers!
(Hint: the next logical step is to read man unshare!)
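As a teaser for what the man page will show you: adding a PID namespace on top of the mount namespace gives your "container" its own process tree. A minimal sketch:

```shell
# --pid creates a new PID namespace; --fork is needed so the command
# runs as PID 1 inside it; --mount-proc remounts /proc to match.
# Inside, ps can only see the container's own processes.
unshare -rm --pid --fork --mount-proc ps -o pid,comm
```

From there it's a short hop to network namespaces (--net) and you've rebuilt most of what Docker's runtime actually does.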
Thank you for sharing the article! I will go through it; it seems interesting.
Yes, I also use Docker for all my development workflows. Pixi is gaining traction these days.
But it seems, as in this article and the LinkedIn post, that Docker may not be the best fit for a production environment.
This is interesting! It's essentially the working principle behind Docker and containerization in general.
I can see this working well in development, but I think we'd need to build quite a few more modules on top of it for publishing updates and fleet management.
I’m surprised nobody mentioned Apptainer. It is like Docker but without the permissions and networking hell by default. Exactly what some people mentioned in the LinkedIn discussion: robotics needs environment isolation, not total isolation.
You can still drop any privileges you want once you're confident in your setup. Apptainer feels like running Docker with --privileged, but you're still running as a non-root user.
One thing Apptainer is bad at is OTA updates. I don't think there's any support for that. But I haven't dug into this, so maybe I just don't know about the existing solutions.
But regardless of which container tech you choose, you still need to configure the underlying OS (networking, udev, kernel parameters, sysfs and such, which no container could help with). We use Ansible for that and it is sufficient, but we only manage a few lab robots, not a fleet of products.
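For a flavor of what that Ansible layer handles: the module names below are real Ansible builtins, but the specific rules and values are just illustrative examples, not our actual config.

```yaml
- hosts: robots
  become: true
  tasks:
    - name: Give the lidar a stable device name
      ansible.builtin.copy:
        dest: /etc/udev/rules.d/99-lidar.rules
        content: 'SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", SYMLINK+="lidar"'
      notify: reload udev
    - name: Raise UDP receive buffer for DDS traffic
      ansible.posix.sysctl:
        name: net.core.rmem_max
        value: "2097152"
  handlers:
    - name: reload udev
      ansible.builtin.command: udevadm control --reload-rules
```

Everything here touches the host kernel or device layer, which is exactly the part no container can own for you.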
Jumping on this thread because I think I'm the weird one here: I haven't touched ROS outside of a container since the Noetic days.
That said, my pain was never the container itself. It was everything around it. My loop looked like this: develop locally, test in sim, push to branch, SSH in, pull, rosdep install, colcon build, run. Best case 10 minutes. Cross-compiling for ARM? Pour a coffee.
I started writing Ansible playbooks to automate the transfer side, that grew, got opinionated, and eventually became forge, a tool I use every day now.
The whole system lives in a single declarative YAML — hosts, components, dependencies, middleware config. Profiles are just different files: forge simulation.yaml launch spins up Gazebo locally, forge robot.yaml launch pushes to hardware.
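A trimmed-down sketch of the shape of such a file (simplified; the field names here are illustrative, not the exact schema):

```yaml
# robot.yaml -- illustrative shape only
hosts:
  robot:
    address: 192.168.1.42
components:
  navigation:
    source: ./src/navigation   # local workspace
    depends: [nav_msgs]
middleware:
  rmw: rmw_cyclonedds_cpp
```

The point is that everything the fleet needs lives in one declarative place, and swapping files swaps the whole deployment target.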
- prep builds a base image with your shared packages compiled into /ros_ws_common
- stage generates per-component Dockerfiles via Jinja2 templates, imports VCS repos with vcstool, runs rosdep, and layers each workspace on top of the base (/vcs_ws for remote repos, /ros_ws for local source). BuildKit cache mounts keep apt, pip, and rosdep warm between runs
- build compiles by mounting your workspace into the staged container, so artifacts stay on disk rather than baked into the image
- launch rsyncs only changed artifacts to each host and brings up the compose stack over SSH via python-on-whales
For context, on my setup (M4 Mac building for a Jetson Orin):
- base image: about 4 minutes
- cold staging, all components: around 3m30s (but you only do that once)
- warm-cache staging: 1m28s for all components; just a few seconds for a single one
The everyday build + launch loop is under a minute unless you're touching dependencies.