Greetings fellow roboticists,
you might know me from other projects such as ros_babel_fish, the QML ROS 2 module, or RQml.
Maybe as the dude who ends all their posts with an AI-generated image.
TL;DR: ros_camera_server is out now (link below). Efficient streaming to ROS, plus robust low-bandwidth streaming to the operator station over H.264/H.265 using RTP, SRT, or WebRTC. Simple config, automatically optimized pipelines.
Camera streaming in ROS is kind of cumbersome, even though it’s always the same issue.
On the robot, you want raw (or compressed between PCs) ROS images for processing, and on the remote operator station, you want a low-latency, ideally low-bandwidth live video feed.
In academia, many setups I have encountered use compressed images for remote operator setups because it’s simple and, in good network conditions, works well enough.
Using usb_cam or gscam with compressed images is also what Opus 4.7 would suggest when asked.
While this does work, it’s quite resource- and bandwidth-intensive and is one of the roadblocks that need to be addressed to bring research solutions to the field, especially in rescue robotics, where bandwidth is a limited resource.
To save bandwidth, you can use something like gst_bridge to receive the ROS image, encode it in H.264, and send it to your remote operator station.
This needs approximately 1/10 to 1/30 of the bandwidth of compressed images for comparable quality.
If your camera is a high-resolution USB camera, it will most likely stream JPEG-encoded data in its high-resolution, high-fps modes.
So your pipeline becomes:
Camera (JPEG) → usb_cam (decodes JPEG) → image_transport (re-encodes as JPEG for compressed) → gst_bridge → handwritten GStreamer pipeline (requires some technical knowledge to get right) → stream
Doesn’t take an expert to see that this is not optimal.
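To give an idea of what "handwritten GStreamer pipeline" means in practice, here is a minimal sketch that assembles a gst-launch style pipeline description in Python. It reads JPEG straight from a V4L2 device instead of going through gst_bridge, and the device path, resolution, bitrate, host, and port are placeholder assumptions, not values from any real setup. The function name is made up for this example.

```python
def build_h264_rtp_pipeline(device="/dev/video0", width=1920, height=1080,
                            fps=30, bitrate_kbps=4000,
                            host="192.168.1.10", port=5000):
    """Return a gst-launch style description: JPEG camera -> H.264 -> RTP/UDP.

    All default values are placeholders chosen for illustration.
    """
    elements = [
        # Grab frames from the camera
        f"v4l2src device={device}",
        # Caps filter: ask the camera for its JPEG mode at this size and rate
        f"image/jpeg,width={width},height={height},framerate={fps}/1",
        # Decode JPEG to raw video so it can be re-encoded
        "jpegdec",
        "videoconvert",
        # Software H.264 encoder tuned for low latency (bitrate is in kbit/s)
        f"x264enc tune=zerolatency speed-preset=ultrafast "
        f"bitrate={bitrate_kbps} key-int-max={fps}",
        # Packetize as RTP and resend SPS/PPS regularly so late joiners can decode
        "rtph264pay config-interval=1 pt=96",
        f"udpsink host={host} port={port}",
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    print("gst-launch-1.0 " + build_h264_rtp_pipeline())
```

Even this small version has several knobs (caps, encoder tuning, payloader settings) that are easy to get wrong, and it does not yet use any hardware encoder.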
Here’s where my new ros_camera_server comes in.
You specify one YAML file with your cameras, each with one input and as many outputs as you want (and your compute can handle).
The outputs can differ in resolution and framerate.
Currently supported are ROS 2, RTP, SRT, and WebRTC.
The camera server automatically creates and optimizes GStreamer pipelines based on your available hardware acceleration. These pipelines produce your ROS output and streams in parallel, applying scaling and framerate limiting as necessary.
This cuts the decode and re-encode overhead and significantly reduces latency and CPU usage.
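As a rough illustration of what "based on your available hardware" can mean, here is a minimal sketch that picks an H.264 encoder from a preference list. This is not how ros_camera_server does it internally. The element names are real GStreamer encoders, but the preference order and the function name are my assumptions for this example.

```python
# Hardware encoders first, software fallback last.
# The order is an assumption for illustration only.
H264_ENCODER_PREFERENCE = ["nvh264enc", "vah264enc", "v4l2h264enc", "x264enc"]


def pick_encoder(available, preference=H264_ENCODER_PREFERENCE):
    """Return the first preferred encoder that is present in `available`."""
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("no H.264 encoder available")


if __name__ == "__main__":
    # Example: a machine with VA-API and the software encoder installed
    print(pick_encoder({"x264enc", "vah264enc"}))
```

With PyGObject installed, the `available` set could be filled by checking each name with `Gst.ElementFactory.find(name)` after `Gst.init(None)`.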
JPEG camera input can also be published directly as a ROS-compressed image or forced to be decoded if needed.
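The idea of only decoding, scaling, or rate limiting where an output actually needs it can be sketched like this. This is a toy model and not the ros_camera_server config schema or its planner. The `Stream` class, the step names, and the example resolutions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    width: int
    height: int
    fps: float
    encoding: str  # "jpeg", "raw", or "h264" in this toy model


def plan_output(source, output):
    """List the processing steps one output needs, skipping whatever it can."""
    steps = []
    same_size = (source.width, source.height) == (output.width, output.height)
    if output.fps < source.fps:
        # Dropping frames works on encoded JPEG frames too, no decode needed
        steps.append("limit_rate")
    if source.encoding == "jpeg" and output.encoding == "jpeg" and same_size:
        # Camera JPEG can go out as-is, e.g. as a ROS compressed image
        steps.append("passthrough")
        return steps
    if source.encoding == "jpeg":
        steps.append("decode_jpeg")
    if not same_size:
        steps.append("scale")
    if output.encoding != "raw":
        steps.append(f"encode_{output.encoding}")
    return steps


if __name__ == "__main__":
    camera = Stream(1920, 1080, 30, "jpeg")
    for out in (Stream(1920, 1080, 30, "jpeg"), Stream(1280, 720, 15, "h264")):
        print(out, plan_output(camera, out))
```

The point of the sketch is that the full-resolution JPEG output never touches a decoder, while the low-bandwidth stream gets only the steps it needs.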
Check the plots from my benchmark in the comments: the much easier configuration is not paid for with higher latency or overhead. In fact, it beats the alternatives in both.
Here’s the repo:
If you can’t comply with the AGPL, you can contact me to see if we can find a suitable license for your use.
PS: The ros_camera_server preserves the image capture/header timestamp as a custom RTP header extension and can restore it from ros_camera_server H.264/H.265 streams.
So you can stream from the robot over RTP/SRT/WebRTC and restore it to ROS on the operator station or another robot, and the timestamp will be preserved in the ROS image output.
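For readers curious how a timestamp can travel in an RTP header extension, here is a minimal sketch using the generic one-byte header extension format from RFC 8285. The actual wire layout used by ros_camera_server is not described in this post, so the extension ID, the payload layout (32-bit seconds plus 32-bit nanoseconds, like a ROS 2 Time message), and the function names are all assumptions.

```python
import struct

ONE_BYTE_PROFILE = 0xBEDE  # marks the RFC 8285 one-byte header extension format


def pack_timestamp_extension(sec, nanosec, ext_id=1):
    """Build an RTP header extension block carrying (sec, nanosec)."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte extension IDs must be in 1..14")
    payload = struct.pack("!iI", sec, nanosec)  # 8 bytes, network byte order
    # Element header: 4-bit ID, 4-bit (payload length - 1)
    element = bytes([(ext_id << 4) | (len(payload) - 1)]) + payload
    element += b"\x00" * ((-len(element)) % 4)  # pad to a 32-bit boundary
    # Block header: profile, then length in 32-bit words (header excluded)
    return struct.pack("!HH", ONE_BYTE_PROFILE, len(element) // 4) + element


def unpack_timestamp_extension(data, ext_id=1):
    """Return (sec, nanosec) from an extension block, or None if absent."""
    profile, words = struct.unpack_from("!HH", data, 0)
    if profile != ONE_BYTE_PROFILE:
        raise ValueError("not a one-byte header extension block")
    body = data[4:4 + words * 4]
    i = 0
    while i < len(body):
        if body[i] == 0:  # padding byte
            i += 1
            continue
        eid, length = body[i] >> 4, (body[i] & 0x0F) + 1
        if eid == 15:  # reserved ID, stop parsing
            break
        if eid == ext_id and length == 8:
            return struct.unpack_from("!iI", body, i + 1)
        i += 1 + length
    return None


if __name__ == "__main__":
    block = pack_timestamp_extension(1700000000, 123456789)
    print(block.hex(), unpack_timestamp_extension(block))
```

On the receiving side, the restored pair would go into the header stamp of the republished ROS image instead of the arrival time.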
Benchmarks
I hope this helps groups without video streaming experts to create more robust remote control setups.
If you read this far and this was not of interest to you, I’m sorry, here’s your AI picture:



