[GSoC 2025] Migrating and Optimizing DAVE’s Physics-based Sonar Plugin for ROS 2 and Gazebo Harmonic
Organization: Open Robotics
Mentors: Woen-Sug Choi, Rakesh Vivekanandan
Student: Helena Bianchi Moyen (Github, LinkedIn)
Link to GSoC project: https://summerofcode.withgoogle.com/programs/2025/projects/d2fAACfv
Hello everyone,
This summer, I was selected to work on migrating DAVE's Multibeam Sonar Plugin to the new Gazebo Harmonic and ROS 2 Jazzy, under the guidance of mentors Woen-Sug Choi and Rakesh Vivekanandan, with support from community volunteers Gaurav Kumar and Achille Martin. Over the past four months, we worked together to create demos and to test and optimize not only the Multibeam Sonar Plugin, originally developed for ROS 1 Noetic and Gazebo Classic, but also the DAVE demos and other components needed for the migration.
In this article, I will focus on my role, which involved porting the old Multibeam Sonar Plugin and optimizing its CUDA calculations to reduce execution time as much as possible without compromising precision. This continues the work developed during GSoC 2024.
About the project
DAVE (Aquatic Virtual Environment) is a simulation platform designed for the rapid testing and evaluation of underwater robotic solutions, specifically autonomous underwater vehicles (AUVs/UUVs) that perform missions involving autonomous manipulation. DAVE was originally built on ROS 1 Noetic and Gazebo Classic, both of which reached end of life in 2025, so the goal of this project was to migrate its Multibeam Sonar plugin to ROS 2 and Gazebo Harmonic. This transition ensures continued support and development for the simulation environment.
The Multibeam Sonar Plugin
The DAVE multibeam sonar plugin uses a ray-based multibeam model that simulates phase, reverberation, and speckle noise through a point scattering approach. It generates realistic intensity-range (A-plot) data while accounting for time and angular ambiguities as well as speckle noise.
Key Features
- Physical sonar beam/ray calculation with the point scattering model
- Generation of intensity-range (A-plot) raw sonar data
- Publication of the data in the UW APL sonar image message format
- GPU parallelization with NVIDIA CUDA cores
The diagram below illustrates the structure of the plugin and how CUDA is used to perform the sonar calculations. Functions highlighted in green indicate the modified or newly added code compared to the original CUDA implementation.
Original Research paper
- Choi, W., Olson, D., Davis, D., Zhang, M., Racson, A., Bingham, B. S., … & Herman, J. Physics-based modelling and simulation of Multibeam Echosounder perception for Autonomous Underwater Manipulation. Frontiers in Robotics and AI, 279. doi:10.3389/frobt.2021.706646
Original DAVE Wiki
Migration to ROS 2 and Gazebo Harmonic
The migration was largely based on the DVL system and custom sensor code, which we combined with Gazebo's GPU Lidar to generate a point cloud. This point cloud is then converted into a depth image, which, together with parameters defined in the sonar SDF, such as the horizontal and vertical FOV, sound speed, number of beams, and number of rays, is used to compute the sonar image. Compared to the old code, which used a depth camera or a Gazebo GPU ray sensor, the new version is a custom Gazebo sensor.
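To make the SDF parameters above concrete, a sensor block could look roughly like the sketch below. Note that the element names and values here are illustrative assumptions only, not the plugin's actual schema; the documentation linked in this section describes the real configuration.

```xml
<!-- Illustrative sketch only: element names and values are assumptions,
     not the plugin's actual SDF schema. -->
<sensor name="multibeam_sonar" type="custom">
  <update_rate>10</update_rate>
  <multibeam_sonar>
    <horizontal_fov>2.26893</horizontal_fov>  <!-- rad, ~130 deg -->
    <vertical_fov>0.349066</vertical_fov>     <!-- rad, ~20 deg -->
    <sound_speed>1500.0</sound_speed>         <!-- m/s, typical seawater -->
    <beam_count>512</beam_count>              <!-- number of beams -->
    <ray_count>114</ray_count>                <!-- rays per beam -->
  </multibeam_sonar>
</sensor>
```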

- Documentation on how to use the Multibeam Sonar sensor and demos
- PR: Multibeam Sonar Migration Template by woensug-choi · Pull Request #15 · IOES-Lab/dave · GitHub
CUDA Code Optimization
The second part of the project focused on reducing the execution time of the sonar calculations performed for each sonar frame. This was achieved primarily by using NVIDIA profiling tools to identify redundancies and unnecessary memory transfers between the CPU and GPU, as well as leveraging GPU intrinsic functions and CUDA libraries such as cuBLAS. Many optimizations followed tips from Steven I. Reeves’ notes. The largest gains came from changing the Ray Summation and reduction step, reducing memory copies and stall latency, and implementing sequential addressing for the reduction. Overall, the wrapper achieved a 12.6× speedup on the NVIDIA GeForce MX330, reducing the average execution time from 434.71 ms to 34.47 ms.
- Documentation on how the optimization was done: Accelerating Ray-Based Multibeam Sonar Simulation in ROS 2 with CUDA
- PR: Cuda acceleration by hmoyen · Pull Request #29 · IOES-Lab/dave · GitHub
Challenges and implementation
Here are some of the migration challenges we encountered, including those related to CUDA:
- Creating a custom sensor: Using the DVL source code as an example of a custom sensor was very helpful for developing our own plugin. It allowed a top-down approach, starting from the full implementation and then stripping out parts that were not needed for our application.
- Debugging CUDA code: A crucial step in optimizing the code was identifying the parts that consumed the most time and understanding how often and how much memory was being transferred. Initially, many “debug” prints were added, but these were insufficient to fully understand what was happening in the background. Using NVIDIA’s nsys profiling tools and statistics made it much easier to find the bottlenecks. The NVIDIA CUDA Programming Guide and Steven I. Reeves’ notes were also extremely helpful in guiding the optimization process. It was also important to isolate the CUDA code for this analysis, avoiding the need to launch a full simulation every time it was tested. Once the isolated code worked correctly, it could be integrated to observe its effect on the sonar image.
Conclusion
The project goals were achieved, with the Multibeam Sonar migrated to ROS 2 and Gazebo Harmonic, alongside improvements to the CUDA calculations.
For further improvement, I would point to this GSoC 2025 project presentation from Shashank Rao, which could be useful for exploring additional optimizations for the sonar plugin. Another approach we considered was half precision, which might accelerate performance on NVIDIA GPUs with Tensor Cores. A further interesting direction would be to store depth images (and others, such as the normal image) in OpenCV's GpuMat class, thereby eliminating unnecessary CPU–GPU memory copies in the CUDA code.
I would like to thank my mentors @woensug-choi and @rakeshv24 for their guidance throughout these months, as well as @GauravKumar9920 and Achille for the collaboration both this summer and last year. Contributing to the DAVE project and to Open Robotics has been a huge learning experience for me, and I'm very grateful for the opportunity.
Special thanks also to @Katherine_Scott for her support and communication during the project. It has been amazing to be part of the Gazebo Community, to learn about other projects and connect with great people.
