Hello ROS community
When working with lidar data, users are usually pointed to PointCloud2 messages, which represent the lidar data as a list of 3D points with additional attributes. While this nicely mirrors the PCL representation and fits the majority of applications working with 3D point cloud data, this isn't how modern lidar sensors natively represent their data.
Problem Statement
This has several drawbacks, including (not a comprehensive list):
- With the rapid increase in lidar resolution, PointCloud2 messages can be hefty to transport. To this date, many DDS implementations struggle to keep up with the actual sensor frame rate when transporting a high-resolution PointCloud2 on low- to medium-compute nodes.
- One option to reduce the bandwidth requirement would be to use dense point clouds, i.e. transport only valid points. However, by doing so we lose the structured nature of the data for devices that natively generate it in a structured 2D grid.
- Many image processing operations benefit from adjacency information, which allows quick lookup of neighboring pixels. For example, ground plane removal can be more efficient to implement directly on 2D range data than on a 3D representation.
- One could also directly apply existing 2D neural networks like YOLO to lidar data in its 2D representation.
- A potential critique of this suggestion might be that we don't need a new message, as we already have sensor_msgs::Image, which can fulfill this role for users who need it. And in fact, the ouster_ros driver optionally does publish the range data and other byproducts of the sensor as sensor_msgs::Image on separate topics. I am aware of many users who do indeed utilize these topics instead of the 3D point cloud data.
- While this works fine if you are only interested in processing each channel individually, it breaks down if you need to access and use more than one channel in the same operation, which is often the case.
  - A simple example: a user may want to filter certain returns (range data) based on reflectivity values and adjacency data simultaneously.
- A common approach to this problem in ROS would be to use the `ApproximateTime` filter. Doing so, however, adds latency and CPU overhead to synchronize data channels that were originally already synchronized.
- A LidarScan message acts here as a multi-spectral image whose channels are all memory-aligned, with 100% data correlation ensured by the sensor and no software sync overhead.
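To make the multi-channel point concrete, here is a minimal sketch (plain Python, hypothetical toy data) of filtering range returns by reflectivity when both channels live on the same pre-aligned 2D grid. With a LidarScan-style message both grids come from the same buffer, so this is a single pass with no cross-topic synchronization:

```python
# Hypothetical, pre-aligned 2D grids as a lidar sensor would emit them:
# same height x width, and cell (row, col) in both arrays describes the
# same beam return.
range_mm = [
    [1200, 1210, 1190],
    [3400,    0, 3390],  # 0 = no return
]
reflectivity = [
    [200, 30, 210],
    [180,  0,  25],
]

def filter_by_reflectivity(range_img, refl_img, min_refl):
    """Zero out range returns whose reflectivity is below min_refl.

    Because both channels share one grid, no ApproximateTime-style
    software synchronization between separate topics is needed.
    """
    return [
        [r if refl >= min_refl else 0 for r, refl in zip(r_row, f_row)]
        for r_row, f_row in zip(range_img, refl_img)
    ]

filtered = filter_by_reflectivity(range_mm, reflectivity, min_refl=100)
# -> [[1200, 0, 1190], [3400, 0, 0]]
```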
The proposal
We are proposing the addition of a new ROS sensor message that mirrors the native format of the majority of lidar sensors (whether spinning or solid-state). With this proposal we would like to invite other lidar vendors to contribute as well, to make sure that this format encompasses the entire spectrum of lidar sensors.
A quick draft of a LidarScan message could look like this:
std_msgs/Header header
# Dimensions of the scan (e.g., 128 channels x 2048 columns)
uint32 height
uint32 width
# --- Geometry Metadata ---
# Horizontal and Vertical FOV/Resolution info to allow projection to 3D
# without needing a full PointCloud2 blob.
float32 vertical_fov_min
float32 vertical_fov_max
float32 horizontal_fov_min
float32 horizontal_fov_max
# --- Channel Data (The "Image" approach) ---
# Each channel (Range, Intensity, Reflectivity, etc.) is stored in this list.
# This mirrors the 'PointField' logic but at a 2d-grid level.
LidarChannel[] channels
# The actual raw buffer containing all interleaved or planar channel data.
# Using uint8[] allows for Zero-Copy compatibility.
uint8[] data
# --- Scaling and Metrics ---
# Different vendors use different units:
# the Ouster (mm) vs. Velodyne (m) vs. Hesai (cm) problem.
# Range = (raw_value * multiplier) + offset
float64 range_multiplier
float64 range_offset
And the definition of LidarChannel:
string name # "range", "intensity", "reflectivity", "ambient", "near_ir"
uint32 offset # Offset from start of data row
uint8 datatype # uint8, uint16, uint32, float32, etc.
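To illustrate how a subscriber would use these fields together, here is a minimal sketch of decoding one cell of a range channel from the raw `data` buffer and applying the scaling formula. The planar row-major layout, little-endian uint16 datatype, and the concrete values are assumptions for the example; the draft message leaves layout and endianness to the channel/datatype definitions:

```python
import struct

# Hypothetical 2 x 3 scan with a single "range" channel of uint16 values,
# stored planar, row-major, little-endian.
height, width = 2, 3
raw_ranges = [1200, 1210, 1190, 3400, 0, 3390]   # raw sensor counts
data = struct.pack("<6H", *raw_ranges)           # the uint8[] data buffer

# Scaling fields from the draft: Range = (raw_value * multiplier) + offset,
# e.g. a vendor reporting raw counts in millimetres, converted to metres:
range_multiplier = 0.001
range_offset = 0.0

def range_at(row, col):
    """Decode one cell of the range channel and convert to metres."""
    idx = row * width + col                       # planar, row-major indexing
    (raw,) = struct.unpack_from("<H", data, idx * 2)
    return raw * range_multiplier + range_offset

r = range_at(1, 0)  # range in metres for ring 1, column 0
```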
While this works for sensors with a uniform distribution of laser beams, not all vendors have that formation, including Ouster sensors, which makes the Geometry Metadata section insufficient:
float32 vertical_fov_min
float32 vertical_fov_max
float32 horizontal_fov_min
float32 horizontal_fov_max
The vertical beams of Ouster spinning sensors don't have a uniform distribution due to the calibration process, which means we need to extend the previous definition to include the beam angles in the LidarScan message body:
# --- Non-Uniform Geometry Metadata ---
# These arrays allow the receiver to project Range -> 3D.
# vertical_angles[height]: The elevation angle for each ring (in radians).
float32[] vertical_angles
# other attributes might also be needed
# horizontal_angles[width]: [optional] The azimuth angle for each column (in radians).
# int32[] beam_time_offset: [optional] To handle "staggered" firing patterns within a single column.
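As a sketch of what a receiver would do with these arrays, here is a minimal range-to-3D projection using per-ring elevation and per-column azimuth angles. The tiny 2-ring, 3-column intrinsics are made up for illustration, and the spherical convention (elevation from the x/y plane, azimuth around z) is an assumption; a real driver may additionally apply beam origin offsets:

```python
import math

# Hypothetical per-beam intrinsics for a tiny 2-ring, 3-column scan,
# with non-uniform elevations as calibrated spinning sensors produce.
vertical_angles = [math.radians(-1.7), math.radians(2.3)]        # one per ring
horizontal_angles = [math.radians(a) for a in (0.0, 0.2, 0.4)]   # one per column

def project(row, col, range_m):
    """Project a single range return to a 3D point in the sensor frame."""
    elev = vertical_angles[row]
    azim = horizontal_angles[col]
    x = range_m * math.cos(elev) * math.cos(azim)
    y = range_m * math.cos(elev) * math.sin(azim)
    z = range_m * math.sin(elev)
    return x, y, z

x, y, z = project(0, 0, 10.0)  # ring 0 points slightly downward, so z < 0
```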
This solves the problem and allows users to project the range data into 3D, but it adds overhead by increasing the message size. These arrays essentially define the intrinsics of the lidar sensor; transporting them with every LidarScan message, however, reduces or eliminates most of the gains of transporting the raw range data instead of the projected xyz points. A better approach would be to split the beam information and the lidar data into two separate messages, where the sensor info is transported only once, earlier during the connection phase. This is not a new pattern in ROS: `sensor_msgs/CameraInfo` already describes the intrinsics of a camera in exactly this way.
By moving these intrinsic fields into a separate message we can retain the same gains and keep the LidarScan message lean. The definition of a sensor_msgs::LidarInfo message would be something like:
std_msgs/Header header
float32[] vertical_angles
float32[] horizontal_angles
int32[] beam_time_offsets
# --- Scaling and Metrics ---
float64 range_multiplier
float64 range_offset
# Plus other static factory data (intrinsic/extrinsic)
And the revised LidarScan message becomes:
std_msgs/Header header
uint32 height
uint32 width
LidarChannel[] channels
uint8[] data
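With the lean message, a subscriber can still sanity-check that `data` is consistent with `height`, `width`, and the channel set. A minimal sketch, assuming densely packed channels with no padding (the datatype names and byte sizes here are illustrative, mirroring the PointField-style codes):

```python
# Byte sizes for the hypothetical LidarChannel datatypes (assumed codes).
DATATYPE_SIZES = {"uint8": 1, "uint16": 2, "uint32": 4, "float32": 4}

def expected_data_size(height, width, channels):
    """Bytes the data buffer must hold for the given channel set.

    Assumes densely packed channels with no padding; planar and
    interleaved storage yield the same total byte count.
    """
    bytes_per_cell = sum(DATATYPE_SIZES[dtype] for _name, dtype in channels)
    return height * width * bytes_per_cell

# A 128 x 2048 scan with range (uint32), reflectivity (uint16), near_ir (uint16):
channels = [("range", "uint32"), ("reflectivity", "uint16"), ("near_ir", "uint16")]
size = expected_data_size(128, 2048, channels)  # -> 2097152 bytes (2 MiB)
```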
NOTES:
- This format is better suited for filtering and perception stacks.
- It is important to note that this proposal does not suggest that lidar vendors or users should stop using PointCloud2. It mainly suggests the addition of a new message type that mirrors the native format of the majority of lidar sensors, reducing overhead and providing better synchrony.
- The idea here is to come up with standard sensor_msgs::LidarScan and sensor_msgs::LidarInfo messages, and to fully abstract out the process that converts this native lidar sensor format into a 3D point cloud for any sensor.
- Once we get initial feedback from the community, the idea is for Ouster and others interested in this concept to build a PoC of the proposal and make sure we cover all the basic necessities for this to work before committing to the final interface.
- I am also aware of the other proposals around Native Buffers (rcl::Buffer) that are already in flight, and we plan to support them from the get-go, as there is a large intersection between the motivation behind Native Buffers and the use of LidarScan for perception-type tasks and other workloads.
