Physical AI requires DNN inference for learned policies, which in turn requires accelerators. Accelerators have their own memory and compute models that need to be surfaced in ROS 2 under abstractions, similar to how tensors are surfaced in PyTorch (accelerator aware, yet accelerator agnostic). This abstraction would need to be available at all layers of the ROS stack (client libraries, IDL, rmw), be vendor agnostic (CUDA, ROCm, etc.), allow for runtime graphs of heterogeneous accelerators, and enable RMW implementations to handle transport of externally managed memory efficiently. Developers who adopt these concepts in their packages should retain CPU backward compatibility when the specified accelerators are not available at runtime.
We propose forming a working group with other vendors, hosted by the ROS PMC, to introduce the concepts of externally managed memory and asynchronous compute, enabling accelerated graphs, into ROS 2 Lyrical. Tensor semantics and DNN inference standards layered on top of what is proposed here would be designed by the Physical AI SIG.
Our design sketch is a more targeted native buffer type that maps to implementations supplied by the client libraries, such as rclcpp::buffer. This native type represents only a handle to a block of memory that may optionally be managed externally.
#include <memory>
#include <string>

namespace rclcpp {

// Native handle to a block of memory that may be managed externally
// (e.g. by an accelerator runtime) rather than by the middleware.
class buffer {
protected:
  // Vendored implementation backing this buffer (BufferImplBase defined elsewhere).
  std::unique_ptr<BufferImplBase> impl;
  std::string device_type;  // e.g. "cpu", "cuda", "rocm"
};
}  // namespace rclcpp
The client library interface does not expose its underlying buffer directly, but manages all access through vendored interfaces that add support for particular frameworks or hardware architectures. For example, an implementation for Torch could live in a hypothetical torch_support library, as shown in the example below.
This keeps buffer a fundamental type focused on data storage abstraction, while semantics such as tensors or image buffers can be layered on top of it.
# MessageWithTensor.msg
#
# a message containing only a buffer that is to be interpreted as a tensor
buffer tensor
// Sample callback that receives a message containing a buffer,
// interprets it as a tensor, performs an operation on it, and
// publishes a new message with the output, with all operations
// performed in the Torch-chosen accelerator backend.
void topic_callback(const msg::MessageWithTensor & input_msg) {
  torch::Tensor input_tensor =
    torch_support::from_buffer(input_msg.tensor);
  auto result = input_tensor.some_operation();

  auto output_msg = msg::MessageWithTensor();
  output_msg.tensor = torch_support::to_buffer(result);
  publisher_->publish(output_msg);
}
A default implementation for CPU-backed buffers would be provided as part of the base ROS distribution, while system vendors and framework designers would provide implementations for their respective memory types. Every custom implementation would be required to support conversion to and from CPU-backed buffers, so that compatibility across implementations is guaranteed.
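As a hedged illustration of that guarantee, a node with no accelerator available at runtime could still consume the same message by converting to the default CPU-backed implementation; cpu_support and its to_cpu() helper are hypothetical names used only to sketch the intent.

// CPU-only fallback consumer: the guaranteed conversion to a CPU-backed
// buffer lets the same message type be consumed when no accelerator is
// present at runtime. cpu_support::to_cpu() is a hypothetical helper.
void cpu_only_callback(const msg::MessageWithTensor & input_msg) {
  rclcpp::buffer host_buffer = cpu_support::to_cpu(input_msg.tensor);
  // ... operate on host memory, or hand the data to a CPU inference path ...
}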
Relevant tensor type discussion can be found in the other post here: Native rcl::tensor type