[Release] GerdsenAI's Depth Anything 3 ROS2 Wrapper with Real-time TensorRT for Jetson

Update: TensorRT Optimization, Roughly 7x Faster Than the Previous PyTorch Release!

Great news for everyone following this project! We’ve successfully implemented TensorRT 10.3 acceleration, and the results are significant:

Performance Improvement

Metric          | Before (PyTorch) | After (TensorRT) | Improvement
FPS             | 6.35             | 43+              | 6.8x faster
Inference Time  | 153 ms           | ~23 ms           | 6.6x faster
GPU Utilization | 35-69%           | 85%+             | More efficient

Test Platform: Jetson Orin NX 16GB (Seeed reComputer J4012), JetPack 6.2, TensorRT 10.3
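
For anyone who wants to sanity-check the numbers, here is a small sketch of the arithmetic. The figures are copied from the table above and from the architecture diagram further down. Reading the ~23ms figure as the sum of ~15ms TensorRT inference and ~8ms shared-memory IPC is an assumption on my part, although the two do add up.

```python
# Figures copied from the table and the architecture diagram in this post.
pytorch_fps, tensorrt_fps = 6.35, 43.0
pytorch_ms, tensorrt_ms = 153.0, 23.0
trt_inference_ms, shm_ipc_ms = 15.0, 8.0

# Speedup measured two ways: throughput ratio and latency ratio.
fps_speedup = tensorrt_fps / pytorch_fps
latency_speedup = pytorch_ms / tensorrt_ms

# Throughput implied by a per-frame latency, if frames are processed one at a time.
fps_from_latency = 1000.0 / tensorrt_ms

# Assumption: inference and IPC run back to back for each frame.
pipeline_ms = trt_inference_ms + shm_ipc_ms

print(f"FPS speedup: {fps_speedup:.2f}x")
print(f"Latency speedup: {latency_speedup:.2f}x")
print(f"FPS implied by {tensorrt_ms:.0f} ms per frame: {fps_from_latency:.1f}")
print(f"Inference + IPC: {pipeline_ms:.0f} ms")
```

Both ratios land a little under 7x, and the FPS implied by the per-frame time is in line with the 43+ FPS in the table.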

Key Technical Achievement: Host-Container Split Architecture

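We solved a significant Jetson deployment challenge: the TensorRT Python bindings are broken in current Jetson container images (dusty-nv/jetson-containers#714). Our solution is to run TensorRT inference on the host and keep ROS2 in the container, with the two sides exchanging data through shared memory: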

HOST (JetPack 6.x)
+--------------------------------------------------+
|  TRT Inference Service (trt_inference_shm.py)    |
|  - TensorRT 10.3, ~15ms inference                |
+--------------------------------------------------+
                    ↑
                    | /dev/shm/da3 (shared memory, ~8ms IPC)
                    ↓
+--------------------------------------------------+
|  Docker Container (ROS2 Humble)                  |
|  - Camera drivers, depth publisher               |
+--------------------------------------------------+

This architecture enables real-time TensorRT inference while keeping ROS2 in a clean container environment.
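
To make the idea concrete, here is a minimal sketch of a file-backed shared-memory exchange in plain Python. It is not the code in trt_inference_shm.py. The header layout and the function names (`create_buffer`, `write_depth`, `read_depth`) are made up for illustration, and the demo uses a temp directory so it runs anywhere. On the Jetson the file would live somewhere under /dev/shm that is mounted into the container.

```python
import mmap
import os
import struct
import tempfile
from array import array

# Hypothetical header: frame sequence number, height, width (little-endian).
HEADER = struct.Struct("<QII")
FLOAT_SIZE = 4  # one float32 depth value per pixel


def create_buffer(path, height, width):
    """Create a zero-filled file large enough for the header plus one frame."""
    size = HEADER.size + height * width * FLOAT_SIZE
    with open(path, "wb") as f:
        f.truncate(size)
    return size


def write_depth(path, seq, height, width, depth):
    """Producer side (the host service in the diagram): publish one depth frame."""
    if len(depth) != height * width:
        raise ValueError("depth length does not match height * width")
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as buf:
        end = HEADER.size + height * width * FLOAT_SIZE
        # Write the payload first and the header last, so a reader that sees
        # a new sequence number usually finds the matching payload in place.
        # This sketch has no locking; real code would need to guard against
        # torn reads.
        buf[HEADER.size:end] = depth.tobytes()
        buf[:HEADER.size] = HEADER.pack(seq, height, width)


def read_depth(path, last_seq):
    """Consumer side (the ROS2 container): return a frame only if it is new."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as buf:
        seq, height, width = HEADER.unpack_from(buf, 0)
        if seq == last_seq:
            return None  # nothing new since the last poll
        values = array("f")
        end = HEADER.size + height * width * FLOAT_SIZE
        values.frombytes(buf[HEADER.size:end])
        return seq, height, width, values


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "depth.bin")
        create_buffer(path, 2, 2)
        write_depth(path, 1, 2, 2, array("f", [0.5, 1.0, 1.5, 2.0]))
        print(read_depth(path, last_seq=0))
```

The container side would poll `read_depth` with the last sequence number it saw and publish a depth image whenever a new frame comes back. Exchanging data through a mapped file means the container never has to import TensorRT, which is the point of the split.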

One-Click Demo

git clone https://github.com/GerdsenAI/GerdsenAI-Depth-Anything-3-ROS2-Wrapper.git
cd GerdsenAI-Depth-Anything-3-ROS2-Wrapper
./run.sh

The first run takes ~15-20 minutes (Docker image build plus TensorRT engine build). Subsequent runs start in ~10 seconds.

Compared to Other Implementations

We’re aware of ika-rwth-aachen/ros2-depth-anything-v3-trt, which achieves 50 FPS on a desktop RTX 6000. Our focus is different:

  • Embedded-first: Optimized for Jetson deployment challenges
  • Container-friendly: Works around broken TRT bindings in Jetson images
  • Production-ready: One-click deployment, auto-dependency installation

Call for Contributors

We’re looking for help with:

  • Test coverage for SharedMemory/TensorRT code paths
  • Validation on other Jetson platforms (AGX Orin, Orin Nano)
  • Point cloud generation (currently depth-only)

Repo: https://github.com/GerdsenAI/GerdsenAI-Depth-Anything-3-ROS2-Wrapper (ROS2 wrapper for Depth Anything 3, https://github.com/ByteDance-Seed/Depth-Anything-3)
License: MIT

@Phocidae @AljazJus - the TensorRT optimization should help significantly with your projects! Let me know if you run into any issues.
