We would like to share a system for supervising a multi-robot SLAM fleet from inside a mixed-reality view, and gather feedback from people working on multi-robot systems and ROS–Unity integration.
The motivation: when several robots map an area at once, the operator usually monitors the fleet through a flat RViz window, which makes it difficult to track each robot’s position and how the partial maps align in space. A 2D display discards much of the spatial context the operator needs, and that cost grows with the number of robots.
In MR-SLAM, the operator wears a Meta Quest 3 in passthrough and supervises three TurtleBot3 robots simulated in Unity. The virtual robots are rendered into the operator’s real room with spatial occlusion via Meta’s MRUK, so a robot behind real furniture is correctly hidden. Each robot publishes a simulated 2D LiDAR scan (Unity raycasting, 180 rays over a 90-degree arc at 10 Hz) to ROS 2. Spatially anchored dashboard panels report per-robot coverage, scan rate, map dimensions, and latency directly in the environment.
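For readers who want to reproduce the scan layout, here is a minimal Python sketch of how the stated figures (180 rays, 90-degree arc, 10 Hz) could map onto the angle fields of a `sensor_msgs/LaserScan` message. It assumes the arc is centered on the robot's forward axis and that the first and last rays sit exactly on the arc edges; neither is stated above, and `scan_geometry` and `ray_angles` are hypothetical helper names, not functions from the project.

```python
import math
from dataclasses import dataclass


@dataclass
class ScanGeometry:
    """Subset of sensor_msgs/LaserScan fields that describe ray layout."""
    angle_min: float        # radians, first ray
    angle_max: float        # radians, last ray
    angle_increment: float  # radians between neighbouring rays
    scan_time: float        # seconds between scans


def scan_geometry(num_rays: int = 180,
                  arc_deg: float = 90.0,
                  rate_hz: float = 10.0) -> ScanGeometry:
    """Derive LaserScan angle fields for a fan of evenly spaced rays.

    Assumption: the arc is centered on the forward (x) axis and both
    end rays lie on the arc boundary, hence the (num_rays - 1) divisor.
    """
    if num_rays < 2:
        raise ValueError("need at least two rays to define an increment")
    arc = math.radians(arc_deg)
    return ScanGeometry(
        angle_min=-arc / 2.0,
        angle_max=arc / 2.0,
        angle_increment=arc / (num_rays - 1),
        scan_time=1.0 / rate_hz,
    )


def ray_angles(geom: ScanGeometry, num_rays: int) -> list:
    """Angle of each ray, in the order the ranges array would be filled."""
    return [geom.angle_min + i * geom.angle_increment for i in range(num_rays)]
```

Under these assumptions the angular spacing comes out at roughly half a degree per ray. If the simulator instead treats the arc as half-open, the divisor would be `num_rays` rather than `num_rays - 1`.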
How it is wired:
- Unity 2022 LTS application on the Quest 3, with differential-drive robot physics, raycast LiDAR, TF and clock publishing, and thumbstick teleop
- ros_tcp_connector (Unity) to ros_tcp_endpoint (ROS 2) over TCP port 10000 for bidirectional comms
- three independent SLAM Toolbox instances in asynchronous mode, one per namespaced robot
- multirobot_map_merge fusing the per-robot occupancy grids into a single merged map using known initial poses
- a small slam_stats node that publishes cell counts and map dimensions so the headset can derive coverage and scan rate locally
- ROS 2 back-end running on an ordinary Ubuntu 22.04 laptop (i5, 16 GB)
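To illustrate the last-but-one item, the following is a rough Python sketch of how a client could turn cell counts into a coverage figure and estimate scan rate from arrival timestamps. The `MapStats` layout, the 0.05 m resolution used in the usage note, and the function names are assumptions made for illustration; the actual message published by slam_stats is not described here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MapStats:
    """Hypothetical summary of one occupancy grid (not the real message)."""
    free_cells: int      # cells marked free
    occupied_cells: int  # cells marked occupied
    resolution_m: float  # edge length of one cell, in metres


def mapped_area_m2(stats: MapStats) -> float:
    """Area covered by known (free or occupied) cells."""
    known = stats.free_cells + stats.occupied_cells
    return known * stats.resolution_m ** 2


def coverage_fraction(stats: MapStats, reference_area_m2: float) -> float:
    """Known area as a fraction of a reference area, clamped to 1.0."""
    if reference_area_m2 <= 0:
        raise ValueError("reference area must be positive")
    return min(1.0, mapped_area_m2(stats) / reference_area_m2)


def scan_rate_hz(arrival_times_s: Sequence[float]) -> float:
    """Average delivery rate from sorted arrival timestamps.

    Uses intervals rather than message count, so N stamps give N - 1
    intervals. Returns 0.0 when there is not enough data.
    """
    if len(arrival_times_s) < 2:
        return 0.0
    span = arrival_times_s[-1] - arrival_times_s[0]
    if span <= 0:
        return 0.0
    return (len(arrival_times_s) - 1) / span
```

As a usage example, `coverage_fraction(MapStats(9000, 1680, 0.05), 41.0)` corresponds to 26.7 m² of known cells against a 41 m² reference, or about 65% coverage. Computing this on the headset keeps the ROS-side node limited to counting cells.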
A couple of integration details that cost us time and might save others some. First, Unity must publish only the odom->base_footprint subtree while SLAM Toolbox owns map->odom; letting both publish map->odom caused TF conflicts and SLAM divergence. Second, we hit a QoS mismatch: ros_tcp_connector’s RELIABLE clock publisher paired with SLAM Toolbox’s BEST_EFFORT subscription silently dropped clock messages, which we compensated for with a clock offset on the Unity side.
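The TF ownership rule can be checked mechanically. The sketch below is a hypothetical helper, not part of the project: given a list of observed (publisher, parent frame, child frame) tuples, collected however is convenient, it reports any child frame that more than one node is publishing. The publisher labels and the way observations are gathered are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (publisher name, parent frame, child frame)
TfObservation = Tuple[str, str, str]


def find_tf_conflicts(observations: Iterable[TfObservation]) -> Dict[str, List[str]]:
    """Return child frames that have more than one distinct publisher.

    In a TF tree each child frame should have a single owner, so any
    entry in the result marks a transform that two nodes are fighting over.
    """
    publishers_by_child = defaultdict(set)
    for publisher, _parent, child in observations:
        publishers_by_child[child].add(publisher)
    return {
        child: sorted(publishers)
        for child, publishers in publishers_by_child.items()
        if len(publishers) > 1
    }


# Example: Unity correctly owns odom->base_footprint, but also publishes
# map->odom, which SLAM Toolbox is supposed to own.
observed = [
    ("unity", "odom", "base_footprint"),
    ("slam_toolbox", "map", "odom"),
    ("unity", "map", "odom"),
]
conflicts = find_tf_conflicts(observed)  # {"odom": ["slam_toolbox", "unity"]}
```

With namespaced robots the same check would run per robot, since each namespace carries its own odom and base_footprint frames.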
Across five 9-minute sessions the system held 8.83 Hz scan delivery, 94.7% cross-instance occupancy consistency between robot pairs, and 6.3 ms median transform jitter, mapping up to 26.7 m² of a 41 m² grid. We present it as a reference implementation for combining passthrough MR supervision with multi-robot SLAM on consumer hardware, not as a finished product. The paper will be presented at the MM-SpatialAI workshop at ICRA 2026.
Two open questions we would welcome input on: how others maintain alignment between the headset’s spatial anchors and the ROS map frame over longer sessions without drift, and whether anyone has moved a setup like this from simulated robots onto physical ones, particularly how the map-merge step held up.
Paper: [2605.16432] MR-SLAM: Immersive Spatial Supervision for Multi-Robot Mapping via Mixed Reality
Video: https://youtu.be/Kvq74PnGZAw
Code: prakash-aryan/MR-SLAM on GitHub
If you are working on ROS–Unity MR/AR integration or multi-robot mapping, we would be glad to compare notes.