Hi all,
I’m releasing the first public version of RosBag Resurrector today — an open-source (MIT) Python library + web dashboard for analyzing MCAP and ROS 2 bag files. No ROS installation required.
The core idea: treat a bag like a pandas DataFrame. Open it, get column-style access to any topic, do whatever filter / transform / export you’d normally do with tabular data. The intent is to fill the gap between “raw bag file on disk” and “ML training dataset” without forcing you to write throwaway scripts every time.
```python
from resurrector import BagFrame

bf = BagFrame("experiment.mcap")
bf.info()                             # rich summary
df = bf["/joint_states"].to_polars()  # any topic → DataFrame
synced = bf.sync(["/imu/data", "/joint_states"], method="nearest", tolerance_ms=50)
bf.health_report()                    # quality score
bf.export(topics=[...], format="lerobot", output="training_data/")
```
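For anyone unfamiliar with nearest-timestamp sync, here is a minimal standard-library sketch of the idea behind `method="nearest"` with a tolerance. It is not the library's implementation: `sync_nearest`, its parameters, and the sample timestamps are hypothetical names and values chosen for illustration, and it assumes both timestamp lists are already sorted.

```python
from bisect import bisect_left

def sync_nearest(anchor_ts, other_ts, tolerance_ns):
    """For each anchor timestamp, return the index of the nearest entry in
    other_ts, or None if the nearest entry is farther away than tolerance_ns.
    Both lists must be sorted in ascending order."""
    matches = []
    for t in anchor_ts:
        i = bisect_left(other_ts, t)
        # Only the neighbours just before and at/after t can be the nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_ts)]
        best = min(candidates, key=lambda j: abs(other_ts[j] - t), default=None)
        if best is not None and abs(other_ts[best] - t) <= tolerance_ns:
            matches.append(best)
        else:
            matches.append(None)  # no partner within tolerance
    return matches

MS = 1_000_000  # nanoseconds per millisecond
imu = [0, 10 * MS, 20 * MS, 30 * MS, 200 * MS]   # anchor topic stamps
joints = [1 * MS, 21 * MS, 95 * MS]              # topic to align
pairs = sync_nearest(imu, joints, tolerance_ns=50 * MS)
print(pairs)
```

The last IMU sample gets no partner because the closest joint-state message is 105 ms away, which is outside the 50 ms tolerance. What the real library does in that case is governed by its own tolerance and boundary policies.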
Full feature list:
- `BagFrame` API: pandas/Polars-like access to any topic. Lazy by default; chunked iteration for large topics; `materialize_ipc_cache()` for filter/projection pushdown via Polars LazyFrame.
- Health validation: automatic 0–100 score per bag, detecting dropped messages, time gaps, out-of-order timestamps, and message size anomalies. Per-platform threshold configuration.
- Multi-stream sync: nearest / interpolate / sample-and-hold methods with explicit tolerance, anchor-topic, out-of-order, and boundary policies. Streaming engine when bags are large; eager when they fit in memory; `engine="auto"` picks for you.
- ML-ready export: Parquet, HDF5, CSV, NumPy, Zarr, plus **LeRobot** and **RLDS** for direct use in robot-learning training pipelines. Streamed chunk-by-chunk so large topics don't OOM.
- Semantic frame search: CLIP embeddings indexed into DuckDB. Query video content with plain English (`resurrector search-frames "robot arm collision"`). Available in the dashboard with thumbnail results.
- PlotJuggler-compatible bridge: WebSocket relay from any recorded bag at configurable speed (0.1×–20×) or live ROS 2 topic relay (rclpy-based).
- Web dashboard: Library, Explorer (Plotly with brush-to-zoom, linked cursors, click-to-annotate), Health, Compare, Cross-bag overlay, Search, Datasets, Bridge. Runs at `localhost:8080`.
- Reproducible datasets: versioned dataset collections with SHA256 manifests + auto-generated READMEs.
- Memory bounded by chunk size, not bag size, verified by a regression test on a 10M-message synthetic bag.
- 18 runnable example scripts under `examples/` covering every feature, each running in under 10 seconds against an auto-generated sample bag.
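To make the health checks above more concrete, here is a rough sketch of how time gaps and out-of-order timestamps can be detected on a single topic. This is an illustration under my own assumptions, not the scoring code in the library: `timestamp_issues`, `expected_period_ns`, and `gap_factor` are hypothetical names, and the 0–100 score itself is not reproduced here.

```python
def timestamp_issues(stamps_ns, expected_period_ns, gap_factor=2.0):
    """Scan consecutive timestamps and report out-of-order messages and gaps
    longer than gap_factor times the expected publishing period."""
    out_of_order = 0
    gaps = []
    for prev, curr in zip(stamps_ns, stamps_ns[1:]):
        delta = curr - prev
        if delta < 0:
            out_of_order += 1          # message stamped earlier than its predecessor
        elif delta > gap_factor * expected_period_ns:
            gaps.append((prev, curr))  # likely dropped messages in between
    return {"out_of_order": out_of_order, "gaps": gaps}

MS = 1_000_000  # nanoseconds per millisecond
# A nominally 100 Hz topic with one late-arriving message and one 50 ms hole.
stamps = [0, 10 * MS, 20 * MS, 15 * MS, 30 * MS, 80 * MS]
report = timestamp_issues(stamps, expected_period_ns=10 * MS)
print(report)
```

A threshold expressed as a multiple of the expected period is one simple way to make the check tunable per platform, since a 10 Hz lidar and a 1 kHz IMU need very different absolute gap limits.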
Formats: MCAP is the optimized primary path (ROS 2 default since Iron). Legacy `.bag` and `.db3` auto-convert via the official `mcap` and `ros2 bag convert` CLIs — no parser maintenance from us.
Quick try:
```bash
pip install rosbag-resurrector
resurrector doctor
resurrector demo --full
resurrector dashboard
```
GitHub: https://github.com/vikramnagashoka/rosbag-resurrector
This is a brand-new public release — would genuinely appreciate feedback from the community. The two questions I most want answered:
1. Which post-recording bag workflows are you writing one-off Python scripts for right now? Those are the use cases I want to prioritize next.
2. Are there bag-related pain points you’ve already given up on solving? I’d love to hear the “I wish a tool just did X” wishes.
Bug reports and feature requests welcome via GitHub issues.