Requesting use-case ideas: dataset DataOps for rosbag→images→COCO/YOLO training

Hi everyone — I’m looking for real-world application scenarios where a small “dataset DataOps” tool would be genuinely useful in robotics perception workflows.

I often see teams doing some version of: rosbag → images → COCO/YOLO → training (or sim → export → training). The painful part is usually not exporting once, but iterating safely:

  • “Which exact dataset did we train on?”

  • “Why did results change after a config tweak?”

  • “Did the dataset distribution drift?”

  • “Can we put dataset quality checks into CI?”

I’m building an open-source CLI called KomanSim that focuses on:

  • Reproducible dataset builds: each run writes a manifest.json with config/job hash + dataset hash

  • Exports: COCO + Ultralytics-style YOLO layout

  • QA outputs: basic dataset stats and sanity checks, intended to serve as CI “quality gates”
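
To make the “reproducible builds” idea concrete, here is a minimal sketch of how a deterministic dataset hash could be computed (this is illustrative only; the function name and hashing scheme are my assumptions, not KomanSim’s actual implementation):

```python
import hashlib
from pathlib import Path

def dataset_hash(root: Path) -> str:
    """Illustrative content hash (NOT KomanSim's actual scheme):
    sha256 over (relative path, per-file sha256) pairs, sorted by path
    so the result is independent of directory iteration order."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h.update(str(p.relative_to(root)).encode())
            h.update(hashlib.sha256(p.read_bytes()).hexdigest().encode())
    return h.hexdigest()

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        (root / "images").mkdir()
        (root / "images" / "0001.png").write_bytes(b"fake-image-bytes")
        # Same content always yields the same hash, so two runs of the
        # pipeline can be compared byte-for-byte via the manifest.
        print(dataset_hash(root) == dataset_hash(root))
```

The point of sorting by relative path is that the hash answers “which exact dataset did we train on?” regardless of filesystem ordering.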

Right now it’s intentionally lightweight: the included dummy backend is mainly for validating the pipeline and output contracts (images may be placeholders), not for photorealism.

Quickstart (copy/paste)

git clone https://github.com/uptonow/KomanSim
cd KomanSim
pip install -e ".[dev]"

# COCO demo
komansim validate --config examples/configs/job_dummy_coco.yaml
komansim run --backend dummy --config examples/configs/job_dummy_coco.yaml

# YOLO demo (Ultralytics layout)
komansim validate --config examples/configs/job_dummy_yolo.yaml
komansim run --backend dummy --config examples/configs/job_dummy_yolo.yaml

What I’m asking from you

  1. In your work, where does the workflow rosbag→images→COCO/YOLO break down the most?

  2. Which “next feature” would you actually use?

  • A) Dataset Doctor: QA an existing COCO/YOLO dataset and output a report + suggested quality gates

  • B) Dataset Diff: compare dataset v1 vs v2 and report distribution/annotation differences (drift signals, missing labels, bbox size/occlusion changes)

  • C) ROS-friendly helpers: small utilities around frame export / camera_info calibration awareness / consistent naming
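
For option B, here is a rough sketch of the kind of drift signal I have in mind, on COCO-style dicts (stdlib only; function names and the specific signals are hypothetical, not an existing KomanSim API):

```python
from collections import defaultdict
from statistics import mean

def bbox_area_profile(coco: dict) -> dict:
    """Per-category annotation count and mean bbox area from a COCO dict."""
    names = {c["id"]: c["name"] for c in coco.get("categories", [])}
    areas = defaultdict(list)
    for ann in coco.get("annotations", []):
        w, h = ann["bbox"][2], ann["bbox"][3]  # COCO bbox is [x, y, w, h]
        areas[names.get(ann["category_id"], str(ann["category_id"]))].append(w * h)
    return {k: {"count": len(v), "mean_area": mean(v)} for k, v in areas.items()}

def diff_profiles(v1: dict, v2: dict) -> dict:
    """Compare two profiles: count delta and relative mean-area shift per
    category, plus categories missing entirely from one side."""
    report = {}
    for cat in set(v1) | set(v2):
        a, b = v1.get(cat), v2.get(cat)
        if a is None or b is None:
            report[cat] = "missing in v1" if a is None else "missing in v2"
        else:
            report[cat] = {
                "count_delta": b["count"] - a["count"],
                "mean_area_ratio": b["mean_area"] / a["mean_area"],
            }
    return report
```

A large `mean_area_ratio` or a category flagged as missing would be exactly the kind of thing I would want a diff report to surface before training.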

If you have 30 seconds: reply with one concrete scenario (even a single sentence), and optionally A/B/C.
If you have 2 minutes: what QA metrics or diff signals would you want first?

Repo: https://github.com/uptonow/KomanSim

Thanks
