I've been reverse engineering the environment for rapid testing: what I've learned

I’ve been spending time reverse engineering the AIC simulation environment so I can build a faster, more reliable rapid-testing workflow.

Before I could even start inspecting the robot, the task board, the cable, or the observation pipeline, the very first problem was simply getting the environment to run at all under my setup.

In my case, that meant getting the containerized environment working properly through distrobox, Docker, WSL, WSLg, and NVIDIA. That was the first major blocker. The environment was not truly usable until GPU access and the right runtime bindings were all behaving together. In practice, that meant building the distrobox carefully with explicit NVIDIA and WSL graphical passthrough instead of assuming the default container setup would “just work.”

Example:

```shell
distrobox create \
  --name aic_eval \
  --image <aic/aic_eval package on GitHub> \
  --nvidia \
  --additional-flags "
    --gpus all
    -v /usr/lib/wsl:/usr/lib/wsl
    -v /usr/lib/wsl/lib:/usr/lib/wsl/lib
    -v /mnt/wslg:/mnt/wslg
    -e DISPLAY=$DISPLAY
    -e WAYLAND_DISPLAY=$WAYLAND_DISPLAY
    -e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR
    -e PULSE_SERVER=$PULSE_SERVER
    -e NVIDIA_VISIBLE_DEVICES=all
    -e NVIDIA_DRIVER_CAPABILITIES=all"
```
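Once the container exists, a quick sanity pass saves a lot of ghost-chasing later. A minimal sketch (bash; the variable list mirrors the flags above, and the commented commands assume the usual NVIDIA/Mesa tooling is installed in the container):

```shell
# Quick sanity check after `distrobox enter aic_eval`: make sure the
# graphics-related variables actually made it into the container.
check_env() {
  for v in DISPLAY WAYLAND_DISPLAY XDG_RUNTIME_DIR; do
    # ${!v} is bash indirect expansion: the value of the variable named in v
    [ -n "${!v}" ] || { echo "MISSING $v"; return 1; }
  done
  echo "env ok"
}

# Inside the container I would then also run:
#   nvidia-smi    # is the GPU visible through the passthrough?
#   glxinfo -B    # is the right renderer in use? (needs mesa-utils)
```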

Once that was stable, I could finally start the real reverse engineering work.

The next lesson was that a simulation “launching” is not the same thing as a usable robotics environment. I had to verify, step by step, that the robot, controller manager, joint state broadcaster, force/torque broadcaster, observations, task board, and cable were all actually alive. A launch window means very little if the anatomy underneath it is half-dead.
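That step-by-step verification can be scripted instead of eyeballed. A rough sketch of the "is the anatomy alive?" check; the topic list below is illustrative, not the official interface, so swap in whatever your bringup actually publishes:

```shell
# Illustrative list of topics I expect from a healthy bringup.
required_topics='/joint_states /observations /tf'

# Reads the output of `ros2 topic list` on stdin and reports anything missing.
check_topics() {
  actual=$(cat)
  missing=0
  for t in $required_topics; do
    printf '%s\n' "$actual" | grep -qxF "$t" || { echo "MISSING $t"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "all required topics present"
}

# Live usage: ros2 topic list | check_topics
```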

Another big lesson was that the task board is more deceptive than it looks. Spawning the task board does not automatically mean you are getting a populated board with usable target hardware. By default, I found that it was easy to end up with a bare board base unless the proper component flags were passed through. Once I traced the launch files and xacro arguments, I was able to create a custom NQR bringup path that spawns a visible populated board with real components for rapid testing, without modifying the official challenge files.
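A small helper made that tracing much faster: listing which `<xacro:arg>` declarations a xacro file exposes, so you can see which component flags exist to be passed through. A sketch (plain grep/sed over the standard `xacro:arg name="..."` syntax; the argument names in the test are invented):

```shell
# List the <xacro:arg name="..."> declarations in a xacro file.
list_xacro_args() {
  grep -o 'xacro:arg name="[^"]*"' "$1" | sed 's/.*name="//; s/"$//'
}

# Usage: list_xacro_args path/to/task_board.urdf.xacro
```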

I also learned that Gazebo structure and ROS TF structure are not the same thing. A component can absolutely exist in Gazebo as a model, link, or fixed joint and still not appear in ROS TF the way you expect. That was one of the most useful discoveries. The cable side exposed useful frames through the scoring TF stream, while the populated board side was clearly present in Gazebo but not automatically mirrored into the ROS TF tree in the same naming style.
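One way to make that Gazebo-vs-TF gap visible is a plain set difference between the two name lists. How you dump each list depends on your Gazebo version and tooling; the comparison itself is just `comm` (bash, for the process substitution):

```shell
# Names that exist in Gazebo but never show up as ROS TF frames.
# $1 = file of Gazebo model/link names, $2 = file of TF frame names
# (e.g. collected from the Gazebo CLI and from tf2's view_frames output).
in_gazebo_not_in_tf() {
  comm -23 <(sort -u "$1") <(sort -u "$2")
}
```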

Another major win was confirming the /observations stream. Once the environment was stable, I verified that it publishes a rich observation message containing stereo images, camera intrinsics, wrist wrench data, joint states, and controller state. That is huge for rapid testing because it means I do not have to scrape a dozen unrelated topics just to understand what the robot is seeing and doing.
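Poking at the stream needs nothing beyond the standard ros2 CLI, and a tiny filter makes the message layout easy to eyeball. A sketch (the field names in the comment come from my notes above, not from a message spec):

```shell
# Live inspection (run inside the container):
#   ros2 topic echo /observations --once   # dump one full message
#   ros2 topic hz /observations            # is it actually streaming?

# Helper: show only the top-level field names of an echoed YAML message,
# e.g. the images / camera intrinsics / wrench / joint state blocks.
top_fields() { grep -E '^[a-z_]+:' | sed 's/:.*//'; }

# Usage: ros2 topic echo /observations --once | top_fields
```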

I also learned very quickly that shutdown discipline matters. Early on, I ended up with zombie ROS nodes, duplicate graph entries, surviving relays, and stale processes that polluted later tests. Once I started treating launch, validation, and cleanup as part of the experiment itself, the environment became far more reliable.
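My cleanup eventually became a script rather than muscle memory. A sketch, with a dry-run step first; the process patterns are the usual Gazebo-plus-ROS 2 suspects, so tailor them to what your sim actually spawns (and keep `pkill -f` patterns specific, since `-f` matches full command lines):

```shell
# Dry run: list what looks stale before killing anything.
# Usage: ps -eo pid,args | stale_sim_procs
stale_sim_procs() {
  grep -E 'gzserver|gzclient|robot_state_publisher|ros2 launch'
}

cleanup_sim() {
  pkill -f gzclient 2>/dev/null
  pkill -f gzserver 2>/dev/null
  # Reset the ros2 CLI's cached view of the graph so stale nodes disappear:
  ros2 daemon stop && ros2 daemon start
}
```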

The biggest takeaway from all of this is that rapid robotics testing is not just about writing a policy. It starts with building a trustworthy lab. For me, that meant first solving the container and GPU path, then verifying the anatomy of the sim, then confirming what exists in Gazebo versus ROS, and finally building a repeatable custom bringup workflow for NQR-style rapid testing.

Right now, I’m in a much better place than when I started. I have a simulation that boots cleanly, a robot with active controllers and sensors, a populated test board, a live observation stream, and a much clearer understanding of how the environment is actually wired together.


@No_Quarter_Robotics Huge props for this deep dive! Documenting your systematic approach and findings for others truly exemplifies the spirit of the competition!


Thanks! I thought I should share my struggles! Truthfully, I almost gave up a few times; I'd never really had to work with Docker and containerization before, so I didn't think I could stick to the submission guidelines. Plus, I only learned ROS a few weeks before the challenge, so I'm still a newbie at all this!
