6600 demos recorded and uploaded to Hugging Face

Hey guys, our team reached 35 on the leaderboard but unfortunately couldn't make it to the next phase.
We're sharing the data we collected across diverse trials, for both SFP and SC demos.
Hope this helps the teams in their next phases.
Had an amazing time during the competition!
Link to the post:


Which model did you use? Can you share your model?

Hi Bha51 and discourse members,

Now that the AIC challenge is over, I am working on improving the evaluation scores for the three trials.

For that, I am training a diffusion policy on this merged DAgger dataset. A model trained for 50k steps scores 9 on the local eval across 3 trials, which is significantly worse than my earlier ACT model (around 60). The notable failure mode: during SFP trials, the robot heads toward the SC port instead of the NIC card.

So far I’ve checked two possible failure points: actions are absolute pose targets rather than deltas, and the task target fields in the state vector (cable type, rail, port) are filled correctly at inference. So the adapter I wrote looks fine, but the policy still isn’t working.
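One more check that might be worth adding here: whether the policy's output actually changes when the cable-type field changes. Below is a minimal sketch of that idea, not the actual adapter. The state layout (`x, y, cable-type flag`), the `field_index`, and both toy policies are hypothetical stand-ins; a real policy would be wrapped in a callable that maps a state vector to an action.

```python
from typing import Callable, List, Sequence


def conditioning_sensitivity(
    policy: Callable[[List[float]], Sequence[float]],
    state: Sequence[float],
    field_index: int,
    alt_value: float,
) -> float:
    """Largest absolute change in the action when one state field is swapped."""
    flipped = list(state)
    flipped[field_index] = alt_value
    base_action = policy(list(state))
    alt_action = policy(flipped)
    return max(abs(a - b) for a, b in zip(base_action, alt_action))


# Toy stand-ins for a trained policy (hypothetical, for illustration only).
def ignores_task(state: List[float]) -> List[float]:
    # Never reads state[2], so the cable-type flag has no effect.
    return [state[0] + 0.1, state[1] - 0.1]


def uses_task(state: List[float]) -> List[float]:
    # Moves in a different direction depending on the cable-type flag.
    offset = 0.5 if state[2] == 1.0 else -0.5
    return [state[0] + offset, state[1]]


state = [0.2, 0.4, 0.0]  # hypothetical layout: x, y, cable-type flag
blind = conditioning_sensitivity(ignores_task, state, field_index=2, alt_value=1.0)
aware = conditioning_sensitivity(uses_task, state, field_index=2, alt_value=1.0)
print(f"ignores_task: {blind:.3f}, uses_task: {aware:.3f}")
```

A sensitivity near zero would suggest the conditioning is being ignored, which would match the "heads toward the SC port during SFP trials" symptom. Since a diffusion policy samples its actions from noise, the seed (or initial noise) would need to be fixed for both calls so the comparison is not dominated by sampling variance.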

A few questions if anyone has insight:

Roughly how many training steps does a diffusion policy need before AIC eval scores start being meaningful? Trying to check whether 50k is just too early.

Has anyone seen a model default to the wrong cable type during early training, and was it eventually fixed by more steps or by some specific change?

The data is roughly 60% SFP, 40% SC. Is it worth rebalancing, or training a separate model per cable type?
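For the rebalancing option, a minimal sketch of inverse-frequency sample weights is below. The label strings and the 6:4 toy split are assumptions that only mirror the rough 60/40 ratio; in practice the labels would come from the episode metadata.

```python
from collections import Counter
from typing import List, Sequence


def balanced_sample_weights(labels: Sequence[str]) -> List[float]:
    """Per-sample weights so that every class carries the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return [total / (n_classes * counts[label]) for label in labels]


# Toy labels mirroring the rough 60% SFP / 40% SC split (assumed, not real data).
labels = ["sfp"] * 6 + ["sc"] * 4
weights = balanced_sample_weights(labels)

sfp_total = sum(w for w, l in zip(weights, labels) if l == "sfp")
sc_total = sum(w for w, l in zip(weights, labels) if l == "sc")
print(f"total weight per class: sfp={sfp_total:.2f}, sc={sc_total:.2f}")
```

Weights like these could be handed to a weighted sampler such as PyTorch's `torch.utils.data.WeightedRandomSampler`, so each batch sees the two cable types about equally often without duplicating data on disk. Whether that helps more than two separate models is still an open question here.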

I’m on diffusers 0.38, but LeRobot pins it to <0.36. Has the version mismatch caused subtle issues for anyone?
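To at least make the mismatch visible at startup, a small check like the sketch below could run before training. The `<0.36` bound is the one mentioned above; `parse_release` is a simplified, hypothetical helper that only compares the numeric release part and ignores pre-release tags.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def parse_release(v: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string, e.g. '0.38.0' -> (0, 38, 0)."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def below_upper_bound(installed: str, bound: str) -> bool:
    """True if `installed` is strictly below the exclusive upper bound."""
    return parse_release(installed) < parse_release(bound)


UPPER_BOUND = "0.36"  # exclusive bound, taken from the pin described above

try:
    installed = version("diffusers")
    ok = below_upper_bound(installed, UPPER_BOUND)
    print(f"diffusers {installed}: {'within' if ok else 'outside'} the <{UPPER_BOUND} pin")
except PackageNotFoundError:
    print("diffusers is not installed in this environment")
```

This only reports the mismatch. It does not say whether anything actually changed between the two versions, so retraining or re-evaluating in an environment that respects the pin would be the more direct test.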

When things in the adapter look fine but the model still doesn’t work at an early checkpoint, is it usually just undertraining, or is there a common integration bug that causes this?

Thanks,
Pranav