Hi! Today, my team and I are releasing a quantized version of Cosmos-Reason2-2B that fits even on the NVIDIA Jetson Orin Nano Super.
We found a mixed-precision configuration that maintains virtually the same accuracy as the unquantized model while running efficiently on the Nano Super and other edge devices!
Would love to get feedback if you try it!
