[Case Study] Cross-Morphology Policy Learning with UniVLA and PiPER Robotic Arm

We’d like to share a recent research project where our AgileX Robotics PiPER 6-DOF robotic arm was used to validate UniVLA, a novel cross-morphology policy learning framework developed by the University of Hong Kong and OpenDriveLab.

Paper: Learning to Act Anywhere with Task-Centric Latent Actions
arXiv: [2505.06111] UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Code: GitHub - OpenDriveLab/UniVLA: [RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions


Motivation

Transferring robot policies across platforms and environments is difficult due to:

  • High dependence on manually annotated action data
  • Poor generalization between different robot morphologies
  • Visual noise (camera motion, background movement) causing instability

UniVLA addresses this by learning latent action representations from videos, without relying on action labels.


Framework Overview

UniVLA introduces a task-centric, latent action space for general-purpose policy learning. Key features include:

  • Cross-hardware and cross-environment transfer via a unified latent space
  • Unsupervised pretraining from video data
  • Lightweight decoder for efficient deploymen

Figure2: Overview of the UniVLA framework. Visual-language features from third-view RGB and task instruction are tokenized and passed through an auto-regressive transformer, generating latent actions which are decoded into executable actions across heterogeneous robot morphologies.


PiPER in Real-World Experiments

To validate UniVLA’s transferability, the researchers selected the AgileX PiPER robotic arm as the real-world testing platform.

Tasks tested:

  1. Store a screwdriver
  2. Clean a cutting board
  3. Fold a towel twice
  4. Stack the Tower of Hanoi

These tasks evaluate perception, tool use, non-rigid manipulation, and semantic understanding.


Experimental Results

  • Average performance improved by 36.7% over baseline models
  • Up to 86.7% success rate on semantic tasks (e.g., Tower of Hanoi)
  • Fine-tuned with only 20–80 demonstrations per task
  • Evaluated using a step-by-step scoring system


About PiPER

PiPER is a 6-DOF lightweight robotic arm developed by AgileX Robotics. Its compact structure, ROS support, and flexible integration make it ideal for research in manipulation, teleoperation, and multimodal learning.

Learn more: PiPER
Company website: https://global.agilex.ai

Click the link below to watch the experiment video using PIPER:


Collaborate with Us

At AgileX Robotics, we work closely with universities and labs to support cutting-edge research. If you’re building on topics like transferable policies, manipulation learning, or vision-language robotics, we’re open to collaborations.

Let’s advance embodied intelligence—together.