Hi everyone,
we’d like to share a small project that uses the AgileX PiPER robotic arm together with a depth camera for simple teleoperation.
The core idea is to use hand gesture recognition to control the position of PiPER’s end effector (Cartesian position only, not orientation).
This is a lightweight, low-cost demo that may help inspire gesture-based human-robot interaction projects.
Abstract
This repository implements a low-cost, simple teleoperation function using a depth camera (position control only).
Tags: Gesture Recognition, Coordinate Mapping, AgileX PiPER
Code Repository: AgileX-College GitHub
Function Demonstration:
Pre-use Preparation
Hardware
- Orbbec Petrel (aligned depth + RGB, 640×400@30fps)
- (Optional) Intel RealSense D435 (aligned depth + RGB, 640×480@30fps)
- AgileX Robotics PiPER robotic arm
Software Environment
- Install required ROS packages:
sudo apt install -y ros-noetic-sensor-msgs ros-noetic-image-transport ros-noetic-cv-bridge ros-noetic-vision-msgs ros-noetic-image-geometry ros-noetic-pcl-conversions ros-noetic-pcl-ros ros-noetic-message-filters
- Install gesture recognition library:
pip3 install mediapipe
- Clone & build:
cd your_ws/src
git clone https://github.com/agilexrobotics/AgileX-College.git
cd ..
catkin_make
source devel/setup.bash
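Before wiring anything into ROS, you can optionally check that MediaPipe’s hand detector runs on your machine. The following is a quick standalone sketch (not part of the repository); it assumes a webcam at index 0:
#!/usr/bin/env python3
# Quick MediaPipe sanity check: grab one webcam frame and detect a hand
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read from webcam index 0")

hands = mp.solutions.hands.Hands(static_image_mode=True,
                                 min_detection_confidence=0.5)
results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    print("Hand detected, %d landmarks" % len(results.multi_hand_landmarks[0].landmark))
else:
    print("No hand detected - check lighting and camera index")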
Operation Steps
1. Start the Robotic Arm
Follow the PiPER driver deployment instructions:
Insert CAN module → find bus → activate:
./find_all_can_port.sh
./can_activate.sh
Launch driver:
roslaunch piper start_single_piper.launch
Start IK library (pinocchio):
python piper_pinocchio.py
Start camera (example: D435):
roslaunch realsense2_camera rs_aligned_depth.launch
Start gesture detection:
rosrun handpose_det handpose_det.py
2. Start Teleoperation
Define custom Home position (different from driver’s built-in Home):
self.endPosX = -0.344
self.endPosY = 0.0
self.endPosZ = 0.110
Publish to /pos_cmd:
rostopic pub /pos_cmd piper_msgs/PosCmd "{
x: -0.344, y: 0.0, z: 0.110,
roll: 0.0, pitch: 0.0, yaw: 0.0,
gripper: 0.0, mode1: 1, mode2: 0
}"
Run detection:
rosrun handpose_det handpose_det.py
RViz shows:
- Red dots = finger points
- Green dots = palm center
3. Define Teleoperation Base Frame
- Gesture: OPEN_HAND → change to FIST → hold for 3 s
- Publishes the /base_frame_origin pose
- Maps the hand’s relative motion to PiPER motion
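The gesture trigger itself is not included in the simplified listing below. One possible implementation (an illustrative sketch, not taken from the repository) is a finger-state heuristic on the MediaPipe landmarks: treat the hand as a FIST when every fingertip is closer to the wrist than its PIP joint, require that state to hold for 3 s, then publish the current palm center on /base_frame_origin:
import time
import rospy
from geometry_msgs.msg import PoseStamped

FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky fingertips
FINGER_PIPS = [6, 10, 14, 18]   # corresponding PIP joints
WRIST = 0

def is_fist(landmarks):
    """Heuristic: a fist if every fingertip is closer to the wrist than its PIP joint."""
    w = landmarks[WRIST]
    def dist(a, b):
        return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5
    return all(dist(landmarks[t], w) < dist(landmarks[p], w)
               for t, p in zip(FINGER_TIPS, FINGER_PIPS))

class BaseFrameTrigger:
    def __init__(self, hold_sec=3.0):
        self.hold_sec = hold_sec
        self.fist_since = None
        self.pub = rospy.Publisher('/base_frame_origin', PoseStamped, queue_size=1)

    def update(self, landmarks, palm_center):
        """Call once per camera frame from a running node; publishes after a held FIST."""
        if is_fist(landmarks):
            if self.fist_since is None:
                self.fist_since = time.time()
            elif time.time() - self.fist_since >= self.hold_sec:
                pose = PoseStamped()
                pose.header.stamp = rospy.Time.now()
                pose.header.frame_id = "camera_link"
                pose.pose.position.x = palm_center[0]
                pose.pose.position.y = palm_center[1]
                pose.pose.position.z = palm_center[2]
                pose.pose.orientation.w = 1.0
                self.pub.publish(pose)
                self.fist_since = None  # re-arm for the next gesture
        else:
            self.fist_since = None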
Full Code
#!/usr/bin/env python3
# _*_ coding:utf-8 _*_
import rospy
import cv2
import mediapipe as mp
import numpy as np
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
from visualization_msgs.msg import MarkerArray, Marker
from geometry_msgs.msg import PoseStamped
from piper_msgs.msg import PosCmd


class HandPoseTeleop:
    def __init__(self):
        rospy.init_node('handpose_teleop', anonymous=True)
        self.bridge = CvBridge()
        self.image_sub = rospy.Subscriber("/camera/color/image_raw", Image, self.image_callback)
        self.marker_pub = rospy.Publisher("/hand_markers", MarkerArray, queue_size=1)
        self.base_pub = rospy.Publisher("/base_frame_origin", PoseStamped, queue_size=1)
        self.cmd_pub = rospy.Publisher("/pos_cmd", PosCmd, queue_size=1)

        # Custom home point (matches the pose published in step 2)
        self.endPosX = -0.344
        self.endPosY = 0.0
        self.endPosZ = 0.110

        # MediaPipe Hands detector
        self.mp_hands = mp.solutions.hands
        self.hands = self.mp_hands.Hands(min_detection_confidence=0.7,
                                         min_tracking_confidence=0.5)
        self.mp_draw = mp.solutions.drawing_utils
        rospy.loginfo("HandPoseTeleop initialized.")

    def image_callback(self, msg):
        cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
        img_rgb = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)
        results = self.hands.process(img_rgb)

        marker_array = MarkerArray()
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                # Publish one red sphere per hand landmark
                for i, lm in enumerate(hand_landmarks.landmark):
                    marker = Marker()
                    marker.header.frame_id = "camera_link"
                    marker.header.stamp = msg.header.stamp
                    marker.id = i  # unique id so RViz keeps every sphere
                    marker.type = Marker.SPHERE
                    marker.action = Marker.ADD
                    marker.scale.x = marker.scale.y = marker.scale.z = 0.01
                    marker.color.a = 1.0
                    marker.color.r = 1.0
                    marker.pose.orientation.w = 1.0
                    marker.pose.position.x = lm.x
                    marker.pose.position.y = lm.y
                    marker.pose.position.z = lm.z
                    marker_array.markers.append(marker)

                # Compute the palm center (average of all landmarks) and publish it as a green sphere
                palm_center = np.mean([[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark], axis=0)
                marker = Marker()
                marker.header.frame_id = "camera_link"
                marker.header.stamp = msg.header.stamp
                marker.id = 100  # distinct id from the landmark markers
                marker.type = Marker.SPHERE
                marker.action = Marker.ADD
                marker.scale.x = marker.scale.y = marker.scale.z = 0.02
                marker.color.a = 1.0
                marker.color.g = 1.0
                marker.pose.orientation.w = 1.0
                marker.pose.position.x = palm_center[0]
                marker.pose.position.y = palm_center[1]
                marker.pose.position.z = palm_center[2]
                marker_array.markers.append(marker)

                # Send a position command: home point offset by the palm center
                pos_cmd = PosCmd()
                pos_cmd.x = self.endPosX + palm_center[0]
                pos_cmd.y = self.endPosY + palm_center[1]
                pos_cmd.z = self.endPosZ + palm_center[2]
                pos_cmd.roll = 0.0
                pos_cmd.pitch = 0.0
                pos_cmd.yaw = 0.0
                pos_cmd.gripper = 0.0
                pos_cmd.mode1 = 1
                pos_cmd.mode2 = 0
                self.cmd_pub.publish(pos_cmd)

        self.marker_pub.publish(marker_array)


if __name__ == '__main__':
    try:
        teleop = HandPoseTeleop()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
Summary
This demo shows how the AgileX PiPER robotic arm can be controlled via hand gesture recognition and a depth camera, achieving real-time position teleoperation.
We’d love to hear your feedback!
- How would you extend this (e.g. orientation, multi-modal input)?
- Anyone interested in integrating this with VR/AR or haptics?
Full repo & details: AgileX-College on GitHub
AgileX Robotics
https://global.agilex.ai/


