DexGen

Geometric Retargeting

A Principled, Ultrafast Hand Retargeting Algorithm

Zhao-Heng Yin¹,², Changhao Wang², Luis Pineda², Krishna Bodduluri²,
Tingfan Wu², Pieter Abbeel¹, and Mustafa Mukadam².

¹BAIR, Berkeley EECS and ²FAIR at Meta

Smooth and Expressive Retargeting

Kinematic retargeting is central to teleoperation. Geometric Retargeting (GeoRT), the unsung hero behind our Dexterity Gen, answers a fundamental question: what makes a good retargeting?

Retargeting as a Shape Correspondence Problem

Human fingertips and robot fingertips have significantly different ranges of motion, as illustrated in the comparison below. Retargeting therefore amounts to establishing a nonlinear shape correspondence between the two ranges of motion. The question is how to define the desired correspondence precisely.

Make Retargeting Principled

We introduce several geometric retargeting principles to optimize the human operator's experience: Motion Preservation, Robot Utility Maximization, High Flatness Control, Pinch Correspondence, and Collision Minimization. These principles are independent of each other and together form a minimal basis for a good retargeting function.

Explore them in the gallery below.

Principle I: Motion Preservation

When the human fingertip moves in a certain direction, the robot fingertip should do the same.
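One way to make this principle measurable is to compare the direction of a small human fingertip displacement with the direction of the resulting robot fingertip displacement. The following is a minimal sketch under stated assumptions, not the authors' cost function: `f` is a hypothetical map from human to robot fingertip positions, and the name `direction_preservation_loss`, the step size `eps`, and the cosine-based penalty are illustrative choices.

```python
import numpy as np

def direction_preservation_loss(f, points, directions, eps=1e-3):
    """Mean (1 - cosine similarity) between human and robot fingertip motion.

    f          : hypothetical map, human fingertip positions (N, 3) -> robot (N, 3)
    points     : (N, 3) sampled human fingertip positions
    directions : (N, 3) unit vectors, the human motion directions to probe
    eps        : finite-difference step size (an assumed value)
    """
    robot_delta = f(points + eps * directions) - f(points)
    norm = np.linalg.norm(robot_delta, axis=1) + 1e-12  # avoid division by zero
    cosine = np.sum(robot_delta * directions, axis=1) / norm
    return float(np.mean(1.0 - cosine))

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(64, 3))
directions = rng.normal(size=(64, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# A scaled and shifted copy of the human motion keeps every direction.
scaled = direction_preservation_loss(lambda p: 2.0 * p + 1.0, points, directions)
# A mirrored map reverses every direction, the worst case for this penalty.
mirrored = direction_preservation_loss(lambda p: -p, points, directions)
```

In this sketch the penalty is close to 0 when directions are preserved and close to 2 when they are reversed.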

Principle II: Utility Maximization

The retargeting map should be surjective: every reachable robot fingertip configuration should be realizable by some human fingertip configuration. We implement this as a Chamfer loss.
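To illustrate how a Chamfer loss can encourage coverage, the sketch below computes a symmetric, squared-distance Chamfer distance between two point sets: the robot fingertip positions produced by the retargeting map and samples of the robot fingertip workspace. The exact variant used in GeoRT is not specified here, so the symmetric squared form and the function name `chamfer_distance` are assumptions.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Uses squared Euclidean distances; this particular form is an assumption.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Each point in a to its nearest point in b, plus the reverse direction.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

rng = np.random.default_rng(0)
# Stand-in samples of the robot fingertip workspace (synthetic data).
robot_workspace = rng.uniform(0.0, 1.0, size=(200, 3))

# A map whose outputs cover the whole workspace gives zero distance.
full_coverage = chamfer_distance(robot_workspace, robot_workspace)
# A map that only reaches half of the workspace is penalized.
reached = robot_workspace[robot_workspace[:, 0] < 0.5]
partial_coverage = chamfer_distance(reached, robot_workspace)
```

The second term of the sum is the one that penalizes unreached workspace points, which is what ties the loss to surjectivity.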

Principle III: High Flatness Control

The control sensitivity should be nearly uniform everywhere.
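One possible way to quantify non-uniform sensitivity is a second-order finite difference of the retargeting map, which vanishes when equal human steps always produce equal robot steps. This is an illustrative sketch only: the map `f`, the name `flatness_penalty`, and the step size `eps` are hypothetical, and the cost used by the authors may differ.

```python
import numpy as np

def flatness_penalty(f, points, directions, eps=1e-2):
    """Mean squared second-order finite difference of f along given directions.

    f          : hypothetical map, human fingertip positions (N, 3) -> robot (N, 3)
    points     : (N, 3) sampled human fingertip positions
    directions : (N, 3) unit vectors along which sensitivity is probed
    """
    second_diff = (
        f(points + eps * directions)
        + f(points - eps * directions)
        - 2.0 * f(points)
    )
    # Dividing by eps**4 makes the value approximate the squared second derivative.
    return float(np.mean(np.sum(second_diff ** 2, axis=1)) / eps ** 4)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(64, 3))
directions = rng.normal(size=(64, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# An affine map has the same sensitivity everywhere, so the penalty vanishes.
affine = flatness_penalty(lambda p: 2.0 * p + 1.0, points, directions)
# A quadratic map has position-dependent sensitivity and is penalized.
curved = flatness_penalty(lambda p: p ** 2, points, directions)
```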

Principle IV: Pinch Correspondence

A pinch grasp is a critical event. When two human fingers pinch, the corresponding robot fingers should pinch as well.
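A simple way to express this as a cost is to detect samples where two human fingertips are close together and penalize any remaining gap between the corresponding robot fingertips. The sketch below assumes a distance threshold of 1.5 cm and a squared-gap penalty; both values, and the name `pinch_loss`, are hypothetical and not taken from the GeoRT implementation.

```python
import numpy as np

def pinch_loss(human_a, human_b, robot_a, robot_b, threshold=0.015):
    """Mean squared robot fingertip gap over samples where the human pinches.

    human_a, human_b : (N, 3) positions of two human fingertips, in meters
    robot_a, robot_b : (N, 3) positions of the corresponding robot fingertips
    threshold        : human gap below which a pinch is assumed (hypothetical)
    """
    human_gap = np.linalg.norm(human_a - human_b, axis=1)
    robot_gap = np.linalg.norm(robot_a - robot_b, axis=1)
    pinching = human_gap < threshold
    if not np.any(pinching):
        return 0.0
    return float(np.mean(robot_gap[pinching] ** 2))

# Two samples: the human pinches in the first (1 cm gap) but not the second.
human_thumb = np.zeros((2, 3))
human_index = np.array([[0.01, 0.0, 0.0], [0.08, 0.0, 0.0]])
robot_thumb = np.zeros((2, 3))
robot_index = np.array([[0.03, 0.0, 0.0], [0.10, 0.0, 0.0]])

# The robot leaves a 3 cm gap during the human pinch, so the loss is positive.
open_loss = pinch_loss(human_thumb, human_index, robot_thumb, robot_index)
# If the robot fingertips meet whenever the human pinches, the loss is zero.
closed_loss = pinch_loss(human_thumb, human_index, robot_thumb, robot_thumb)
```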

Principle V: Collision Minimization

The retargeting function should not produce robot hand configurations that are in collision.

Make Retargeting Fast

We implement our retargeting model as a neural network and the principles above as differentiable cost functions used to train it (see the figure).
It is fast at every stage: training takes 5-10 minutes, and inference runs at 1 kHz.
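To give a sense of why a learned retargeting model can be cheap to evaluate, here is a minimal sketch of a small multilayer perceptron that maps one human fingertip position to one robot finger's joint angles. Everything specific in it is an assumption made for illustration: the layer sizes, the tanh output scaled to joint limits, the joint limit values, and the names `init_mlp` and `retarget`. The weights are random rather than trained.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for an MLP with the given layer sizes (untrained)."""
    return [
        (rng.normal(scale=1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def retarget(params, fingertip, joint_lower, joint_upper):
    """Map a human fingertip position (3,) to joint angles within limits."""
    h = fingertip
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    unit = np.tanh(h @ w + b)  # squashed to [-1, 1]
    # Rescale from [-1, 1] to the joint range (an assumed output convention).
    return joint_lower + 0.5 * (unit + 1.0) * (joint_upper - joint_lower)

rng = np.random.default_rng(0)
params = init_mlp(rng, [3, 64, 64, 4])  # hypothetical sizes: 3D input, 4 joints
joint_lower = np.array([-0.5, 0.0, 0.0, 0.0])  # hypothetical limits, radians
joint_upper = np.array([0.5, 1.6, 1.6, 1.6])

angles = retarget(params, np.array([0.02, 0.0, 0.1]), joint_lower, joint_upper)
```

A forward pass through a network of this size is a handful of small matrix products, which is consistent with running inference at high rates.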

GeoRT Understands You Better

GeoRT is highly expressive and follows human intent closely. It also makes many hand poses that are useful for in-hand manipulation easy to reach.

Baseline (Robot Telekinesis / AnyTeleop equivalent)

GeoRT

These advantages give human operators a greater sense of agency, allowing them to maintain better control over grasping.

Conclusion

Retargeting is at the core of human-driven, cross-embodiment robot learning. In our experiments, we observed that even the smallest change in retargeting can make a big difference. We hope that our GeoRT algorithm can inspire new ideas and designs in future human-in-the-loop learning systems.