Learning Dexterous Grasping with Object-Centric Visual Affordances

Priyanka Mandikal¹, Kristen Grauman¹,²
¹UT Austin, ²Facebook AI Research

Accepted at ICRA 2021
[Code Coming Soon!]

Dexterous robotic hands are appealing for their agility and human-like morphology, yet their many degrees of freedom make learning to manipulate challenging. We introduce an approach for learning dexterous grasping. Our key idea is to embed an object-centric visual affordance model within a deep reinforcement learning loop to learn grasping policies that favor the same object regions favored by people. Unlike traditional approaches that learn from human demonstration trajectories (e.g., hand joint sequences captured with a glove), the proposed prior is object-centric and image-based, allowing the agent to anticipate useful affordance regions for objects unseen during policy learning. We demonstrate our idea with a 30-DoF five-fingered robotic hand simulator on 40 objects from two datasets, where it successfully and efficiently learns policies for stable grasps. Our affordance-guided policies are significantly more effective, generalize better to novel objects, and train 3× faster than the baselines. Our work offers a step towards manipulation agents that learn by watching how people use objects, without requiring state and action information about the human body.

In this work, we aim to learn deep RL grasping policies for a dexterous robotic hand by injecting a visual affordance prior that encourages the hand to use the parts of an object that people use for functional grasping. Given an object image, we predict its affordance regions and use them to influence the learned policy. The key upshots of our approach are better grasping, faster learning, and generalization to successfully grasp objects unseen during policy training.
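
One way to picture how predicted affordance regions could influence a policy is through reward shaping. The sketch below is only an illustration under our own assumptions and is not the implementation from the paper: the function name `affordance_reward`, the parameters `success_bonus` and `affordance_weight`, and the choice of a mean nearest-point distance penalty are all hypothetical. It assumes the affordance prediction has already been converted into a set of 3D points on the object.

```python
import math


def affordance_reward(fingertips, affordance_points, grasp_success,
                      success_bonus=1.0, affordance_weight=0.5):
    """Hypothetical shaped reward for one simulation step.

    fingertips:        list of (x, y, z) fingertip positions.
    affordance_points: list of (x, y, z) points on the predicted
                       affordance region (assumed to be given).
    grasp_success:     True if the object is currently held stably.
    """
    if not fingertips or not affordance_points:
        raise ValueError("need at least one fingertip and one affordance point")

    # Distance from each fingertip to its nearest affordance point.
    nearest = [
        min(math.dist(tip, point) for point in affordance_points)
        for tip in fingertips
    ]
    mean_distance = sum(nearest) / len(nearest)

    # Penalize hands that are far from the affordance region
    # (assumed shaping term, not taken from the paper).
    reward = -affordance_weight * mean_distance

    # Add a bonus when the grasp succeeds.
    if grasp_success:
        reward += success_bonus
    return reward


if __name__ == "__main__":
    tips = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    region = [(0.0, 0.0, 0.0)]
    print(affordance_reward(tips, region, grasp_success=True))
```

Under this assumed shaping, a policy earns more reward when its fingertips close in on the predicted region, so the prior steers exploration toward human-preferred contact areas without requiring any human hand trajectories.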


Copyright © 2020 University of Texas at Austin