Robots Gain Ability to Master Object Manipulation with Context-Aware Technique

By Joshua Preston

Georgia Institute of Technology researchers have developed one of the most robust research methods currently available to allow robots to correctly pick up common objects based on how they should be used.

Whereas humans might touch a hot pan on a stove once and never forget the lesson, it’s more complex to train robots to apply such universal knowledge across different situations.

The new technique, called CAGE, or Context-Aware Grasping Engine, takes into consideration a range of factors – such as the task the object will be used for, whether the object is full or empty, what it’s made of, and its shape – so that a robot can learn the right way to grasp various objects in a given context. For example, it allows a robot to learn not to hold a hot cup of tea by its opening, or to handle a cooking pot differently based on whether it just left a stovetop or a cabinet.

“In order for robots to effectively perform object manipulation, a broad sense of contexts, including object and task constraints, needs to be accounted for,” said Weiyu Liu, lead researcher on CAGE and Ph.D. student in robotics.

Using CAGE, a robot is able to apply what it has learned to objects it’s never seen. For example, if trained to grasp a spatula by the handle to make a scooping motion, the robot is able to generalize this knowledge and know to grasp a mug by the handle and use it to scoop — if that was the programmed task — even if the robot has never encountered a mug before.

The research team, from the Robot Autonomy and Interactive Learning (RAIL) lab at Georgia Tech, validated their approach against three existing methods for teaching robots to handle objects. The team used a novel dataset consisting of 14,000 grasps for 44 objects, 7 tasks, and 6 different object states (e.g. objects contained solids, liquids, or were empty).

CAGE outperformed the other methods in a simulation by statistically significant margins, according to the researchers, highlighting the model’s ability to collectively reason about contextual information.

CAGE had an 86 percent success rate when averaged across tests looking at how well it identified context-aware grasps and if the model could generalize to new objects a robot had not seen previously. Among the existing methods, the highest success rate averaged 69 percent.

Liu said that the team’s model can rank grasp “candidates” for various contexts, ensuring that more suitable candidates are ranked higher than less suitable ones given a context. So a robot might, for example, learn to hand a sharp metal knife to a person handle-first, but hand over a plastic knife in any orientation due to its relative safety.

A final experiment evaluated the effectiveness of CAGE using a Fetch robot equipped with a camera, moving arm, and a parallel-jaw gripper. It performed almost perfectly in making a judgement on how to grasp objects for several distinct tasks, including scooping, pouring, lifting, and handing over an object, among others. If there was no suitable grasp for the given situation, the robot made no attempt in all cases.

The work, developed by Liu, Angel Daruna, and Sonia Chernova, was accepted into the International Conference on Robotics and Automation, taking place virtually this June. The paper is titled CAGE: Context-Aware Grasping Engine and the research data is publicly available at https://github.com/wliu88/rail_semantic_grasping.

This work is supported in part by NSF IIS 1564080, NSF GRFP DGE-1650044, and ONR N000141612835. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsors.