Grasping unknown objects based on visual input, where no a priori knowledge about the objects is used, is a challenging problem. In this paper, we present an Early Cognitive Vision system that builds a hierarchical representation based on edge and texture information which provides a sparse but powerful description of the scene. Based on this representation, we generate contour-based and surface-based grasps. We test our method in two real-world scenarios, as well as on a vision-based grasping benchmark providing a hybrid scenario using real-world stereo images as input and a simulator for extensive and repetitive evaluation of the grasps. The results show that the proposed method is able to generate successful grasps, and in particular that the contour and surface information are complementary for the task of grasping unknown objects. This allows for dealing with rather complex scenes.
QC 20121009