Learning accurate and efficient three-finger grasp generation in clutters with an auto-annotated large-scale dataset
2025 (English). In: Robotics and Computer-Integrated Manufacturing, ISSN 0736-5845, E-ISSN 1879-2537, Vol. 91, article id 102822. Article in journal (Refereed). Published.
Abstract [en]
With the development of intelligent manufacturing and robotic technologies, the capability of grasping unknown objects in unstructured environments is becoming increasingly important for robots in a wide range of applications. However, current robotic three-finger grasping studies focus only on grasp generation for single objects or scattered scenes, and suffer from the high cost of labeling grasp ground truth, making them incapable of predicting grasp poses for cluttered objects or generating large-scale datasets. To address these limitations, we first introduce a novel three-finger grasp representation with fewer prediction dimensions, which balances training difficulty against representation accuracy to achieve efficient grasp performance. Based on this representation, we develop an auto-annotation pipeline and contribute a large-scale three-finger grasp dataset (TF-Grasp Dataset). Our dataset contains 222,720 RGB-D images with over 2 billion grasp annotations in cluttered scenes. In addition, we propose a three-finger grasp pose detection network (TF-GPD), which detects globally while fine-tuning locally to predict high-quality collision-free grasps from a single-view point cloud. In sum, our work addresses the issue of high-quality collision-free three-finger grasp generation in cluttered scenes based on the proposed pipeline. Extensive comparative experiments show that our methodology outperforms previous methods and improves grasp quality and efficiency in cluttered scenes. The superior results in real-world robot grasping experiments not only confirm the reliability of our grasp model but also pave the way for practical applications of three-finger grasping. Our dataset and source code will be released.
Place, publisher, year, edition, pages
Elsevier BV, 2025. Vol. 91, article id 102822
Keywords [en]
Deep learning and computer vision in grasp detection, Grasp dataset, Grasp representation, Robotic three-finger grasp generation
National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-350961
DOI: 10.1016/j.rcim.2024.102822
ISI: 001271656200001
Scopus ID: 2-s2.0-85198557167
OAI: oai:DiVA.org:kth-350961
DiVA, id: diva2:1885636
Note
QC 20240725
Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2025-02-05. Bibliographically approved.