GoNet: An Approach-Constrained Generative Grasp Sampling Network
Weng, Zehang. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-9486-9238
Lu, Haofei. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0009-0001-6333-9533
Lundell, Jens. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-2296-6685
Kragic, Danica. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-2965-2953
2023 (English). In: 2023 IEEE-RAS 22nd International Conference on Humanoid Robots, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

This work addresses the problem of learning approach-constrained data-driven grasp samplers. To this end, we propose GoNet: a generative grasp sampler that can constrain the grasp approach direction to a subset of SO(3). The key insight is to discretize SO(3) into a predefined number of bins and train GoNet to generate grasps whose approach directions are within those bins. At run-time, the bin aligning with the second largest principal component of the observed point cloud is selected. GoNet is benchmarked against GraspNet, a state-of-the-art unconstrained grasp sampler, in an unconfined grasping experiment in simulation and in unconfined and confined grasping experiments in the real world. The results demonstrate that GoNet achieves higher success-over-coverage in simulation and a 12%-18% higher success rate in real-world table-picking and shelf-picking tasks than the baseline.
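
As a rough illustration of the run-time step described in the abstract, the sketch below computes the principal axes of an observed point cloud and picks the candidate approach-direction bin that best aligns with its second largest principal component. The bin parameterization, function names, and alignment score are assumptions made only for illustration; this is not the paper's implementation.

```python
# Minimal sketch, assuming bins are represented by unit approach directions.
# NOT the authors' code; it only illustrates PCA-based bin selection.
import numpy as np

def principal_axes(points: np.ndarray) -> np.ndarray:
    """Principal axes of an (N, 3) point cloud as rows, sorted by decreasing variance."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, ::-1].T                # rows = axes, descending variance

def select_bin(points: np.ndarray, bin_directions: np.ndarray) -> int:
    """Index of the bin whose unit direction best aligns (up to sign) with the
    second largest principal component of the observed point cloud."""
    second_pc = principal_axes(points)[1]
    return int(np.argmax(np.abs(bin_directions @ second_pc)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy flat, elongated point cloud and three hypothetical bin directions.
    cloud = rng.normal(scale=[0.30, 0.10, 0.02], size=(500, 3))
    bins = np.eye(3)
    print("selected bin:", select_bin(cloud, bins))
```

In the paper's pipeline the bins discretize SO(3) and the generative sampler is then conditioned on the selected bin; that conditioning step is omitted from this sketch.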

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023.
Series
IEEE-RAS International Conference on Humanoid Robots, ISSN 2164-0572
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-344667
DOI: 10.1109/HUMANOIDS57100.2023.10375235
ISI: 001156965200096
Scopus ID: 2-s2.0-85164161523
OAI: oai:DiVA.org:kth-344667
DiVA id: diva2:1846976
Conference
IEEE-RAS 22nd International Conference on Humanoid Robots (Humanoids), DEC 12-14, 2023, Austin, TX
Note

QC 20240326

Part of ISBN 979-8-3503-0327-8

Available from: 2024-03-26. Created: 2024-03-26. Last updated: 2025-05-14. Bibliographically approved.
In thesis
1. Approach-constrained Grasp Synthesis and Interactive Perception for Rigid and Deformable Objects
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis introduces methods for two robotic tasks: grasp synthesis and deformable object manipulation. These tasks are connected by interactive perception, where robots actively manipulate objects to improve sensory feedback and task performance. Achieving a collision-free, successful grasp is essential for subsequent interaction, while effective manipulation of deformable objects broadens real-world applications. For robotic grasp synthesis, we address the challenge of approach-constrained grasping. We introduce two methods: GoNet and CAPGrasp. GoNet learns a grasp sampler that generates grasp poses with approach directions that lie in a selected discretized bin. In contrast, CAPGrasp enables sampling in a continuous space without requiring explicit approach direction annotations in the learning phase, improving the grasp success rate and providing more flexibility for imposing approach constraints. For robotic deformable object manipulation, we focus on manipulating deformable bags with handles, a common daily human activity. We first propose a method that captures scene dynamics and predicts future states in environments containing both rigid spheres and a deformable bag. Our approach employs an object-centric graph representation and an encoder-decoder framework to forecast future graph states. Additionally, we integrate an active camera into the system, explicitly considering the regularity and structure of motion to couple the camera with the manipulator for effective exploration.

To address the common data scarcity issue in both domains, we also develop simulation environments and propose annotated datasets for extensive benchmarking. Experimental results on both simulated and real-world platforms demonstrate the effectiveness of our methods compared to established baselines.
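
The dynamics-prediction method above is summarised only at a high level. As a hedged sketch under assumed details, the toy model below encodes an object-centric graph (per-node states plus sender/receiver edge indices), aggregates learned edge messages, and decodes the next node states; the layer sizes, the single message-passing step, and all names are illustrative assumptions rather than the thesis architecture.

```python
# Hedged sketch of a one-step graph encoder-decoder rollout; NOT the thesis model.
import torch
import torch.nn as nn

class GraphEncoderDecoder(nn.Module):
    def __init__(self, node_dim: int = 3, hidden_dim: int = 64):
        super().__init__()
        # Edge model: embeds (sender, receiver) state pairs into messages.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # Node model: decodes current state + aggregated messages into a state delta.
        self.node_mlp = nn.Sequential(
            nn.Linear(node_dim + hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, node_dim),
        )

    def forward(self, nodes: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        """nodes: (N, node_dim) states; edges: (E, 2) long tensor of (sender, receiver)."""
        senders, receivers = edges[:, 0], edges[:, 1]
        messages = self.edge_mlp(torch.cat([nodes[senders], nodes[receivers]], dim=-1))
        # Sum incoming messages per receiving node.
        agg = torch.zeros(nodes.size(0), messages.size(-1), dtype=messages.dtype)
        agg.index_add_(0, receivers, messages)
        # Residual update: next state = current state + learned delta.
        return nodes + self.node_mlp(torch.cat([nodes, agg], dim=-1))

if __name__ == "__main__":
    # Toy scene: two rigid spheres and three bag nodes, fully connected graph.
    model = GraphEncoderDecoder()
    nodes = torch.randn(5, 3)
    edges = torch.tensor([[i, j] for i in range(5) for j in range(5) if i != j])
    print(model(nodes, edges).shape)  # torch.Size([5, 3])
```

In the thesis setting such a predictor would be trained on simulated trajectories and rolled out over time, with an active camera supplying the observations; those components are not reproduced here.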


Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2025. p. 52
Series
TRITA-EECS-AVL ; 2025:63
National Category
Robotics and automation
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-363359 (URN)
978-91-8106-304-2 (ISBN)
Public defence
2025-06-10, https://kth-se.zoom.us/j/68663108750, D3, Lindstedtvägen 9, Stockholm, 14:30 (English)
Opponent
Supervisors
Note

QC 20250514

Available from: 2025-05-14. Created: 2025-05-14. Last updated: 2025-05-20. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Weng, Zehang; Lu, Haofei; Lundell, Jens; Kragic, Danica
