Partial caging: a clearance-based definition, datasets, and deep learning
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-3827-3824
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-0900-1523
2021 (English). In: Autonomous Robots, ISSN 0929-5593, E-ISSN 1573-7527, Vol. 45, no. 5, p. 647-664. Article in journal (Refereed). Published.
Abstract [en]

Caging grasps limit the mobility of an object to a bounded component of configuration space. We introduce a notion of partial cage quality based on the maximal clearance of an escaping path. Since computing this quality is demanding even in a two-dimensional scenario, we propose a deep learning approach. We design two convolutional neural networks and construct a pipeline for real-time planar partial cage quality estimation directly from 2D images of object models and planar caging tools. One neural network, CageMaskNN, identifies caging tool locations that can support partial cages, while a second network, CageClearanceNN, is trained to predict the quality of those configurations. A partial caging dataset of 3811 images of objects and more than 19 million caging tool configurations is used to train and evaluate these networks on previously unseen objects and caging tool configurations. Experiments show that evaluating a given configuration on a GeForce GTX 1080 GPU takes less than 6 ms. Furthermore, we curate an additional dataset focused on grasp-relevant configurations, consisting of 772 objects with 3.7 million configurations, and use it for 2D cage acquisition on novel objects. We study how network performance depends on the datasets, as well as how to deal efficiently with unevenly distributed training data. In further analysis, we show that the evaluation pipeline can approximately identify connected regions of successful caging tool placements, and we evaluate the continuity of the cage quality score along caging tool trajectories. Finally, the influence of disturbances is investigated and quantitative results are provided.
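The two-stage evaluation described in the abstract lends itself to a compact implementation. Below is a minimal PyTorch-style sketch, assuming a 2-channel rendering of object and caging tool as input; the backbone layer sizes, the input encoding, and the feasibility gating threshold are illustrative assumptions, not the paper's published architecture.

```python
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    """Small convolutional network over a 2-channel 64x64 rendering
    (channel 0: object silhouette, channel 1: caging tool placement).
    Layer sizes are illustrative, not the paper's architecture."""
    def __init__(self, out_dim: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

# Two networks in the roles the abstract names: one classifies whether a
# placement can support a partial cage, the other regresses its quality.
cage_mask_nn = SmallConvNet()       # feasibility logit
cage_clearance_nn = SmallConvNet()  # clearance-based quality score

def evaluate(config_image: torch.Tensor, threshold: float = 0.5) -> float:
    """Hypothetical two-stage gating: score a configuration only if the
    mask network deems the placement feasible; otherwise return 0.0."""
    with torch.no_grad():
        feasible = torch.sigmoid(cage_mask_nn(config_image)) > threshold
        if not bool(feasible):
            return 0.0
        return float(cage_clearance_nn(config_image))

# Example call on one random rendering (batch of 1).
score = evaluate(torch.rand(1, 2, 64, 64))
```

Gating a cheap feasibility classification in front of the quality regression keeps per-configuration evaluation fast, which is consistent with the sub-6 ms per-configuration figure reported above.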

Place, publisher, year, edition, pages
Springer Nature, 2021. Vol. 45, no. 5, p. 647-664
Keywords [en]
Artificial Intelligence
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-304608
DOI: 10.1007/s10514-021-09969-6
ISI: 000614686700001
Scopus ID: 2-s2.0-85100502579
OAI: oai:DiVA.org:kth-304608
DiVA, id: diva2:1609497
Funder
Knut and Alice Wallenberg Foundation
Note

QC 20211109

Available from: 2021-11-08. Created: 2021-11-08. Last updated: 2025-02-09. Bibliographically approved.
In thesis
1. Learning Structured Representations for Rigid and Deformable Object Manipulation
2021 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

The performance of learning based algorithms largely depends on the given representation of data. Therefore the questions arise, i) how to obtain useful representations, ii) how to evaluate representations, and iii) how to leverage these representations in a real-world robotic setting. In this thesis, we aim to answer all three of this questions in order to learn structured representations for rigid and deformable object manipulation. We firstly take a look into how to learn structured representation and show that imposing structure, informed from task priors, into the representation space is beneficial for certain robotic tasks. Furthermore we discuss and present suitable evaluation practices for structured representations as well as a benchmark for bimanual cloth manipulation. Finally, we introduce the Latent SpaceRoadmap (LSR) framework for visual action planning, where raw observations are mapped into a lower-dimensional latent space. Those are connected via the LSR, and visual action plans are generated that are able to perform a wide range of tasks. The framework is validated on a simulated rigid box stacking task, a simulated hybrid rope-box manipulation task, and a T-shirt folding task performed on a real robotic system.
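A rough sketch of the LSR idea, under loose assumptions: observations are embedded by a learned encoder, nearby latent states are connected into a graph, and a visual action plan corresponds to a shortest path whose nodes a decoder would render back into images. The encoder stub, distance threshold, and graph construction below are illustrative stand-ins, not the thesis implementation.

```python
import numpy as np
import networkx as nx

def encode(observation: np.ndarray) -> np.ndarray:
    """Stand-in for a learned encoder (e.g. a VAE) that maps a raw image
    to a low-dimensional latent vector; here we simply truncate."""
    return observation.reshape(-1)[:8].astype(np.float64)

def build_roadmap(latents: list, connect_radius: float) -> nx.Graph:
    """Connect latent states closer than connect_radius, weighting each
    edge by Euclidean distance."""
    g = nx.Graph()
    for i, z in enumerate(latents):
        g.add_node(i, z=z)
    for i in range(len(latents)):
        for j in range(i + 1, len(latents)):
            d = float(np.linalg.norm(latents[i] - latents[j]))
            if d < connect_radius:
                g.add_edge(i, j, weight=d)
    return g

# Toy "observations" standing in for images of manipulation states.
rng = np.random.default_rng(0)
observations = [rng.random((8, 8)) for _ in range(20)]
latents = [encode(o) for o in observations]
roadmap = build_roadmap(latents, connect_radius=1.5)

# A visual action plan is the cheapest latent path from start to goal;
# in the full framework each node is decoded back into an image.
plan = nx.shortest_path(roadmap, source=0, target=19, weight="weight")
print(plan)
```

In the full framework the roadmap nodes come from training data rather than random vectors, and an action proposal module supplies the robot commands that move the system between consecutive latent states along the planned path.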

Abstract [sv]

The performance of learning-based algorithms depends to a large extent on how the data is represented. For this reason, the following questions are posed: (i) how do we obtain useful representations, (ii) how do we evaluate them, and (iii) how can we use them in real robotic scenarios. In this thesis, we attempt to answer these questions in order to find learned, structured representations for the manipulation of rigid and non-rigid objects. First, we address how a structured representation can be learned and show that incorporating structure, through the use of statistical priors, is beneficial for certain robotic tasks. Furthermore, we discuss suitable approaches for evaluating structured representations, and present a standardized benchmark for cloth manipulation with two-armed robots. Finally, we introduce the Latent Space Roadmap (LSR) framework for visual action planning, where raw observations are mapped to a low-dimensional latent space. These points are connected with the help of the LSR, and visual action plans are generated for a simulated box-stacking task, for manipulation of a rope, and for folding T-shirts on a real robotic system.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2021. p. 44
Series
TRITA-EECS-AVL ; 2021:72
Keywords
Representation learning, Object Manipulation
National Category
Robotics and automation
Research subject
Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-304615
ISBN: 978-91-8040-050-3
Public defence
2021-11-09, https://kth-se.zoom.us/j/66216068903, Ångdomen, Osquars backe 31, Stockholm, 15:00 (English)
Note

QC 20211109

Available from: 2021-11-09. Created: 2021-11-08. Last updated: 2025-02-09. Bibliographically approved.

Open Access in DiVA

fulltext (11804 kB)
File name: FULLTEXT01.pdf
File size: 11804 kB
Checksum (SHA-512): 776fa4bd29147695af6afd3a06226ea2191371322336ca48b50e212543a6355f87383e61bfd13f533f3a07d216a776562e7cc84b6126a609cb846135e9c54ab7
Type: fulltext
Mimetype: application/pdf


Authority records

Welle, Michael C.; Varava, Anastasiia; Kragic, Danica; Pokorny, Florian T.

