Interpretability in Contact-Rich Manipulation via Kinodynamic Images
Mitsioni, Ioanna. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning (RPL); Centre for Autonomous Systems (CAS). ORCID iD: 0000-0003-4933-1778
Mänttäri, Joonatan. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning (RPL); Centre for Autonomous Systems (CAS). ORCID iD: 0000-0003-2171-1429
Karayiannidis, Yiannis. Division of Systems and Control, Chalmers University of Technology, Gothenburg, Sweden. ORCID iD: 0000-0001-5129-342X
Folkesson, John. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning (RPL); Centre for Autonomous Systems (CAS). ORCID iD: 0000-0002-7796-1438
Kragic, Danica. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning (RPL); Centre for Autonomous Systems (CAS)
2021 (English). In: 2021 IEEE International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), 2021, pp. 10175-10181. Conference paper, published paper (refereed).
Abstract [en]

Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher, which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing kinodynamic images. We propose a methodology that creates images from the kinematic and dynamic data of contact-rich manipulation tasks. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train a Convolutional Neural Network (CNN), and we extract interpretations with Grad-CAM to produce visual explanations. Our method is versatile and can be applied to any classification problem in manipulation tasks to visually interpret which parts of the input drive the model's decisions and to distinguish its failure modes, regardless of the features used. Our experiments demonstrate that our method enables detailed visual inspections of sequences in a task, as well as high-level evaluations of a model's behavior.
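The paper itself is the reference for the exact construction; as a rough illustration of the pipeline the abstract describes, the sketch below builds an image-like array from a window of kinematic and dynamic signals and runs a standard Grad-CAM pass over a CNN. All function and variable names are illustrative, and the per-channel min-max normalization and row-per-signal layout are assumptions, not necessarily the paper's exact recipe.

```python
import numpy as np
import torch
import torch.nn.functional as F

def kinodynamic_image(window):
    """Sketch: stack a (T, D) window of kinematic/dynamic signals
    (positions, velocities, forces, torques, ...) into a 2D array,
    min-max normalizing each signal so all rows share one scale."""
    lo, hi = window.min(axis=0), window.max(axis=0)
    img = (window - lo) / np.maximum(hi - lo, 1e-8)
    return img.T.astype(np.float32)   # (D, T): one row per signal

def grad_cam(model, conv_layer, x, target_class):
    """Minimal Grad-CAM: weight the chosen conv layer's feature maps
    by the gradient of the target class score, ReLU, and normalize."""
    store = {}
    h1 = conv_layer.register_forward_hook(
        lambda m, i, o: store.update(feat=o))
    h2 = conv_layer.register_full_backward_hook(
        lambda m, gi, go: store.update(grad=go[0]))
    try:
        score = model(x)[0, target_class]   # x: (1, C, H, W)
        model.zero_grad()
        score.backward()
    finally:
        h1.remove(); h2.remove()
    w = store["grad"].mean(dim=(2, 3), keepdim=True)   # channel weights
    cam = F.relu((w * store["feat"]).sum(dim=1))       # (1, h', w')
    cam = cam / cam.max().clamp_min(1e-8)
    # Upsample to the input resolution to overlay on the kinodynamic image.
    return F.interpolate(cam[None], size=x.shape[-2:], mode="bilinear")[0]
```

Overlaid on the kinodynamic image, the resulting heatmap points at specific signals (rows) and time steps (columns) that drove the classification, which is how an interpretability module built for vision can be reused on non-visual manipulation data.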

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021, pp. 10175-10181
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-306890
DOI: 10.1109/ICRA48506.2021.9560920
ISI: 000771405403018
Scopus ID: 2-s2.0-85104066830
OAI: oai:DiVA.org:kth-306890
DiVA id: diva2:1624058
Conference
2021 IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 30 May 2021 through 5 June 2021
Note

Part of proceedings: ISBN 978-1-7281-9077-8

QC 20220503

Available from: 2022-01-03. Created: 2022-01-03. Last updated: 2025-02-09. Bibliographically approved.
In thesis
1. Safety Aspects of Data-Driven Control in Contact-Rich Manipulation
2022 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

A crucial step towards robot autonomy, in environments other than the strictly regulated industrial ones, is to create controllers capable of adapting to diverse conditions. Human-centric environments are filled with a plethora of objects with very distinct properties that can still be manipulated without the need to painstakingly model the interaction dynamics. Furthermore, we do not need an explicit model to safely complete our tasks; rather, we rely on an intuition about the evolution of the interaction that is built upon multiple repetitions of the same task.

Accurately translating this ability into how we control our robots in contact-rich tasks is almost infeasible if we rely on controllers that operate based on analytical models of the contacts. Instead, it is advantageous to utilize data-driven techniques that approximate the models based on interactions, much like humans do, and that encompass the varying dynamics with a single model. However, for this to be a feasible alternative, we need to consider the safety aspects that arise when we move away from rigorous mathematical models and replace them with approximate data-driven ones.

This thesis identifies three safety aspects of data-driven control in contact-rich manipulation: good predictive performance, increased interpretability of the models, and explicit consideration of safe inputs in the face of modelling errors or uninterpretable predictions. The first point is addressed through a model-training scheme that improves long-term predictions in a food-cutting task. The experiments show that models trained this way adapt to different dynamics efficiently and that their prediction error scales better with longer horizons. The second point is addressed by introducing a framework that allows the evaluation of data-driven classification models based on interpretability techniques. The interpretation of the model's decisions helps to anticipate failure cases before the model is deployed on the robot, as well as to understand what the models have learned. Finally, the third point is addressed by learning sets of safe states from data. These safe sets are then used to avoid dangerous control inputs in a control scheme that is flexible and adapts to dynamic variations while effectively encouraging the safety of the system.
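As a loose illustration of the third point only (the thesis's actual method is not reproduced here), one could estimate a safe set from logged states of successful executions and veto candidate control inputs whose predicted next state leaves it. The one-class SVM, the `predict_next` model, and the file name below are all hypothetical stand-ins for whatever set representation and dynamics model the thesis uses.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical log of states visited during safe, successful executions.
safe_states = np.load("safe_states.npy")          # shape (N, state_dim)
safe_set = OneClassSVM(nu=0.05, gamma="scale").fit(safe_states)

def filter_input(x, candidate_inputs, predict_next):
    """Return the first candidate input whose predicted next state the
    learned set still labels safe (+1); otherwise apply no input."""
    for u in candidate_inputs:
        x_next = predict_next(x, u).reshape(1, -1)
        if safe_set.predict(x_next)[0] == 1:
            return u
    return np.zeros_like(candidate_inputs[0])     # conservative fallback
```

The design choice the sketch captures is that safety is checked on predicted states rather than on inputs directly, so the same filter keeps working when the underlying dynamics model is swapped or retrained.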

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022. p. 57
Series
TRITA-EECS-AVL ; 2022:3
Keywords
Robotic manipulation, model learning
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-307662
ISBN: 978-91-8040-118-0
Public defence
2022-03-04, U1, Brinellvägen 26, floor 6, Stockholm, 09:00 (English)
Note

QC 20220203

Available from: 2022-02-03. Created: 2022-02-02. Last updated: 2025-02-09. Bibliographically approved.

Open Access in DiVA

fulltext: FULLTEXT01.pdf (application/pdf, 3537 kB)

Other links

Publisher's full text; Scopus

Authority records

Mitsioni, Ioanna; Mänttäri, Joonatan; Karayiannidis, Yiannis; Folkesson, John; Kragic, Danica
