Hybrid model based hierarchical reinforcement learning for contact rich manipulation task
KTH, School of Electrical Engineering and Computer Science (EECS).
2020 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Hybridmodellbaserad hierarkisk förstärkningsinlärning för kontaktrik manipulationsuppgift (Swedish)
Abstract [en]

Contact-rich manipulation tasks form a crucial class of applications in industrial, medical and household settings, and they require strong interaction with a complex environment. To engage in such tasks efficiently and with human-like agility, a method is needed that can handle contact-rich scenarios effectively. In this work, contact-rich tasks are approached from the perspective of a hybrid dynamical system. A novel hierarchical reinforcement learning method is developed: model-based option-critic, which makes extensive use of the structure of the hybrid dynamical model of contact-rich tasks. The proposed method outperforms the state-of-the-art method PPO, as well as the earlier hierarchical reinforcement learning method option-critic, in its ability to adapt to uncertainty and changes in contact-rich tasks.

Place, publisher, year, edition, pages
2020. p. 71
Series
TRITA-EECS-EX ; 2020:743
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-284149
OAI: oai:DiVA.org:kth-284149
DiVA, id: diva2:1476751
Supervisors
Examiners
Available from: 2020-10-16
Created: 2020-10-15
Last updated: 2022-06-25
Bibliographically approved

Open Access in DiVA

fulltext (9575 kB), 540 downloads
File information
File name: FULLTEXT01.pdf
File size: 9575 kB
Checksum SHA-512:
f5550ff93139535e64bd279b597adff33b3432ccf06e244a9d6a3ea6348707be4a6a700cf3d9dc6062208a61843d98e3a3e005827330728f60e96639113f9ff0
Type: fulltext
Mimetype: application/pdf
