Investigation of Different Observation and Action Spaces for Reinforcement Learning on Reaching Tasks
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Undersökning av olika observations- och handlingsrymder för förstärkninginlärning på uppgifter inom nående (Swedish)
Abstract [en]

Deep reinforcement learning has been shown to be a potential alternative to a traditional controller for robotic manipulation tasks. Most modern deep reinforcement learning methods used for robotic control fall into the so-called model-free paradigm. While model-free methods require less space and generalize better than model-based methods, they suffer from higher sample complexity, which leads to sample inefficiency. In this thesis, we analyze three modern model-free deep reinforcement learning methods (deep Q-network, deep deterministic policy gradient, and proximal policy optimization) under different representations of the state-action space to gain better insight into the relation between sample complexity and sample efficiency. The experiments are conducted on two robotic reaching tasks. The experimental results show that the complexity of the observation and action spaces is closely related to sample efficiency during training. This conclusion is in line with corresponding theoretical work in the field.

Abstract [sv]

Djup förstärkningsinlärning har visats vara ett potentiellt alternativ till en traditionell kontrollör för robotmanipuleringsuppgifter. De flesta moderna förstärkninginlärningsmetoderna som används som robotkontrollör hamnar under den så kallade modellfria paradigmen. Medan modellfria metoder behöver mindre utrymme och har bättre generaliseringsmöjligheter i jämförelse med modellbaserade metoder, så lider de av högre urvalskomplexitet vilket leder till problem med urvalsineffektivitet. I detta examensarbete analyserar vi tre moderna djupa modellfria förstärkninginlärningsmetoder: deep Q-network, deep deterministic policy gradient och proximal policy optimization under olika representationer av tillstånds och handlingsrymden för att få en bättre inblick av relationen mellan urvalskomplexiteten och urvalseffektiviteten. De experimentella resultaten visar att komplexiteten av observations och handlingsrymden är högst relaterade till urvalseffektiviteten under träning. Denna slutsats överensstämmer med korresponderande teoretiska arbeten inom fältet.

Place, publisher, year, edition, pages
2019. p. 61
Series
TRITA-EECS-EX ; 2019:772
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-271182
OAI: oai:DiVA.org:kth-271182
DiVA, id: diva2:1415901
Subject / course
Computer Science
Educational program
Master of Science - Systems, Control and Robotics
Supervisors
Examiners
Available from: 2020-03-20 Created: 2020-03-20 Last updated: 2020-03-20
Bibliographically approved

Open Access in DiVA

fulltext (6031 kB), 2 downloads
File information
File name: FULLTEXT01.pdf
File size: 6031 kB
Checksum (SHA-512): a78d0dd8a1e70b99e1412062dcf94578d29a57461e4222a87f2af6bff5c89ab5baa5fcf06f2cea9572143a25eb77413b2a5895a2a9e393113e1628260dd6a0a2
Type: fulltext
Mimetype: application/pdf
