The effect of target normalization and momentum on dying relu
Arnekvist, Isac. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-6824-6443
Pinto Basto de Carvalho, Joao Frederico. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS. ORCID iD: 0000-0002-8750-0897
Stork, Johannes Andreas. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-3958-6179
Kragic, Danica. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS. ORCID iD: 0000-0003-2965-2953
(English) Manuscript (preprint) (Other academic)
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-279032
OAI: oai:DiVA.org:kth-279032
DiVA, id: diva2:1457522
Note

QC 20200921

Available from: 2020-08-11 Created: 2020-08-11 Last updated: 2022-06-26. Bibliographically approved
In thesis
1. Transfer Learning using low-dimensional Representations in Reinforcement Learning
2020 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Behaviors in Reinforcement Learning (RL) are often learned tabula rasa, requiring many observations of and interactions with the environment. Performing this outside of a simulator, in the real world, often becomes infeasible due to the large number of interactions needed. This has motivated the use of Transfer Learning for Reinforcement Learning, where learning is accelerated by using experience from previous learning in related tasks. In this thesis, I explore how we can transfer from a simple single-object pushing policy to a wide array of non-prehensile rearrangement problems. I then explain how we can model task differences using a low-dimensional latent variable representation to make adaptation to novel tasks efficient. Lastly, the dependence on accurate function approximation is sometimes problematic, especially in RL, where the statistics of target variables are not known a priori. I present observations, along with explanations, showing that small target variances together with momentum optimization of ReLU-activated neural network parameters lead to dying ReLU.
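The dying-ReLU claim in the last sentence can be illustrated with a toy experiment. The sketch below is not the paper's experimental setup; the network size, learning rate, target function, and scales are illustrative assumptions. It fits targets of a chosen scale with a one-hidden-layer ReLU network trained by full-batch gradient descent with heavy-ball momentum, then reports the fraction of hidden units that are dead, i.e. output zero on every training input:

```python
import numpy as np

def dead_fraction(target_scale, lr=0.01, momentum=0.9, steps=3000,
                  hidden=64, seed=0):
    """Fit y = target_scale * sin(x) with a one-hidden-layer ReLU net
    trained by full-batch gradient descent with momentum; return the
    fraction of hidden units that never activate on the training data."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(256, 1))
    y = target_scale * np.sin(x)
    w1 = rng.normal(size=(1, hidden)) * np.sqrt(2.0)       # He init, fan_in = 1
    b1 = np.zeros(hidden)
    w2 = rng.normal(size=(hidden, 1)) * np.sqrt(2.0 / hidden)
    b2 = np.zeros(1)
    vel = [np.zeros_like(p) for p in (w1, b1, w2, b2)]
    n = len(x)
    for _ in range(steps):
        pre = x @ w1 + b1
        h = np.maximum(pre, 0.0)                           # ReLU
        err = h @ w2 + b2 - y                              # d(0.5*MSE)/d(pred)
        dh = (err @ w2.T) * (pre > 0)                      # gradient gated by ReLU
        grads = (x.T @ dh / n, dh.mean(0), h.T @ err / n, err.mean(0))
        for p, g, v in zip((w1, b1, w2, b2), grads, vel):
            v *= momentum                                  # heavy-ball update
            v += g
            p -= lr * v
    h = np.maximum(x @ w1 + b1, 0.0)
    return float(np.mean(np.all(h == 0.0, axis=0)))

# Compare tiny-variance targets against unit-scale (normalized) targets.
small, unit = dead_fraction(1e-4), dead_fraction(1.0)
print(small, unit)
```

Counting a unit as dead only when it is inactive on every training input mirrors the usual diagnostic: such a unit receives zero gradient through the ReLU and cannot recover under plain gradient-based training.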

Abstract [sv]

Successful learning of behaviors within Reinforcement Learning (RL) often happens tabula rasa and requires large amounts of observations and interactions. Using RL algorithms outside of simulation, in the real world, is therefore often not practically feasible. This has motivated studies in Transfer Learning for RL, where learning is accelerated by experience from previous learning of similar tasks. In this licentiate thesis, I explore how we can achieve transfer from a simpler manipulation policy to a larger collection of rearrangement problems. I then describe how we can model how different learning problems differ using a low-dimensional parameterization, and thereby make learning of new problems more efficient. The dependence on good function approximation is sometimes problematic, especially in RL, where the statistics of target variables are not known in advance. Finally, I therefore present observations, and explanations, that small variances of target variables together with momentum optimization lead to dying ReLU.
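The low-dimensional task parameterization mentioned in the abstract can be sketched in a hypothetical linear toy, which is an illustrative assumption and not the thesis's method: each task is a linear map generated from shared basis matrices by a 2-D latent vector, so adapting to a new task means optimizing only that latent instead of all model weights.

```python
import numpy as np

# Hypothetical linear toy: each task t is a map y = A(z_t) x, where A is
# built from shared basis matrices weighted by a low-dimensional latent z_t.
rng = np.random.default_rng(1)
latent_dim, x_dim = 2, 4
basis = rng.normal(size=(latent_dim, x_dim, x_dim))   # shared across tasks

def task_matrix(z):
    # A(z) = sum_k z[k] * basis[k]
    return np.tensordot(z, basis, axes=1)

z_true = rng.normal(size=latent_dim)                  # the new task's latent
x = rng.normal(size=(64, x_dim))
y = x @ task_matrix(z_true).T                         # data from the new task

# Adapt by gradient descent on z alone; the loss is quadratic in z.
z = np.zeros(latent_dim)
for _ in range(500):
    err = x @ task_matrix(z).T - y
    # d(0.5 * mean_i ||err_i||^2) / dz_k = mean_i <err_i, basis[k] @ x_i>
    grad = np.array([np.sum(err * (x @ basis[k].T))
                     for k in range(latent_dim)]) / len(x)
    z -= 0.02 * grad

print(np.round(z - z_true, 3))   # residual after adaptation
```

Because only two scalars are optimized, adaptation to a novel task needs far fewer updates than retraining a full network, which is the efficiency argument the abstract makes for low-dimensional task representations.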

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2020. p. 123
Series
TRITA-EECS-AVL ; 2020:39
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-279120
ISBN: 978-91-7873-593-8
Presentation
2020-09-22, 304, Teknikringen 14, Stockholm, 10:00 (English)
Opponent
Supervisors
Note

QC 20200819

Available from: 2020-08-19 Created: 2020-08-16 Last updated: 2022-06-26. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Arnekvist, Isac; Pinto Basto de Carvalho, Joao Frederico; Stork, Johannes Andreas; Kragic, Danica

Search in DiVA

By author/editor
Arnekvist, Isac; Pinto Basto de Carvalho, Joao Frederico; Stork, Johannes Andreas; Kragic, Danica
By organisation
Robotics, Perception and Learning, RPL; Centre for Autonomous Systems, CAS
Engineering and Technology

Search outside of DiVA

Google
Google Scholar
