Time-normalised discounting in reinforcement learning
KTH, School of Engineering Sciences (SCI).
KTH, School of Engineering Sciences (SCI).
2024 (English)
Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]

Reinforcement learning has emerged as a powerful paradigm in machine learning, witnessing remarkable progress in recent years. Among reinforcement learning algorithms, Q-learning stands out, enabling agents to learn quickly from past actions. This study aims to investigate and enhance Q-learning methodologies, with a specific focus on tabular Q-learning. In particular, it addresses Q-learning with an action space containing actions that require different amounts of time to execute. With such an action space, the algorithm might converge to a suboptimal solution when using a constant discount factor, since discounting occurs per action and not per time step. We refer to this issue as the non-temporal discounting (NTD) problem. By introducing a time-normalised discounting function, we were able to address the issue of NTD. In addition, we were able to stabilise the solution by implementing a cost for specific actions. As a result, the model converged to the expected solution. Building on these results, it would be wise to implement time-normalised discounting in a state-of-the-art reinforcement learning model such as deep Q-learning.
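The NTD problem described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact discounting function used in the thesis, so the code below assumes the common semi-Markov form, where the discount applied to the next state's value is `gamma ** duration` instead of a constant `gamma`. The corridor environment, the action names `step`, `jump` and `collect`, and all durations and rewards are hypothetical values chosen for illustration only.

```python
# Hypothetical toy corridor: cells 0..3. At cell 3 the only action,
# "collect", pays reward 1 and ends the episode.
GOAL = 3
TERMINAL = None

# action name -> (cells moved, duration in time steps); illustrative values
MOVES = {"step": (1, 1), "jump": (3, 5)}


def actions(state):
    """Actions available in a state (moves may not overshoot the goal)."""
    if state == GOAL:
        return ["collect"]
    return [a for a, (dx, _) in MOVES.items() if state + dx <= GOAL]


def transition(state, action):
    """Deterministic transition: returns (next_state, reward, duration)."""
    if action == "collect":
        return TERMINAL, 1.0, 1
    dx, duration = MOVES[action]
    return state + dx, 0.0, duration


def q_update(Q, state, action, alpha, gamma, time_normalised):
    """One tabular Q-learning update for a (state, action) pair."""
    next_state, reward, duration = transition(state, action)
    # Assumed time-normalised form: discount per elapsed time step,
    # not per action taken.
    discount = gamma ** duration if time_normalised else gamma
    if next_state is TERMINAL:
        best_next = 0.0
    else:
        best_next = max(Q[(next_state, a)] for a in actions(next_state))
    target = reward + discount * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])


def learn(time_normalised, gamma=0.9, alpha=1.0, sweeps=10):
    """Sweep all (state, action) pairs repeatedly until values settle."""
    Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in actions(s)}
    for _ in range(sweeps):
        for (s, a) in list(Q):
            q_update(Q, s, a, alpha, gamma, time_normalised)
    return Q


def greedy_action(Q, state):
    """Action with the highest learned value in a state."""
    return max(actions(state), key=lambda a: Q[(state, a)])


if __name__ == "__main__":
    for flag in (False, True):
        Q = learn(time_normalised=flag)
        print("time-normalised:", flag, "-> greedy at cell 0:", greedy_action(Q, 0))
```

In this sketch, three `step` actions reach the goal in 3 time steps, while one `jump` needs 5. With a constant discount per action, the single `jump` is discounted only once and so looks better than the faster route. With the assumed time-normalised discount, `jump` is discounted over its full duration and the greedy policy at cell 0 switches to `step`.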

 

Place, publisher, year, edition, pages
2024.
Series
TRITA-SCI-GRU ; 2024:255
Keywords [en]
Time-normalised discounting in reinforcement learning
National Category
Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-348807
OAI: oai:DiVA.org:kth-348807
DiVA, id: diva2:1878787
Subject / course
Mathematical Statistics
Educational program
Master of Science in Engineering - Engineering Mathematics
Supervisors
Examiners
Available from: 2024-06-27 Created: 2024-06-27 Last updated: 2024-06-27 Bibliographically approved

Open Access in DiVA

fulltext (1549 kB), 150 downloads
File information
File name: FULLTEXT01.pdf
File size: 1549 kB
Checksum (SHA-512): b85adf655fe9d3a65ea4e6948e85f7940c63291d9800bbc9254e6142a145ba167a9444139ec707022bfe1c61d60f94a32ce6f1666213d6b7045720621a50146a
Type: fulltext
Mimetype: application/pdf
