Stabilizing Side Effects of Experience Replay With Different Network Sizes for Deep Q-Network
KTH, School of Electrical Engineering and Computer Science (EECS).
2023 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]

This report investigates the effects of two different types of batch selection used for training a Deep Reinforcement Learning agent in games. More specifically, the impact of the two methods was tested for different sizes of Deep Neural Networks while using the Deep Q-Network (DQN) algorithm. The two methods investigated were Random batch selection and Combined Experience Replay (CER). Random batch selection is the most commonly used method, while CER is a more recent method with low additional computational cost. These two methods were tested on the two classic games Snake and Super Mario Bros, using DQN and a variety of Deep Neural Network sizes. It was seen that the CER method improved stability across the different network sizes while not reducing the speed of learning compared to the Random method. This reduces the difficulty of tuning the Deep Neural Network size while trying to optimise the agent.
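
The difference between the two batch selection methods named in the abstract can be illustrated with a minimal sketch. It assumes the usual description of CER, in which the most recent transition is always placed in the batch and the remaining slots are filled by uniform random sampling. The class and method names (`ReplayBuffer`, `sample_random`, `sample_cer`) are hypothetical and are not taken from the thesis, whose actual implementation is not shown in this record.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions (hypothetical helper, not from the thesis)."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample_random(self, batch_size):
        # Random batch selection: every stored transition is equally likely.
        if batch_size > len(self.buffer):
            raise ValueError("not enough transitions stored")
        return random.sample(list(self.buffer), batch_size)

    def sample_cer(self, batch_size):
        # CER (as assumed here): always include the newest transition,
        # then fill the remaining slots uniformly from the older ones.
        if batch_size > len(self.buffer):
            raise ValueError("not enough transitions stored")
        items = list(self.buffer)
        latest, older = items[-1], items[:-1]
        batch = random.sample(older, batch_size - 1)
        batch.append(latest)
        return batch


# Usage: transitions are stand-in integers instead of (state, action, reward, next_state).
buf = ReplayBuffer(capacity=50)
for t in range(100):
    buf.add(t)
random_batch = buf.sample_random(8)
cer_batch = buf.sample_cer(8)
```

The only extra work in `sample_cer` is appending one element, which is consistent with the abstract's remark that CER has low additional computational cost.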

Abstract [sv, translated to English]

This report examines the effects of two different types of batch selection used to train a Deep Reinforcement Learning agent in games. More specifically, the effect of the different methods was tested for different sizes of Deep Neural Networks used in the Deep Q-Network algorithm. The two methods examined were Random batch selection and Combined Experience Replay (CER). Random batch selection is the most widely used method, while CER is a newer method with low additional computational cost. These two methods were tested on the two classic games Snake and Super Mario Bros using the DQN algorithm and varying network sizes. It was seen that the CER method improved stability across the different network sizes while not reducing the speed of learning compared with Random batch selection. This reduces the difficulty of adjusting the size of the Deep Neural Network when trying to optimise the agent.

Place, publisher, year, edition, pages
2023. p. 103-111
Series
TRITA-EECS-EX ; 2023:142
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-341401
OAI: oai:DiVA.org:kth-341401
DiVA, id: diva2:1821276
Projects
Kandidatexjobb i elektroteknik 2023, KTH, Stockholm
Available from: 2023-12-19 Created: 2023-12-19
