Reinforcement Learning for Pickup and Delivery Systems
KTH, School of Electrical Engineering and Computer Science (EECS).
2023 (English). Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

In this project, multi-agent reinforcement learning (RL) for a warehouse environment with robots delivering packages has been studied. This was done by first implementing the RL algorithm Q-learning and investigating how the parameters of Q-learning affect the performance of the algorithm. Q-learning was successfully implemented with both centralized and decentralized learning. The performance of these implementations was compared, and the results show faster convergence for the decentralized version. Due to the large memory requirements of the Q-learning implementation, only relatively small environments could be simulated.

Due to these limitations of Q-learning, a Deep Q-Network (DQN) was implemented to try to achieve scalability. Unfortunately, initial problems with the convergence of the network, and later long run times, led to a shortage of time for studying DQN, so its scalability was not investigated thoroughly. For DQN, convergence was achieved and confirmed for a 3x3 grid with one agent and one package. Although the results are not conclusive, with more tuning of the network's hyperparameters and a more efficient implementation of the environment, DQN seems to be a promising extension of Q-learning for the environment presented in the project.
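To make the Q-learning setup concrete, the following is a minimal illustrative sketch of tabular Q-learning for a single agent on a 3x3 grid with one package (the grid size matches the DQN experiment). The environment dynamics, reward values, and hyperparameters here are assumptions for demonstration, not the thesis' actual implementation.

```python
# Illustrative sketch only: tabular Q-learning on a toy 3x3 pickup-and-delivery
# gridworld. Rewards and hyperparameters are assumed, not taken from the thesis.
import random

import numpy as np

random.seed(0)

GRID = 3
PICKUP, DROPOFF = (0, 0), (2, 2)          # assumed package and goal locations
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1     # assumed hyperparameters
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# State: (row, col, carrying-flag) -> one Q-value per action.
Q = np.zeros((GRID, GRID, 2, len(ACTIONS)))

def step(state, a):
    """Apply action a; reward +1 for pickup, +10 for delivery, -0.1 per step."""
    r, c, carrying = state
    dr, dc = ACTIONS[a]
    r = min(max(r + dr, 0), GRID - 1)     # clip moves at the grid border
    c = min(max(c + dc, 0), GRID - 1)
    reward, done = -0.1, False
    if not carrying and (r, c) == PICKUP:
        carrying, reward = 1, 1.0
    elif carrying and (r, c) == DROPOFF:
        reward, done = 10.0, True
    return (r, c, carrying), reward, done

for episode in range(2000):
    state, done = (1, 1, 0), False        # start in the grid center, empty-handed
    for t in range(100):                  # cap episode length
        # epsilon-greedy action selection
        if random.random() < EPSILON:
            a = random.randrange(len(ACTIONS))
        else:
            a = int(np.argmax(Q[state]))
        nxt, reward, done = step(state, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = reward + (0.0 if done else GAMMA * np.max(Q[nxt]))
        Q[state][a] += ALPHA * (target - Q[state][a])
        state = nxt
        if done:
            break
```

The table indexed by (row, col, carrying) illustrates why tabular Q-learning does not scale: the state space, and with it the memory requirement, grows exponentially with the grid size and the number of agents and packages, which is what motivates the DQN extension.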

Place, publisher, year, edition, pages
2023, p. 113-121
Series
TRITA-EECS-EX ; 2023:143
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-341485
OAI: oai:DiVA.org:kth-341485
DiVA, id: diva2:1821877
Projects
Kandidatexjobb i elektroteknik 2023, KTH, Stockholm
Available from: 2023-12-21 Created: 2023-12-21

Open Access in DiVA

File name: FULLTEXT01.pdf
File size: 211487 kB
Checksum: SHA-512
69786101c351a58f7bd524c3aeee40c661028b577366c4a725033372b88c624c87c2183b6acca2d3d43bbd2bb2f3942326c69263e70c99cf1db027ce9c4e9ae2
Type: fulltext
Mimetype: application/pdf
