Converting Deterministic Simulators to Realistic Stochastic Models via Data Alignment
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Konvertering av deterministiska simulatorer till realistiska stokastiska modeller via datajustering (Swedish)
Abstract [en]

Simulators are commonly used to train agents in Reinforcement Learning because they provide an abundance of data that in many cases can be generated faster than real time. However, the behaviors learned by the agent are often specific to attributes of the simulator and may not perform well when transferred to the real world. This thesis describes an algorithm that can be used to minimize the discrepancy between simulation and reality. Using this algorithm, it is possible both to identify simulator parameters that result in a more accurate simulation of reality and to learn a generative model whose output is close to real-world dynamics.

We first show how this algorithm works on a problem that can be solved analytically. We then demonstrate that the algorithm successfully handles more elaborate environments with physics simulation involving contact between objects and control actions.

Abstract [sv]

Simulering används vanligtvis för att träna agenter inom Reinforcement Learning eftersom de erbjuder stora mängder data som i många fall kan genereras fortare än realtid. Dock är de beteenden som agenten lär sig ofta specifika för simulatorns attribut och kommer inte nödvändigtvis att prestera väl när de överförs till verkligheten. Detta arbete beskriver en algoritm som kan användas för att minimera skillnaderna mellan simulering och verklighet. Med denna algoritm är det möjligt att både identifiera de simulatorparametrar som resulterar i bättre simulering av verkligheten, och att lära en generativ modell att producera data som liknar verklig dynamik.

Vi visar först att komponenterna som används i algoritmen är väl anpassade för att lösa ett analytiskt exempelproblem. Vi demonstrerar sedan att algoritmen med framgång hanterar mer sofistikerade miljöer med fysiksimulering som involverar kontakt mellan objekt samt styrsignaler.

Place, publisher, year, edition, pages
2019. p. 42
Series
TRITA-EECS-EX ; 2019:705
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-268867
OAI: oai:DiVA.org:kth-268867
DiVA, id: diva2:1395643
Subject / course
Computer Science
Educational program
Master of Science - Machine Learning
Supervisors
Examiners
Available from: 2020-02-24
Created: 2020-02-24
Last updated: 2022-06-26
Bibliographically approved

Open Access in DiVA

fulltext (5695 kB), 466 downloads
File information
File name: FULLTEXT01.pdf
File size: 5695 kB
Checksum (SHA-512): 721c1def882cf8b202a9f939078182519040928f28a403a521fa558f84b5bad93fd8b980e6444ab03e807a6d50a21fdd4789c7c4867b7e3251421502d09730a7
Type: fulltext
Mimetype: application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Computer and Information Sciences

Total: 467 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 524 hits