Självlärande Dots & Boxes-spelare (Self-learning Dots & Boxes player).
KTH, School of Computer Science and Communication (CSC).
2011 (Swedish)
Independent thesis Advanced level (professional degree), 10 credits / 15 HE credits
Student thesis
Abstract [en]

This report concerns the reinforcement learning algorithm Q-Learning. The purpose of the work is to implement a self-learning Dots & Boxes player which, after training, is evaluated against two pre-programmed players. I investigated how the training period affects the strength of the self-learning player by varying how long it is allowed to explore the possible states of the game. The results are presented in graphs that are analyzed throughout the work. The self-learning player and the Q-Learning algorithm are analyzed to find out what the player has learned and how it acquired its strategies during the training period. I found that the self-learning player needs to play several hundred thousand games against itself before it stops learning. In all of the tests the self-learning player became better than my pre-programmed players; it even became good enough to beat me in the majority of the games I played against it.
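
As a rough illustration of the kind of algorithm the abstract refers to, the following is a minimal sketch of tabular Q-Learning with epsilon-greedy exploration. It is not taken from the thesis: the function names, the `(state, action)` table keys, and the parameters `alpha`, `gamma` and `epsilon` are assumptions chosen for the example, and the thesis's actual state encoding, rewards and settings are not given in this record.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-Learning step.

    Moves Q(s, a) toward reward + gamma * max over a' of Q(s', a').
    An empty next_actions list marks a terminal state (future value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and actions are placeholders, not a real board.
q = defaultdict(float)          # unseen (state, action) pairs start at 0.0
rng = random.Random(0)
action = epsilon_greedy(q, "start", [0, 1, 2], epsilon=0.2, rng=rng)
q_update(q, "start", action, reward=1.0, next_state="end", next_actions=[])
```

A larger `epsilon` makes the player try unfamiliar moves more often, which is one simple way to control how much of the state space is explored during training.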

Place, publisher, year, edition, pages
Kandidatexjobb CSC, K11065
National Category
Computer Science
URN: urn:nbn:se:kth:diva-130855
OAI: diva2:654301
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2013-10-07 Created: 2013-10-07

Open Access in DiVA

No full text
