The Cost of Uncertainty in Self-play Reinforcement Learning and Search
2025 (English). In: 2025 IEEE International Conference on Intelligence and Security Informatics (ISI), Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 113-120. Conference paper, Published paper (Refereed)
Abstract [en]
The combination of reinforcement learning and look-ahead search introduced in AlphaGo has revolutionized our understanding of tactics and strategy in classical strategy games such as Go and chess. Until recently, this pioneering approach was limited to perfect information games, where players have full information about the current state of the game. This paper investigates the recent generalization of reinforcement learning with search to imperfect information games, such as poker, where parts of the game state, e.g., the opponent’s hand, are hidden from the player. The paper explores how well this approach scales as the amount of hidden information increases. To this end, the current state of the art in reinforcement learning with search, the Student of Games general learning algorithm, is reproduced and evaluated across three variants of a custom poker game, each differing in the number of hidden cards dealt to players. It is found that games with less hidden information are learned more effectively, and that computational demands scale sublinearly with increasing hidden information.
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. p. 113-120
Keywords [en]
computer poker, counterfactual regret minimization, imperfect information games, reinforcement learning, student of games algorithm, tree search
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-377986
DOI: 10.1109/ISI65680.2025.11201174
Scopus ID: 2-s2.0-105030660994
OAI: oai:DiVA.org:kth-377986
DiVA, id: diva2:2045402
Conference
21st Annual IEEE International Conference on Intelligence and Security Informatics, ISI 2025, Hong Kong, China, July 12-13, 2025
Note
Part of ISBN 9798331512767
QC 20260312
2026-03-12, 2026-03-12, 2026-03-12. Bibliographically approved