Multi-optima exploration with adaptive Gaussian mixture model
KTH, School of Computer Science and Communication (CSC).
2012 (English). In: Development and Learning and Epigenetic Robotics (ICDL), 2012 IEEE International Conference on, IEEE, 2012, 6400808. Conference paper (Refereed)
Abstract [en]

In learning-by-exploration problems such as reinforcement learning (RL), direct policy search, stochastic optimization or evolutionary computation, the goal of an agent is to maximize some form of reward function (or minimize a cost function). Often, these algorithms are designed to find a single policy solution. We address the problem of representing the space of control policy solutions by considering exploration as a density estimation problem. Such a representation provides additional information, such as the shape and curvature of local peaks, that can be exploited to analyze the discovered solutions and guide the exploration. We show that the search process can easily be generalized to multi-peaked distributions by employing a Gaussian mixture model (GMM) with an adaptive number of components. The GMM has a dual role: representing the space of possible control policies, and guiding the exploration of new policies. A variation of expectation-maximization (EM) applied to reward-weighted policy parameters is presented to model the space of possible solutions, as if this space were a probability distribution. The approach is tested in a dart game experiment formulated as a black-box optimization problem, where the agent's throwing capability increases while it searches for the best strategy to play the game. This experiment is used to study how the proposed approach can exploit new promising solution alternatives in the search process when the optimality criterion slowly drifts over time. The results show that the proposed multi-optima search approach can anticipate such changes by exploiting promising candidates to smoothly adapt to the change of the global optimum.
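The idea of applying EM to reward-weighted policy parameters can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a one-dimensional policy parameter, non-negative rewards used directly as sample weights, and a fixed number of components (the adaptive component count described in the abstract is omitted). The names `reward_weighted_em` and `two_peak_reward`, and the two-peak reward itself, are hypothetical and chosen only for illustration.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def reward_weighted_em(samples, rewards, means, variances, priors,
                       n_iter=30, min_var=1e-6):
    """Fit a fixed-size 1-D GMM to reward-weighted policy parameters.

    Assumption: rewards are non-negative, so they can act as sample weights.
    """
    means, variances, priors = list(means), list(variances), list(priors)
    k = len(means)
    total_reward = sum(rewards)
    if total_reward <= 0:
        raise ValueError("at least one sample needs a positive reward")
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [priors[j] * gauss_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total if total > 0 else 1.0 / k for pj in p])
        # M-step: ordinary GMM updates, with each sample scaled by its reward.
        for j in range(k):
            w = [r * row[j] for r, row in zip(rewards, resp)]
            w_sum = sum(w)
            if w_sum <= 0:
                continue  # component received no reward mass; leave it unchanged
            means[j] = sum(wi * x for wi, x in zip(w, samples)) / w_sum
            var = sum(wi * (x - means[j]) ** 2 for wi, x in zip(w, samples)) / w_sum
            variances[j] = max(var, min_var)
            priors[j] = w_sum / total_reward
    return means, variances, priors


def two_peak_reward(x):
    """Hypothetical reward with two equally good optima at x = -2 and x = 3."""
    return math.exp(-(x + 2.0) ** 2 / 0.5) + math.exp(-(x - 3.0) ** 2 / 0.5)


# Stand-in for explored policies: an evenly spaced grid over [-5, 5].
samples = [-5.0 + 10.0 * i / 400 for i in range(401)]
rewards = [two_peak_reward(x) for x in samples]
means, variances, priors = reward_weighted_em(
    samples, rewards, means=[-1.0, 1.0], variances=[1.0, 1.0], priors=[0.5, 0.5]
)
```

Because each sample's contribution is scaled by its reward, the fitted mixture concentrates on high-reward regions, and each component's variance gives a rough picture of how wide the corresponding peak is. In an exploration loop, new candidate policies could then be drawn from the fitted mixture, which corresponds to the second of the two roles the abstract attributes to the GMM.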

Place, publisher, year, edition, pages
IEEE, 2012. 6400808
Keyword [en]
Adaptive Gaussian mixture, Black-box optimization, Control policy, Density estimation, Direct policy search, Dual role, Expectation Maximization, Gaussian Mixture Model, Global optimum, Number of components, Optimality criteria, Reward function, Search process, Stochastic optimizations
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-118394
DOI: 10.1109/DevLrn.2012.6400808
ScopusID: 2-s2.0-84872867195
ISBN: 978-146734963-5
OAI: diva2:606542
2012 IEEE International Conference on Development and Learning and Epigenetic Robotics, ICDL 2012, 7 November 2012 through 9 November 2012, San Diego, CA

QC 20130219

Available from: 2013-02-19. Created: 2013-02-18. Last updated: 2013-02-19. Bibliographically approved.

By author/editor
Pervez, Affan