Change search
ReferencesLink to record
Permanent link

Direct link
Two-target algorithms for infinite-armed bandits with Bernoulli rewards
Telecom Paris Tech.
KTH, School of Electrical Engineering (EES), Automatic Control. (ACL)
2013 (English)In: Advances in Neural Information Processing Systems 26 (2013), Morgan Kaufmann Publishers, 2013Conference paper (Refereed)
Abstract [en]

We consider an infinite-armed bandit problem with Bernoulli rewards. The mean rewards are independent, uniformly distributed over [0; 1]. Rewards 0 and 1 are referred to as a success and a failure, respectively. We propose a novel algorithm where the decision to exploit any arm is based on two successive targets, namely, the total number of successes until the first failure and until the first m failures, respectively, where m is a fixed parameter. This two-target algorithm achieves a long-term average regret in 2√n for a large parameter m and a known time horizon n. This regret is optimal and strictly less than the regret achieved by the best known algorithms, which is in 2√n The results are extended to any mean-reward distribution whose support contains 1 and to unknown time horizons. Numerical experiments show the performance of the algorithm for finite time horizon.

Place, publisher, year, edition, pages
Morgan Kaufmann Publishers, 2013.
, Advances in Neural Information Processing Systems, ISSN 1049-5258 ; 26
Keyword [en]
Bandit problems, Bernoulli, Best-known algorithms, Finite time horizon, M-failure, Novel algorithm, Numerical experiments, Time horizons
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
URN: urn:nbn:se:kth:diva-139319ScopusID: 2-s2.0-84898996413OAI: diva2:685137
27th Annual Conference on Neural Information Processing Systems, NIPS 2013; Lake Tahoe, NV; United States; 5 December 2013 through 10 December 2013
EU, European Research CouncilSwedish Research Council

QC 20140625

Available from: 2014-01-08 Created: 2014-01-08 Last updated: 2014-06-25Bibliographically approved

Open Access in DiVA

fulltext(251 kB)9 downloads
File information
File name FULLTEXT01.pdfFile size 251 kBChecksum SHA-512
Type fulltextMimetype application/pdf


Search in DiVA

By author/editor
Proutière, Alexandre
By organisation
Automatic Control
Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 9 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 25 hits
ReferencesLink to record
Permanent link

Direct link