Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Structured Stochastic Bandits
KTH, School of Electrical Engineering (EES), Automatic Control.ORCID iD: 0000-0003-2988-8464
2016 (English)Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. Particularly, we investigate the case when the expected rewards are a Lipschitz function of the arm, and the learning to rank problem, as viewed from a MAB perspective. For the former, we derive a problem specific lower bound and propose both an asymptotically optimal algorithm (OSLB) and a (pareto)optimal, algorithm (POSLB). For the latter, we construct the regret lower bound and determine its closed form for some particular settings, as well as propose two asymptotically optimal algorithms PIE and PIE-C. For all algorithms mentioned above, we present performance analysis in the form of theoretical regret guarantees as well as numerical evaluation on artificial datasets as well as real-world datasets, in the case of PIE and PIE-C.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016. , vii, 94 p.
Series
TRITA-EE, ISSN 1653-5146 ; 2016:021
Keyword [en]
Multi-armed bandits, Learning to rank, reinforcement learning, Lipschitz Bandits
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-182816ISBN: 978-91-7595-880-4 (print)OAI: oai:DiVA.org:kth-182816DiVA: diva2:905873
Presentation
2016-03-15, Q2, Osquldasväg 10, KTH, Stockholm, 10:00 (English)
Opponent
Supervisors
Funder
EU, European Research Council
Note

QC 20160223

Available from: 2016-02-23 Created: 2016-02-23 Last updated: 2016-02-23Bibliographically approved

Open Access in DiVA

Thesis(931 kB)231 downloads
File information
File name FULLTEXT01.pdfFile size 931 kBChecksum SHA-512
ac58d46e6fa9f1128ea0f1c876058c39bc2fcaa63f102c14ed88ee62c5750678d18a0e20f57ec0a3d53d7181a1cf6716c9a4baed1e7fdae2ad64a3c424ec89c7
Type fulltextMimetype application/pdf

Authority records BETA

Magureanu, Stefan

Search in DiVA

By author/editor
Magureanu, Stefan
By organisation
Automatic Control
Other Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 231 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 449 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf