An optimal algorithm for stochastic matroid bandit optimization
KTH, School of Electrical Engineering (EES), Automatic Control. ORCID iD: 0000-0002-1934-7421
KTH, School of Electrical Engineering (EES), Automatic Control.
2016 (English). In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2016, pp. 548-556. Conference paper (Refereed)
Abstract [en]

The selection of leaders in leader-follower multi-agent systems can be naturally formulated as a matroid optimization problem. In this paper, we investigate the online and stochastic version of such a problem, where in each iteration or round, we select a set of leaders and then observe a random realization of the corresponding reward, i.e., of the system performance. This problem is referred to as a stochastic matroid bandit, a variant of combinatorial multi-armed bandit problems where the underlying combinatorial structure is a matroid. We consider semi-bandit feedback and Bernoulli rewards, and derive a tight, problem-dependent lower bound on the regret of any consistent algorithm. We propose KL-OSM, a computationally efficient algorithm that exploits the matroid structure. We derive a finite-time upper bound on the regret of KL-OSM that improves on the performance guarantees of existing algorithms. This upper bound matches our lower bound, i.e., KL-OSM is asymptotically optimal. Numerical experiments attest that KL-OSM outperforms state-of-the-art algorithms in practice, and that the difference is significant in some cases. Copyright © 2016, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org).
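
The setting described in the abstract (semi-bandit feedback, Bernoulli rewards, a matroid constraint on the selected set) can be illustrated with a small simulation. The sketch below is not the authors' KL-OSM algorithm, whose details are not given in this record. It is a generic illustration that assumes a KL-based upper confidence index per arm, a plain log(t) exploration budget, greedy basis selection, and a uniform matroid (any set of at most m arms is independent). All function and parameter names are hypothetical.

```python
import math
import random


def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, iters=30):
    """Approximate the largest q >= mean with pulls * KL(mean, q) <= log(t), by bisection.

    The log(t) budget is an assumption made for this sketch.
    """
    if pulls == 0:
        return 1.0  # unexplored arms get the maximal index
    budget = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def greedy_basis(weights, is_independent):
    """Greedy maximum-weight independent set: scan arms by decreasing weight,
    keeping an arm whenever the chosen set stays independent."""
    chosen = []
    for arm in sorted(range(len(weights)), key=lambda a: -weights[a]):
        if is_independent(chosen + [arm]):
            chosen.append(arm)
    return chosen


def run_bandit(true_means, m, rounds, seed=0):
    """Simulate the index policy on a uniform matroid of rank m; return pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    sums = [0.0] * k

    def uniform_matroid(subset):
        return len(subset) <= m  # hypothetical independence oracle

    for t in range(1, rounds + 1):
        index = [
            kl_ucb_index(sums[a] / pulls[a] if pulls[a] else 0.0, pulls[a], t)
            for a in range(k)
        ]
        for a in greedy_basis(index, uniform_matroid):
            # Semi-bandit feedback: the reward of each selected arm is observed.
            reward = 1.0 if rng.random() < true_means[a] else 0.0
            pulls[a] += 1
            sums[a] += reward
    return pulls
```

A call such as `run_bandit([0.9, 0.8, 0.3, 0.2], m=2, rounds=500)` returns the number of times each arm was selected. Other matroids can be tried by passing a different independence oracle to `greedy_basis`.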

Place, publisher, year, edition, pages
International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2016. pp. 548-556.
Keyword [en]
Combinatorial optimization, Matroids, Multi-Armed bandits, Online learning, Regret analysis, Autonomous agents, Combinatorial mathematics, Computer software maintenance, Iterative methods, Matrix algebra, Optimization, Stochastic systems, Asymptotically optimal, Combinatorial structures, Computationally efficient, Multi armed bandit, Multi-armed bandit problem, State-of-the-art algorithms, Multi agent systems
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-207513; ScopusID: 2-s2.0-85014273520; ISBN: 9781450342391; OAI: oai:DiVA.org:kth-207513; DiVA: diva2:1106109
Conference
15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2016, 9 May 2016 through 13 May 2016
Note

Conference code: 126305; Export Date: 22 May 2017; Conference Paper. QC 20170607

Available from: 2017-06-07. Created: 2017-06-07. Last updated: 2017-06-07. Bibliographically approved.

By author/editor
Talebi, Mohammad Sadegh; Proutiere, Alexandre