A Value Iteration Algorithm for Partially Observed Markov Decision Process Multi-armed Bandits
UBC, Canada.
KTH, Superseded Departments, Signals, Sensors and Systems. ORCID iD: 0000-0002-1927-1690
KTH, Superseded Departments, Signals, Sensors and Systems.
2004 (English). Conference paper, Published paper (Refereed)
Abstract [en]

A value-iteration-based algorithm is given for computing the Gittins index of a Partially Observed Markov Decision Process (POMDP) multi-armed bandit problem. This problem concerns the dynamic allocation of effort among a number of competing projects, of which only one can be worked on in any time period. The active project evolves according to a finite-state Markov chain and then generates a reward, while the states of the idle projects remain fixed. In this contribution, it is assumed that the state of the active project can only be observed indirectly, through noisy observations. The objective is to find the optimal policy, based on partial information, for deciding which project to work on at each time in order to maximize the total expected reward. The solution is obtained by transforming the problem into a standard POMDP problem, for which efficient near-optimal algorithms exist. A numerical example from the field of task planning for an autonomous robot is presented to illustrate the algorithms.
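
To make the partially observed setting in the abstract concrete, here is a minimal sketch of the standard hidden Markov model filter. It updates the belief (the posterior distribution over states) of the active project after one noisy observation. It is not taken from the paper: the function name `belief_update`, the matrices `P` and `B`, and all numeric values are assumptions chosen purely for illustration.

```python
def belief_update(belief, P, B, y):
    """One step of the standard HMM filter for the active project.

    belief: list of length n, current distribution over the n states
    P:      n x n transition matrix, P[i][j] = Pr(next = j | current = i)
    B:      n x m observation matrix, B[j][y] = Pr(observation y | state j)
    y:      index of the observation actually received
    (All names and shapes are illustrative assumptions.)
    """
    n = len(belief)
    # Prediction: propagate the belief through the Markov chain.
    predicted = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each state by the likelihood of the observation.
    unnormalized = [predicted[j] * B[j][y] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical two-state project with two possible observations.
P = [[0.9, 0.1],
     [0.2, 0.8]]
B = [[0.8, 0.2],
     [0.3, 0.7]]
belief = [0.5, 0.5]

new_belief = belief_update(belief, P, B, y=0)
```

Because the idle projects do not change state, only the belief of the project currently being worked on needs to be updated in this way at each time period.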

Place, publisher, year, edition, pages
2004.
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-57927
OAI: oai:DiVA.org:kth-57927
DiVA: diva2:472700
Conference
Mathematical Theory of Networks and Systems (MTNS), Leuven, Belgium
Note
QC 20120301
Available from: 2012-01-04 Created: 2012-01-04 Last updated: 2013-09-05 Bibliographically approved

Open Access in DiVA

No full text

Other links

http://www.mtns2004.be/

By author/editor
Wahlberg, Bo; Lingelbach, Frank
By organisation
Signals, Sensors and Systems
Control Engineering
