Computing monotone policies for Markov decision processes by exploiting sparsity
The University of British Columbia. (Department of Electrical & Computer Engineering)
KTH, School of Electrical Engineering (EES), Automatic Control. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre. (System Identification Group) ORCID iD: 0000-0003-0355-2663
KTH, School of Electrical Engineering (EES), Automatic Control. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre. (System Identification Group) ORCID iD: 0000-0002-1927-1690
2013 (English). In: 2013 3rd Australian Control Conference, AUCC 2013, IEEE, 2013, article no. 6697239. Conference paper, Published paper (Refereed)
Abstract [en]

This paper considers Markov decision processes whose optimal policy is a randomized mixture of monotone increasing policies. Such monotone policies have an inherent sparsity structure. We present a two-stage convex optimization algorithm for computing the optimal policy that exploits this sparsity. The first stage combines the alternating direction method of multipliers (ADMM), used to solve a linear programming problem in the joint action-state probabilities, with a sub-gradient step that promotes the monotone sparsity pattern in the conditional probabilities of the action given the state. In the second stage, sum-of-norms regularization is used to emphasize the monotone structure of the optimal policy.
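
As an illustration of the quantities named in the abstract, the following minimal sketch converts a table of joint action-state probabilities into the conditional probabilities of the action given the state, and then checks a simple sparsity pattern. It is not the paper's algorithm: it contains no ADMM, sub-gradient, or sum-of-norms step. The function names, the tolerance `tol`, and the example table are hypothetical. The check relies on one assumption stated here: if a policy is a mixture of monotone increasing deterministic policies, then the smallest and largest actions used in each state are both nondecreasing in the state, so the test is a necessary condition only, not a sufficient one.

```python
def conditional_policy(joint):
    """Convert joint[x][u] = P(state x, action u) into policy[x][u] = P(u | x)."""
    policy = []
    for row in joint:
        mass = sum(row)
        if mass <= 0:
            raise ValueError("state has zero probability; P(u | x) is undefined")
        policy.append([p / mass for p in row])
    return policy


def support(policy, tol=1e-9):
    """List, for each state, the actions chosen with non-negligible probability."""
    return [[u for u, p in enumerate(row) if p > tol] for row in policy]


def has_monotone_support(policy, tol=1e-9):
    """Necessary (not sufficient) check for a mixture of monotone increasing policies.

    Assumption of this sketch: the lowest and the highest action in the
    support must each be nondecreasing as the state index increases.
    """
    supp = support(policy, tol)
    lower = [min(actions) for actions in supp]
    upper = [max(actions) for actions in supp]
    lower_ok = all(a <= b for a, b in zip(lower, lower[1:]))
    upper_ok = all(a <= b for a, b in zip(upper, upper[1:]))
    return lower_ok and upper_ok


# Hypothetical example: 3 states, 2 actions, randomization only in state 1.
joint = [[0.3, 0.0],
         [0.1, 0.1],
         [0.0, 0.5]]
policy = conditional_policy(joint)
```

In this made-up example only the middle state randomizes, so the conditional table has four nonzero entries out of six, and `has_monotone_support(policy)` accepts it. A table whose lowest used action drops from one state to the next would be rejected.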

Place, publisher, year, edition, pages
IEEE, 2013. Article no. 6697239.
Keyword [en]
Alternating direction method of multipliers, Conditional probabilities, Convex optimization algorithms, Linear programming problem, Markov Decision Processes, Sparsity patterns, Sparsity structure, State probability
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-137318
DOI: 10.1109/AUCC.2013.6697239
ISI: 000330581100001
Scopus ID: 2-s2.0-84893279383
ISBN: 978-1-4799-2497-4 (print)
OAI: oai:DiVA.org:kth-137318
DiVA: diva2:678713
Conference
2013 3rd Australian Control Conference, AUCC 2013; Fremantle, WA; Australia; 4 November 2013 through 5 November 2013
Note

QC 20140225

Available from: 2013-12-12. Created: 2013-12-12. Last updated: 2014-03-20. Bibliographically approved.

Authority records

Rojas, Cristian R.; Wahlberg, Bo
