Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach
KTH, School of Electrical Engineering (EES), Automatic Control.
KTH, School of Electrical Engineering (EES), Automatic Control.
KTH, School of Electrical Engineering (EES), Automatic Control. ORCID iD: 0000-0002-1927-1690
2017 (English) In: IFAC-PapersOnLine, ISSN 2405-8963, Vol. 50, no 1, p. 8429-8434. Article in journal (Refereed), Published
Abstract [en]

This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative methods can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating direction method of multipliers (ADMM) can be significantly accelerated using the regularization step.
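
As a rough illustration of the idea behind a nearly-isotonic penalty, the sketch below sums the positive parts of consecutive decreases in a sequence, so the penalty is zero exactly when the sequence is nondecreasing. This is the standard penalty form from nearly-isotonic regression, not the paper's exact formulation: the abstract does not say how the penalty is applied to the conditional action probabilities, so applying it to the action indices of a deterministic policy is an assumption made here for illustration. The names `nearly_isotonic_penalty` and `is_monotone_policy` are hypothetical.

```python
from typing import Sequence


def nearly_isotonic_penalty(x: Sequence[float]) -> float:
    """Sum of max(x[i] - x[i+1], 0) over consecutive pairs.

    Zero if and only if x is nondecreasing; each decrease adds its size.
    """
    return float(sum(max(a - b, 0.0) for a, b in zip(x, x[1:])))


def is_monotone_policy(policy: Sequence[int]) -> bool:
    """True if the action index never decreases as the state index grows."""
    return all(a <= b for a, b in zip(policy, policy[1:]))


# Hypothetical deterministic policies: entry s is the action taken in state s.
monotone = [0, 0, 1, 1, 2]
violating = [0, 1, 0, 2, 1]

for name, policy in [("monotone", monotone), ("violating", violating)]:
    print(name, is_monotone_policy(policy), nearly_isotonic_penalty(policy))
```

Because the penalty only charges for decreases, adding it to an objective nudges a solution toward monotonicity without forcing it, which is the "nearly" in nearly-isotonic.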

Place, publisher, year, edition, pages
Elsevier, 2017. Vol. 50, no 1, p. 8429-8434
Keyword [en]
alternating direction method of multipliers (ADMM), isotonic regression, l1-regularization, Markov decision process (MDP), monotone policy, sparsity, stochastic control
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-223069
DOI: 10.1016/j.ifacol.2017.08.1575
ISI: 000423964900394
Scopus ID: 2-s2.0-85031809673
OAI: oai:DiVA.org:kth-223069
DiVA, id: diva2:1182465
Funder
Swedish Research Council, 2016-06079
Note

QC 20180213, Funding Agency: Linnaeus Center ACCESS at KTH 

Available from: 2018-02-13 Created: 2018-02-13 Last updated: 2018-03-05 Bibliographically approved

Authors: Mattila, Robert; Rojas, Cristian; Wahlberg, Bo