Distributional Reachability for Markov Decision Processes: Theory and Applications
Department of Electrical and Electronic Engineering, Imperial College London, London, U.K. ORCID iD: 0000-0003-2338-5487
Department of Computer Science, University of Oxford, Oxford, U.K.
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Decision and Control Systems (Automatic Control); Digital Futures, Stockholm, Sweden. ORCID iD: 0000-0001-9940-5929
2024 (English). In: IEEE Transactions on Automatic Control, ISSN 0018-9286, E-ISSN 1558-2523, Vol. 69, no. 7, p. 4598-4613. Article in journal (Refereed). Published.
Abstract [en]

We study distributional reachability for finite Markov decision processes (MDPs) from a control-theoretical perspective. Unlike standard probabilistic reachability notions, which are defined over MDP states or trajectories, in this paper reachability is formulated over the space of probability distributions. We propose two set-valued maps for the forward and backward distributional reachability problems: the forward map collects all state distributions that can be reached from a set of initial distributions, while the backward map collects all state distributions that can reach a set of final distributions. We show that there exists a maximal invariant set under the forward map, and that this set is the region to which the state distributions eventually always belong, regardless of the initial state distribution and policy. The backward map provides an alternative way to solve a class of important problems for MDPs: the study of controlled invariance, the characterization of the domain of attraction, and reach-avoid problems. Three case studies illustrate the effectiveness of our approach.
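The forward map described in the abstract operates on state distributions rather than states: for a finite MDP, a distribution row vector evolves as mu' = mu @ P_pi, where P_pi is the transition matrix induced by a policy. A minimal sketch of the one-step forward map, using a toy 3-state, 2-action MDP with made-up transition probabilities (all numbers here are illustrative assumptions, not from the paper):

```python
import numpy as np
from itertools import product

# Toy MDP (hypothetical numbers, for illustration only).
# P[a][s, s'] = probability of moving from state s to s' under action a.
P = [
    np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.1, 0.0, 0.9]]),
    np.array([[0.2, 0.8, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.3, 0.7]]),
]
n_states, n_actions = 3, 2

def policy_matrix(policy):
    """Transition matrix induced by a deterministic policy (state -> action)."""
    return np.stack([P[policy[s]][s] for s in range(n_states)])

def one_step_forward(mu):
    """One-step forward map: the distributions reachable from mu under all
    deterministic Markov policies. (The full one-step reachable set over
    randomized policies is the convex hull of these points.)"""
    return [mu @ policy_matrix(policy)
            for policy in product(range(n_actions), repeat=n_states)]

mu0 = np.array([1.0, 0.0, 0.0])  # start with all mass in state 0
reachable = one_step_forward(mu0)
# Each reachable element is again a probability distribution over states.
assert all(abs(v.sum() - 1.0) < 1e-9 for v in reachable)
```

Iterating this map and taking convex hulls yields the multi-step forward reachable sets; the paper's maximal invariant set is where such iterates eventually remain for every initial distribution and policy.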

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 69, no. 7, p. 4598-4613
Keywords [en]
Aerospace electronics, Computational modeling, distributional reachability, Markov decision processes, Markov processes, Probabilistic logic, probabilistic reachability, Probability distribution, reach-avoid problems, Safety, set invariance, Trajectory
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-350164
DOI: 10.1109/TAC.2023.3341282
ISI: 001259639500010
Scopus ID: 2-s2.0-85179798691
OAI: oai:DiVA.org:kth-350164
DiVA, id: diva2:1883386
Note

QC 20240710

Available from: 2024-07-10. Created: 2024-07-10. Last updated: 2024-07-22. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Gao, Yulong; Johansson, Karl H.

Search in DiVA

By author/editor
Gao, Yulong; Johansson, Karl H.
By organisation
Decision and Control Systems (Automatic Control)
In the same journal
IEEE Transactions on Automatic Control
Control Engineering

Search outside of DiVA

Google
Google Scholar
