The Two-Stage PI² Control Strategy
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-7422-3966
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0001-7309-8086
2022 (English). In: IEEE Control Systems Letters, E-ISSN 2475-1456, Vol. 6, p. 2072-2077. Article in journal (Refereed). Published
Abstract [en]

PI² is a stochastic optimal control method generally regarded as a reinforcement learning algorithm. Recent work, however, suggests that the reinforcement learning aspect of PI² actually appears when optimizing feedforward controls which will lead to optimal closed-loop performance once combined with feedback controls. These feedbacks are necessary to achieve the predicted performance, yet have been largely neglected in the literature and applications due to their complexity. In this letter, we show that the feedbacks actually take a simple-to-implement form for a wide range of system dynamics, paving the way for future research and applications of PI². The correctness of the results is demonstrated through numerical simulations.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. Vol. 6, p. 2072-2077
Keywords [en]
Costs, Feedforward systems, Trajectory, Optimal control, System dynamics, Reinforcement learning, Real-time systems, Stochastic optimal control, path integral policy improvement, Feynman-Kac theorem, nonlinear control systems
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-307337
DOI: 10.1109/LCSYS.2021.3137133
ISI: 000739631300011
Scopus ID: 2-s2.0-85122063949
OAI: oai:DiVA.org:kth-307337
DiVA, id: diva2:1631293
Note

QC 20220124

Available from: 2022-01-24 Created: 2022-01-24 Last updated: 2023-12-07 Bibliographically approved

Authority records

Várnai, Péter; Dimarogonas, Dimos V.
