The Two-Stage PI² Control Strategy
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Automatic Control. ORCID iD: 0000-0002-7422-3966
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Automatic Control. ORCID iD: 0000-0001-7309-8086
2022 (English). In: IEEE Control Systems Letters, E-ISSN 2475-1456, Vol. 6, p. 2072-2077. Article in journal (Refereed). Published
Abstract [en]

PI² is a stochastic optimal control method generally regarded as a reinforcement learning algorithm. Recent work, however, suggests that the reinforcement learning aspect of PI² actually appears when optimizing feedforward controls, which lead to optimal closed-loop performance once combined with feedback controls. These feedback terms are necessary to achieve the predicted performance, yet they have been largely neglected in the literature and in applications because of their complexity. In this letter, we show that the feedback terms take a simple-to-implement form for a wide range of system dynamics, paving the way for future research on and applications of PI². The correctness of the results is demonstrated through numerical simulations.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. Vol. 6, p. 2072-2077
Keywords [en]
Costs, Feedforward systems, Trajectory, Optimal control, System dynamics, Reinforcement learning, Real-time systems, Stochastic optimal control, path integral policy improvement, Feynman-Kac theorem, nonlinear control systems
Identifiers
URN: urn:nbn:se:kth:diva-307337
DOI: 10.1109/LCSYS.2021.3137133
ISI: 000739631300011
Scopus ID: 2-s2.0-85122063949
OAI: oai:DiVA.org:kth-307337
DiVA, id: diva2:1631293
Note

QC 20220124

Available from: 2022-01-24. Created: 2022-01-24. Last updated: 2023-12-07. Bibliographically approved.

Open Access in DiVA

No full text in DiVA


Authors

Várnai, Péter; Dimarogonas, Dimos V.
