Policy Learning with Embedded Koopman Optimal Control
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-3599-440x
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-3827-3824
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS. ORCID iD: 0000-0003-2965-2953
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Embedding an optimization process has been explored as a way to impose efficient and flexible policy structures. Existing work often builds upon nonlinear optimization with explicit unrolling of iteration steps, making policy inference prohibitively expensive for online learning and real-time control. Our approach embeds a linear-quadratic-regulator (LQR) formulation with a Koopman representation, thus combining the tractability of a closed-form solution with the richness of a non-convex neural network. We use a few auxiliary objectives and a reparameterization to enforce optimality conditions of the policy, which can be easily integrated into standard gradient-based learning. Our approach is shown to be effective for learning policies with an optimality structure and for efficient reinforcement learning, on tasks including simulated pendulum control, 2D and 3D walking, and manipulation of both rigid and deformable objects. We also demonstrate a real-world application in a robot pivoting task.
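
As a rough illustration of the idea named in the abstract (an LQR solved in a lifted, Koopman-style linear space), the following minimal sketch may help. It is not the authors' implementation: the lifting function `lift`, the matrices `A`, `B`, `Q`, `R`, and the helper names `lqr_gain` and `policy` are all assumptions made for this example, and a fixed hand-written lifting stands in for the learned neural-network encoder the abstract mentions.

```python
import numpy as np

def lift(x):
    # Hypothetical fixed lifting z = phi(x); a stand-in for a learned encoder.
    x = np.asarray(x, dtype=float)
    return np.array([x[0], x[1], np.sin(x[0])])

def lqr_gain(A, B, Q, R, iters=2000):
    # Discrete-time infinite-horizon LQR via Riccati fixed-point iteration:
    #   K = (R + B^T P B)^{-1} B^T P A,   P = Q + A^T P (A - B K)
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    # Recompute the gain from the final cost-to-go matrix.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Placeholder lifted linear dynamics z' = A z + B u (illustrative values only).
A = np.array([[1.0, 0.1, 0.0],
              [0.0, 1.0, 0.1],
              [0.0, 0.0, 0.9]])
B = np.array([[0.0], [0.0], [0.1]])
Q = np.eye(3)  # assumed quadratic state cost in the lifted space
R = np.eye(1)  # assumed quadratic control cost

K, P = lqr_gain(A, B, Q, R)

def policy(x):
    # Linear feedback in the lifted space; nonlinear in the original state x.
    return -K @ lift(x)
```

Because the gain `K` comes from a linear-quadratic problem in the lifted coordinates, evaluating `policy(x)` costs one lifting and one matrix-vector product, with no iterative optimizer in the loop. That is the tractability the abstract contrasts with unrolled nonlinear optimization.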

Keywords [en]
Koopman System and Control; Reinforcement Learning; Differentiable Optimization
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Robotics and automation; Control Engineering; Computer Systems
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-305685
OAI: oai:DiVA.org:kth-305685
DiVA, id: diva2:1617137
Note

In submission for L4DC 2022

QC 20211215

Available from: 2021-12-06. Created: 2021-12-06. Last updated: 2025-02-05. Bibliographically approved

Open Access in DiVA

fulltext (3998 kB), 862 downloads
File information
File name: FULLTEXT01.pdf
File size: 3998 kB
Checksum (SHA-512): 4b597344d4cfca6c5881ff64c17385b98ea7e228a1cb9aeb9e82ae3da8073f43160275308b089ee83eb2e3e5fadcbfe575687649517a3d822ec633b3839f08a2
Type: fulltext
Mimetype: application/pdf

Authority records

Yin, Hang; Welle, Michael C.; Kragic, Danica
