Cao, Kun
Publications (3 of 3)
Cao, K., Xu, X., Jin, W., Johansson, K. H. & Xie, L. (2025). A Differential Dynamic Programming Framework for Inverse Reinforcement Learning. IEEE Transactions on Robotics.
2025 (English). In: IEEE Transactions on Robotics, ISSN 1552-3098, E-ISSN 1941-0468. Article in journal (Refereed), Epub ahead of print.
Abstract [en]

A differential dynamic programming (DDP)-based framework for inverse reinforcement learning (IRL) is introduced to recover the parameters in the cost function, system dynamics, and constraints from demonstrations. Unlike existing work, where DDP was usually used for the inner forward problem, our proposed framework uses it to efficiently compute the gradient required in the outer inverse problem with equality and inequality constraints. The equivalence between the proposed method and existing methods based on Pontryagin’s Maximum Principle (PMP) is established. More importantly, using this DDP-based IRL with an open-loop loss function, a closed-loop IRL framework is presented. In this framework, a loss function is proposed to capture the closed-loop nature of demonstrations. It is shown to be better than the commonly used open-loop loss function. We show that the closed-loop IRL framework reduces to a constrained inverse optimal control problem under certain assumptions. Under these assumptions and a rank condition, it is proven that the learning parameters can be recovered from the demonstration data. The proposed framework is extensively evaluated through four numerical robot examples and one real-world quadrotor system. The experiments validate the theoretical results and illustrate the practical relevance of the approach.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Constrained Optimal Control, Differential Dynamical Programming, Inverse Optimal Control, Inverse Problems, Inverse Reinforcement Learning
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-372627 (URN); 10.1109/TRO.2025.3623769 (DOI); 2-s2.0-105019984764 (Scopus ID)
Note

QC 20251111

Available from: 2025-11-11. Created: 2025-11-11. Last updated: 2025-11-11. Bibliographically approved.
Chen, G., Cao, K., Johansson, K. H. & Hong, Y. (2024). Continuous-Time Damping-Based Mirror Descent for a Class of Non-Convex Multi-Player Games with Coupling Constraints. In: 2024 IEEE 18th International Conference on Control and Automation, ICCA 2024. Paper presented at 18th IEEE International Conference on Control and Automation, ICCA 2024, Reykjavik, Iceland, June 18-21, 2024 (pp. 12-17). Institute of Electrical and Electronics Engineers (IEEE).
2024 (English). In: 2024 IEEE 18th International Conference on Control and Automation, ICCA 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 12-17. Conference paper, Published paper (Refereed).
Abstract [en]

We study the computation of the global generalized Nash equilibrium (GNE) for a class of non-convex multi-player games, where players' actions are subject to both local and coupling constraints. Due to the non-convex payoff functions, we employ canonical duality to reformulate the setting as a complementary problem. Under given conditions, we reveal the relation between the stationary point and the global GNE. According to the convex-concave properties within the complementary function, we propose a continuous-time mirror descent to compute GNE by generating functions in the Bregman divergence and the damping-based design. Then, we devise several Lyapunov functions to prove that the trajectory along the dynamics is bounded and convergent.
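As a rough illustration of the mirror-descent idea mentioned in the abstract, the sketch below runs a discrete-time mirror descent step with the negative-entropy generating function on the probability simplex. It is not the paper's algorithm: the paper treats continuous-time dynamics with damping for non-convex multi-player games, whereas this sketch assumes a single player, a convex objective, and a fixed step size. The function name `entropic_mirror_descent` and the quadratic test objective are hypothetical choices made for the example.

```python
import math

def entropic_mirror_descent(grad, x0, step=0.1, iters=500):
    """Discrete-time mirror descent on the probability simplex.

    With the negative-entropy generating function, the Bregman divergence
    is the KL divergence and each update has the closed form
    x_i <- x_i * exp(-step * g_i), followed by renormalization.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Multiplicative update: a gradient step taken in the dual (log) space.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        # Renormalizing maps the iterate back onto the simplex.
        x = [wi / total for wi in w]
    return x

# Hypothetical convex objective: f(x) = 0.5 * ||x - target||^2,
# whose minimizer over the simplex is `target` itself.
target = [0.2, 0.5, 0.3]

def quadratic_grad(x):
    return [xi - ti for xi, ti in zip(x, target)]

x_star = entropic_mirror_descent(quadratic_grad, [1 / 3, 1 / 3, 1 / 3],
                                 step=0.5, iters=2000)
print(x_star)
```

The multiplicative form keeps every iterate strictly positive and normalized, so the simplex constraint needs no separate projection step. The paper's coupling constraints, damping terms, and Lyapunov analysis are outside the scope of this sketch.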

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-351967 (URN); 10.1109/ICCA62789.2024.10591845 (DOI); 001294388500003 (); 2-s2.0-85200361248 (Scopus ID)
Conference
18th IEEE International Conference on Control and Automation, ICCA 2024, Reykjavik, Iceland, June 18-21 2024
Note

Part of ISBN 9798350354409

QC 20251020

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2025-10-20. Bibliographically approved.
Weng, X., Ling, K. V., Liu, H. & Cao, K. (2024). Towards End-to-End GPS Localization with Neural Pseudorange Correction. In: FUSION 2024 - 27th International Conference on Information Fusion. Paper presented at 27th International Conference on Information Fusion, FUSION 2024, Venice, Italy, July 7-11, 2024. Institute of Electrical and Electronics Engineers (IEEE).
2024 (English). In: FUSION 2024 - 27th International Conference on Information Fusion, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed).
Abstract [en]

The pseudorange error is one of the root causes of localization inaccuracy in GPS. Previous data-driven methods regress and eliminate pseudorange errors using handcrafted intermediate labels. Unlike them, we propose an end-to-end GPS localization framework, E2E-PrNet, to train a neural network for pseudorange correction (PrNet) directly using the final task loss calculated with the ground truth of GPS receiver states. The gradients of the loss with respect to learnable parameters are backpropagated through a Differentiable Nonlinear Least Squares (DNLS) optimizer to PrNet. The feasibility of fusing the data-driven neural network and the model-based DNLS module is verified with GPS data collected by Android phones, showing that E2E-PrNet outperforms the baseline weighted least squares method and the state-of-the-art end-to-end data-driven approach. Finally, we discuss the explainability of E2E-PrNet.
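To make the role of pseudorange correction concrete, the following sketch solves for a receiver position and clock bias from pseudoranges with a plain Gauss-Newton least-squares loop, optionally subtracting a per-satellite correction first. It is an illustrative assumption, not E2E-PrNet: the corrections here are supplied by hand rather than predicted by a neural network, the solver is unweighted and not differentiated through, and the satellite geometry, units, and function names (`solve_linear`, `solve_position`) are hypothetical.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def solve_position(sats, pseudoranges, corrections=None, iters=10):
    """Gauss-Newton estimate of [x, y, z, clock_bias] from pseudoranges.

    Measurement model: rho_i = ||sat_i - pos|| + clock_bias + error_i.
    `corrections` are estimates of error_i subtracted before solving.
    """
    if corrections is None:
        corrections = [0.0] * len(sats)
    est = [0.0, 0.0, 0.0, 0.0]
    for _ in range(iters):
        JTJ = [[0.0] * 4 for _ in range(4)]
        JTr = [0.0] * 4
        for s, rho, corr in zip(sats, pseudoranges, corrections):
            d = [s[k] - est[k] for k in range(3)]
            rng = math.sqrt(sum(v * v for v in d))
            residual = (rho - corr) - (rng + est[3])
            # Jacobian row of the predicted pseudorange w.r.t. the state.
            row = [-d[0] / rng, -d[1] / rng, -d[2] / rng, 1.0]
            for a in range(4):
                JTr[a] += row[a] * residual
                for c in range(4):
                    JTJ[a][c] += row[a] * row[c]
        # Normal equations: (J^T J) step = J^T residual.
        step = solve_linear(JTJ, JTr)
        est = [e + dx for e, dx in zip(est, step)]
    return est

# Hypothetical geometry (arbitrary length units) and per-satellite errors.
sats = [(15000.0, 10000.0, 20000.0), (-12000.0, 18000.0, 17000.0),
        (8000.0, -20000.0, 15000.0), (-10000.0, -9000.0, 22000.0),
        (20000.0, 2000.0, 12000.0)]
truth = [1000.0, -500.0, 6000.0, 30.0]
errors = [5.0, -3.0, 8.0, 2.0, -6.0]
rho = [math.dist(s, truth[:3]) + truth[3] + e for s, e in zip(sats, errors)]

raw = solve_position(sats, rho)
corrected = solve_position(sats, rho, corrections=errors)
print("position error without correction:", math.dist(raw[:3], truth[:3]))
print("position error with correction:", math.dist(corrected[:3], truth[:3]))
```

In the end-to-end setting described in the abstract, the corrections would come from a learned model and the loss on the final receiver state would be backpropagated through the least-squares solver; this sketch only shows the forward computation that such a pipeline would wrap.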

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Android phones, deep learning, end-to-end learning, GPS, localization, pseudoranges
National Category
Signal Processing; Control Engineering; Other Electrical Engineering, Electronic Engineering, Information Engineering; Communication Systems; Telecommunications
Identifiers
urn:nbn:se:kth:diva-355923 (URN); 10.23919/FUSION59988.2024.10706359 (DOI); 001334560000087 (); 2-s2.0-85207694707 (Scopus ID)
Conference
27th International Conference on Information Fusion, FUSION 2024, Venice, Italy, July 7-11, 2024
Note

Part of ISBN 978-173774976-9, 979-8-3503-7142-0

QC 20250205

Available from: 2024-11-06. Created: 2024-11-06. Last updated: 2025-02-05. Bibliographically approved.