Publications (7 of 7)
Cao, Y., Li, Y., Zou, Z. & Hu, X. (2026). Inverse Continuous-Time Linear Quadratic Regulator: From Control Cost Matrix to Entire Cost Reconstruction. Journal of Systems Science and Complexity
2026 (English). In: Journal of Systems Science and Complexity, ISSN 1009-6124, E-ISSN 1559-7067. Article in journal (Refereed), Epub ahead of print
Abstract [en]

This paper investigates the inverse optimal control problems for continuous-time linear quadratic regulators over finite-time horizons, aiming to reconstruct the control, state, and terminal cost matrices in the objective function from observed optimal inputs. Previous studies have mainly explored the recovery of state cost matrices under the assumptions that the system is controllable and the control cost matrix is given. Motivated by various applications in which the control cost matrix is unknown and needs to be identified, the authors present two reconstruction methods. The first exploits the full trajectory of the feedback matrix and establishes the necessary and sufficient condition for unique recovery. To further reduce the computational complexity, the second method utilizes the feedback matrix at some time points, where sufficient conditions for uniqueness are provided. Moreover, the authors study the recovery of the state and terminal cost matrices in a more general manner. Unlike prior works that assume system controllability, the authors analyse its impact on well-posedness, and derive expressions for unknown matrices for both controllable and uncontrollable cases. Finally, the authors characterize the structural connection between the inverse problems with the control cost matrix either to be reconstructed or given as a prior.
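For orientation, the standard finite-horizon continuous-time LQR setup that this kind of inverse problem is usually posed on can be written as follows. This is a textbook formulation, not taken from the paper: the symbols R, Q, and F for the control, state, and terminal cost matrices follow the thesis abstract in this listing, while A, B, P(t), K(t), T, and the absence of 1/2 factors are assumptions made here for illustration.

```latex
\begin{aligned}
\dot x(t) &= A\,x(t) + B\,u(t), \qquad x(0) = x_0, \\
J(u) &= x(T)^\top F\, x(T) + \int_0^T \bigl( x(t)^\top Q\, x(t) + u(t)^\top R\, u(t) \bigr)\, dt, \\
u^*(t) &= -K(t)\, x(t), \qquad K(t) = R^{-1} B^\top P(t), \\
-\dot P(t) &= A^\top P(t) + P(t)\, A - P(t)\, B R^{-1} B^\top P(t) + Q, \qquad P(T) = F .
\end{aligned}
```

In this notation, K(t) is the "feedback matrix" whose trajectory (or samples at selected time points) serves as the data, and the last line is the differential Riccati equation named in the keywords.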

Place, publisher, year, edition, pages
Springer Nature, 2026
Keywords
Differential Riccati equation, inverse optimal control, linear quadratic regulator
National Category
Control Engineering
Research subject
Applied and Computational Mathematics, Optimization and Systems Theory
Identifiers
urn:nbn:se:kth:diva-372692 (URN); 10.1007/s11424-026-5437-8 (DOI); 001740834100001 (); 2-s2.0-105035712114 (Scopus ID)
Note

QC 20260430

Available from: 2025-11-12 Created: 2025-11-12 Last updated: 2026-04-30. Bibliographically approved
Cao, Y. (2025). Forward and Inverse Problems in Optimal Control. (Doctoral dissertation). Stockholm, Sweden: KTH Royal Institute of Technology
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In this dissertation, we study two related classes of problems within the field of systems theory: the analysis and optimization of networked dynamical systems, and the reconstruction of unknown cost functions in control and learning frameworks. These problems naturally arise in a wide range of applications, from engineering systems to natural phenomena.

The first class of problems concerns control and optimization over large-scale networked systems. Achieving controllability efficiently is an essential topic. The problem of ensuring controllability with a minimal number of control inputs is investigated first. Moreover, optimal control placement with limited controls is studied to enhance energy efficiency and reduce computational and implementation costs. The second class focuses on inverse problems of optimal control, which aim to reconstruct unknown cost functions from observed behaviour. These problems are valuable for uncovering the objectives underlying complex systems in nature and society. Both cases are considered: when the system dynamics are known a priori and when they are unknown.

Specifically, Paper A investigates the minimal control placement problem for networked systems derived from Turing's reaction-diffusion model, a classical framework for understanding self-organization and pattern formation in biological systems. The eigenstructure of the diffusion matrix is fully characterized, and by introducing symmetric control sets, we establish the necessary and sufficient graph-theoretic condition that guarantees controllability of diffusion systems over networks of arbitrary size and parameters. These results are further extended to the reaction-diffusion systems.

After studying network controllability, Paper B extends the analysis to energy-efficient control placement in networked systems. By classifying network symmetries and exploiting symmetric control combinations, we develop a method that enables efficient computation of the spectrum of the controllability Gramian through lower-dimensional representations. This approach is further generalized to non-symmetric cases, where upper and lower spectral bounds are derived. Moreover, by utilizing the trace of the controllability Gramian as the objective, we propose a closed-form algorithm for optimizing control placement under constraints of limited control inputs and system controllability, with simulations validating its effectiveness.

Paper C addresses inverse optimal control for continuous-time linear quadratic regulators over finite-time horizons, namely the reconstruction of unknown cost matrices R, Q, and F in the objective function from observed optimal control trajectories. The underlying linear system is assumed to be known. Both problem settings where R is either unknown or given are investigated. Firstly, two methods are developed to reconstruct R: one that leverages the full trajectory of the optimal feedback matrix and provides the necessary and sufficient conditions for uniqueness, and another that relies only on selected time points to reduce computational burden, which is particularly effective if F is given as positive definite. Secondly, when R is given, we investigate the role of system controllability in determining the well-posedness of the inverse problems. This assumption is subsequently relaxed, and sufficient conditions are established to ensure well-posedness, along with explicit analytical expressions for Q and F. Finally, the structural equivalence between IOC problems with unknown and given R is characterized under certain circumstances.
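To make the "R given, recover Q and F" setting concrete, here is a minimal scalar sketch. It is a toy illustration under stated assumptions, not the method of Paper C: the system is one-dimensional (x' = a x + b u with b nonzero), the cost has no 1/2 factors, and the function names `riccati_rate`, `forward_gains`, and `recover_q_and_f` are hypothetical. The forward pass integrates the scalar differential Riccati equation backward from P(T) = f; the inverse pass reads P(t) off the observed gain K(t) = b P(t) / r and substitutes it back into the same equation.

```python
def riccati_rate(p, a, b, q, r):
    """dP/dt from the scalar Riccati equation -dP/dt = 2aP - (b^2/r)P^2 + q."""
    return -(2.0 * a * p - (b * b / r) * p * p + q)


def forward_gains(a, b, q, r, f, horizon, steps):
    """Forward problem: integrate P backward from P(T) = f with RK4.

    Returns the step size h and the gains K(i*h) = b * P(i*h) / r, i = 0..steps.
    """
    h = horizon / steps
    p = f
    values = [p]
    for _ in range(steps):
        # One classical RK4 step of size -h (backward in time).
        k1 = riccati_rate(p, a, b, q, r)
        k2 = riccati_rate(p - 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rate(p - 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rate(p - h * k3, a, b, q, r)
        p -= h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(p)
    values.reverse()  # values[i] now approximates P(i * h)
    return h, [b * p / r for p in values]


def recover_q_and_f(gains, h, a, b, r):
    """Inverse problem (toy): recover q and f from sampled gains, with r known."""
    ps = [r * k / b for k in gains]          # P(t) = r K(t) / b, needs b != 0
    f_hat = ps[-1]                           # terminal condition P(T) = f
    i = len(ps) // 2                         # any interior sample works
    dp = (ps[i + 1] - ps[i - 1]) / (2.0 * h)  # central difference for dP/dt
    q_hat = -dp - 2.0 * a * ps[i] + (b * b / r) * ps[i] ** 2
    return q_hat, f_hat


if __name__ == "__main__":
    step, k_traj = forward_gains(a=1.0, b=1.0, q=2.0, r=0.5, f=1.0,
                                 horizon=1.0, steps=1000)
    print(recover_q_and_f(k_traj, step, a=1.0, b=1.0, r=0.5))
```

In the scalar case the requirement b != 0 is exactly controllability, which gives a rough sense of why controllability enters the well-posedness discussion; the matrix case treated in the thesis is considerably more involved.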

Paper D investigates inverse reinforcement learning (IRL) to reconstruct the unknown cost function in a model-free setting, where system dynamics are also unknown. Conventional IRL algorithms often require on-policy data collection and bi-level optimization, which impose potential practical limitations. To overcome these challenges, we propose a direct and adaptive IRL algorithm that learns from off-policy data satisfying only a mild persistence of excitation condition. By employing Nesterov-Todd (NT) step primal-dual interior-point iterations, the cost parameter is updated through simple one-step recursions, avoiding repeated forward RL computations. Theoretical analysis quantifies the impact of system noise and establishes sublinear convergence of the proposed algorithm. This method is further generalized to nonlinear objective functions via differential dynamic programming, where gradients of the loss function are computed through a backward pass. Numerical simulations demonstrate the efficiency and effectiveness of the proposed approach.

Abstract [sv] (translated to English)

In this thesis, two types of problems within systems theory are studied: the analysis and optimization of networked dynamical systems, and the reconstruction of unknown cost functions in control and learning. These problems arise naturally in a wide variety of applications, from engineered systems to natural phenomena.

The first type of problem concerns efficient control and optimization over large-scale networked systems. First, the problem of ensuring controllability while minimizing the number of control inputs is investigated. In addition, optimal placement of control inputs under a limited control budget is studied, in order to improve energy efficiency and reduce computational and implementation costs. The second type of problem concerns inverse problems in optimal control, where the aim is to reconstruct unknown cost functions from observed behaviour. These problems are valuable for revealing the objectives underlying complex systems in nature and society. Both cases are considered: when the system dynamics are known a priori and when they are unknown.

Paper A investigates optimal placement of control inputs for networked systems that follow Turing's reaction-diffusion model, a classical framework for understanding self-organization and pattern formation in biological systems. We give a complete characterization of the eigenstructure of the diffusion matrix, introduce symmetric control sets that satisfy necessary and sufficient graph-theoretic conditions, and guarantee controllability of diffusion systems over networks of arbitrary size and for an arbitrary number of parameters. These results are further extended to reaction-diffusion systems.

Paper B extends the analysis to energy-efficient placement of control inputs in networked systems. By classifying network symmetries and exploiting symmetric control combinations, we develop a method that enables efficient computation of the spectrum of the controllability Gramian through lower-dimensional representations. This approach is further generalized to non-symmetric cases, where upper and lower spectral bounds are derived. Furthermore, by using the trace of the controllability Gramian as the objective function, we propose a closed-form algorithm for optimizing the placement of control inputs under a limited control budget, for systems with limited controllability. We also carry out simulations that confirm the effectiveness of the method.

Paper C treats inverse optimal control for continuous-time linear quadratic regulators over finite time horizons, focusing on reconstruction of the unknown cost matrices R, Q, and F in the objective function, based on observed optimal control trajectories. The underlying linear system is assumed to be known. Both the case where R is unknown and the case where R is given are analysed. First, two methods are developed for reconstructing R: one that uses the full trajectory of the optimal feedback matrix and gives necessary and sufficient conditions for uniqueness, and one that is based only on selected time points in order to reduce computational complexity, which is particularly effective if F is positive definite. Next, conditions for well-posedness of the inverse problem are investigated when R is known. Initially the system is assumed to be controllable; this assumption is then relaxed and replaced by a set of sufficient conditions, together with explicit analytical expressions for Q and F. Finally, the structural equivalence between inverse optimal control problems with unknown and given R is characterized, under certain conditions.

Paper D investigates inverse reinforcement learning (IRL) as a method for reconstructing the unknown cost function in a model-free setting, where the system dynamics are also unknown. Conventional IRL algorithms often require on-policy data collection and so-called bi-level optimization, which entails potential practical limitations. To overcome these challenges, we propose a direct and adaptive IRL algorithm that learns from off-policy data satisfying only a mild persistence of excitation condition. By using Nesterov-Todd (NT) step primal-dual interior-point iterations, the cost parameter is updated through simple one-step recursions, which avoids repeated forward RL computations. The theoretical analysis quantifies the effect of system noise and establishes sublinear convergence of the proposed algorithm. The method is further generalized to nonlinear objective functions via differential dynamic programming, where the gradients of the loss function are computed through a backward pass. Numerical simulations demonstrate the method's efficiency and performance.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2025. p. 187
Series
TRITA-SCI-FOU ; 2025:66
Keywords
Networked control systems, Laplacian networks, Minimal control placement, Inverse optimal control, Linear quadratic regulators, Differential Riccati equations, Inverse reinforcement learning, Differential dynamic programming, Nätverksstyrda system, Laplacenätverk, Minimal styrningsplacering, Invers optimal styrning, Linjär kvadratisk regulator, Differential Riccati-ekvationer, Invers förstärkningsinlärning, Differential dynamisk programmering
National Category
Computational Mathematics
Research subject
Applied and Computational Mathematics; Applied and Computational Mathematics, Optimization and Systems Theory
Identifiers
urn:nbn:se:kth:diva-372694 (URN); 978-91-8106-473-5 (ISBN)
Public defence
2025-12-05, Kollegiesalen, Brinellvägen 8, Stockholm, 10:00 (English)
Note

QC 2025-11-13

Available from: 2025-11-13 Created: 2025-11-12 Last updated: 2025-12-01. Bibliographically approved
Cao, Y., Li, Y., Zheng, L. & Hu, X. (2024). Minimal Control Placement of Networked Reaction-Diffusion Systems Based on Turing Model. SIAM Journal on Control and Optimization, 62(3), 1809-1831
2024 (English). In: SIAM Journal on Control and Optimization, ISSN 0363-0129, E-ISSN 1095-7138, Vol. 62, no 3, p. 1809-1831. Article in journal (Refereed), Published
Abstract [en]

In this paper, we consider the problem of placing a minimal number of controls to achieve controllability for a class of networked control systems that are based on the original Turing reaction-diffusion model, which is governed by a set of ordinary differential equations with interactions defined by a ring graph. Turing model considers two morphogens reacting and diffusing over the spatial domain and has been widely accepted as one of the most fundamental models to explain pattern formation in a developing embryo. It is of great importance to understand the mechanism behind the various reaction kinetics that generate such a wide range of patterns. As a first step towards this goal, in this paper we study controllability of Turing model for the case of cells connected as a square grid in which controls can be applied to the boundary cells. We first investigate the minimal control placement problem for the diffusion only system. The eigenvalues of the diffusion matrix are classified by their geometric multiplicity, and the properties of the corresponding eigenspaces are studied. The symmetric control sets are designed to categorize control candidates by symmetry of the network topology. Then the necessary and sufficient condition is provided for placing the minimal control to guarantee controllability for the diffusion system. Furthermore, we show that the necessary condition can be extended to Turing model by a natural expansion of the symmetric control sets. Under certain circumstances, we prove that it is also sufficient to ensure controllability of Turing model.
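As a small illustration of why control placement and eigenvalue multiplicity interact, the following sketch checks the Kalman rank condition for pure diffusion x' = -L x + B u on a ring graph, where B selects the controlled nodes. It is a toy under stated assumptions and not the paper's construction: a single species on a 6-node ring rather than the two-morphogen Turing model, no symmetric control sets, and the helper names `ring_laplacian`, `matrix_rank`, and `controllability_rank` are hypothetical. Exact rational arithmetic is used so the rank is not affected by rounding.

```python
from fractions import Fraction


def ring_laplacian(n):
    """Graph Laplacian of an n-node ring (cycle) graph, n >= 3."""
    lap = [[0] * n for _ in range(n)]
    for i in range(n):
        lap[i][i] = 2
        lap[i][(i + 1) % n] = -1
        lap[i][(i - 1) % n] = -1
    return lap


def mat_vec(a, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def matrix_rank(vectors):
    """Rank of a list of equal-length vectors via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    rank = 0
    width = len(rows[0]) if rows else 0
    for col in range(width):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[rank][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def controllability_rank(n, control_nodes):
    """Rank of [B, AB, ..., A^(n-1) B] with A = -L and one input per control node."""
    a = [[-x for x in row] for row in ring_laplacian(n)]
    columns = []
    for node in control_nodes:
        v = [1 if i == node else 0 for i in range(n)]
        for _ in range(n):
            columns.append(v)
            v = mat_vec(a, v)
    return matrix_rank(columns)


if __name__ == "__main__":
    for nodes in ([0], [0, 1], [0, 3]):
        print(nodes, controllability_rank(6, nodes))
```

On the 6-node ring, a single control node cannot give full rank because the Laplacian has eigenvalues of geometric multiplicity two; two adjacent nodes do give rank 6, whereas two opposite nodes do not. The number of controls alone is therefore not enough, and where they sit relative to the network's symmetries matters.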

Place, publisher, year, edition, pages
Society for Industrial & Applied Mathematics (SIAM), 2024
Keywords
Turing model, controllability of networked systems, minimal control placement
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-352533 (URN); 10.1137/23M1616856 (DOI); 001289001000005 (); 2-s2.0-85200923918 (Scopus ID)
Note

QC 20240903

Available from: 2024-09-03 Created: 2024-09-03 Last updated: 2025-11-12. Bibliographically approved
Cao, Y., Li, Y., Zou, Z. & Hu, X. (2024). Spectrum computation and optimization for controllability Gramian of networked Laplacian systems with limited control placement. Systems & Control Letters (Print), 193, Article ID 105945.
2024 (English). In: Systems & Control Letters (Print), ISSN 0167-6911, E-ISSN 1872-7956, Vol. 193, article id 105945. Article in journal (Refereed), Published
Abstract [en]

This paper investigates the problem of placing a given number of controls to optimize energy efficiency for a family of linear dynamical systems, whose structure is induced by the Laplacian of a square-grid network. To quantify the performance of control combinations, several metrics have been proposed based on the spectrum of the controllability Gramian. But commonly used algorithms to compute the spectrum are usually time-consuming. In this paper, we first classify five anchor symmetries of the network systems. Then motivated by various advantages of symmetric control combinations, we provide a method to compute the eigenvalues and eigenvectors of their controllability Gramians more efficiently. Specifically, we show that they can be expressed by those of two lower-dimensional matrices. Furthermore, our method can be applied for non-symmetric cases to provide upper and lower bounds for the spectrum of the controllability Gramians. Finally, by employing the sum of eigenvalues, i.e., the trace of controllability Gramian, as the objective function, we provide a closed-form algorithm to the spectrum optimization problem with a given number of controls subject to system controllability.
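One reason the trace is a convenient objective can be seen from a standard identity, sketched here under assumptions that the abstract does not state: a finite horizon T, a system matrix A, and an input matrix B whose columns are unit vectors e_j selecting the controlled nodes j in a set S. This is a generic observation and not necessarily the algorithm of the paper, and it ignores the controllability constraint.

```latex
\begin{aligned}
W(T) &= \int_0^T e^{At} B B^\top e^{A^\top t}\, dt, \qquad
\operatorname{tr} W(T) = \sum_i \lambda_i\bigl(W(T)\bigr), \\
\operatorname{tr} W(T)
  &= \int_0^T \operatorname{tr}\bigl( B^\top e^{A^\top t} e^{At} B \bigr)\, dt
   = \sum_{j \in S} M_{jj},
\qquad M = \int_0^T e^{A^\top t} e^{At}\, dt .
\end{aligned}
```

The trace is thus additive over the chosen control nodes, so without further constraints it is maximized by picking the nodes with the largest diagonal entries of M.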

Place, publisher, year, edition, pages
Elsevier B.V., 2024
Keywords
Control placement, Gramian spectrum, Network controllability, Trace maximization
National Category
Control Engineering; Communication Systems
Identifiers
urn:nbn:se:kth:diva-355471 (URN); 10.1016/j.sysconle.2024.105945 (DOI); 001343769900001 (); 2-s2.0-85207062480 (Scopus ID)
Note

QC 20241119

Available from: 2024-10-30 Created: 2024-10-30 Last updated: 2025-11-12. Bibliographically approved
Cao, Y., Li, Y., Liu, Z., Zheng, L. & Hu, X. (2023). Minimal Control Placement of Turing's Model Using Symmetries. In: 2023 62nd IEEE Conference on Decision and Control, CDC 2023. Paper presented at 62nd IEEE Conference on Decision and Control, CDC 2023, Singapore, Singapore, Dec 13 2023 - Dec 15 2023 (pp. 1456-1461). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: 2023 62nd IEEE Conference on Decision and Control, CDC 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 1456-1461. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, the minimal control placement problem for Turing's reaction-diffusion model is studied. Turing's model describes the process of morphogens diffusing and reacting with each other and is considered one of the most fundamental models to explain pattern formation in a developing embryo. Controlling pattern formation artificially has gained increasing attention in the field of developmental biology, which motivates us to investigate this problem mathematically. In this work, the two-dimensional Turing's reaction-diffusion model is discretized into square grids. The minimal control placement problem for the diffusion system is investigated first. The symmetric control sets are defined based on the symmetry of the network structure. A necessary condition is provided to guarantee controllability. Under certain circumstances, we prove that this condition is also sufficient. Then we show that the necessary condition can also be applied to the reaction-diffusion system by means of a suitable extension of the symmetric control sets. Under similar circumstances, a sufficient condition is given to place the minimal control for the reaction-diffusion system.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-343738 (URN); 10.1109/CDC49753.2023.10384275 (DOI); 001166433801035 (); 2-s2.0-85184831477 (Scopus ID)
Conference
62nd IEEE Conference on Decision and Control, CDC 2023, Singapore, Singapore, Dec 13 2023 - Dec 15 2023
Note

QC 20240222

Part of ISBN 9798350301243

Available from: 2024-02-22 Created: 2024-02-22 Last updated: 2024-04-05. Bibliographically approved
Cao, Y., Li, Y., Zheng, L. & Hu, X. (2022). Network Controllability of Turing Reaction and Diffusion Model. In: Li, Z. & Sun, J. (Eds.), 2022 41st Chinese Control Conference (CCC). Paper presented at 41st Chinese Control Conference (CCC), July 25-27, 2022, Hefei, People's Republic of China (pp. 259-264). IEEE
2022 (English). In: 2022 41st Chinese Control Conference (CCC) / [ed] Li, Z. & Sun, J., IEEE, 2022, p. 259-264. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, the controllability problem of the reaction-diffusion (RD) model, or Turing model, is studied. The Turing model provides a valuable framework for self-organized systems and has been widely used to explain pattern formation in real life. With the rapid development of biotechnology, biologists are trying to control pattern formation artificially and have achieved some progress. However, the influence exerted on pattern formation by external factors, such as light and temperature, remains an open question. In this work, the RD model is obtained following the assumptions in Turing's original paper and spatially discretized into square grids. The nodes in the outermost layer are considered as candidates for control. Controllability of the RD system with all such nodes as controls is shown first. Then controllability of the RD system with a minimal number of control nodes is studied. Our results show that nearly 87.5% of the control nodes can be saved while the system remains controllable. Numerical simulations are provided to demonstrate the effects of controlling the reaction and diffusion of the morphogens.

Place, publisher, year, edition, pages
IEEE, 2022
Series
Chinese Control Conference, ISSN 2161-2927
Keywords
Reaction and diffusion system, network controllability, controllability with minimal complexity
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-326066 (URN); 10.23919/CCC55666.2022.9902569 (DOI); 000932071600046 (); 2-s2.0-85140471069 (Scopus ID)
Conference
41st Chinese Control Conference (CCC), July 25-27, 2022, Hefei, People's Republic of China
Note

QC 20230425

Available from: 2023-04-25 Created: 2023-04-25 Last updated: 2024-08-28. Bibliographically approved
Li, Y., Cao, Y., Liu, Z. & Xie, L. Adaptive Inverse Reinforcement Learning with Online Off-Policy Data Collection.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

In this paper, the inverse reinforcement learning (IRL) problem of reconstructing the unknown cost function underlying an observed optimal policy is addressed in a model-free manner; its online adaptation with completely off-policy system data remains unclear in the literature. Without prior knowledge of the system model parameters, an adaptive and direct learning rule for the cost parameter is proposed using online off-policy system data, which only needs to satisfy the mild persistently exciting condition in the general data-driven paradigm. The adaptive and online IRL algorithm is achieved by designing full Nesterov-Todd (NT)-step primal-dual interior-point iterations. Despite solving a nonlinear and time-varying semi-definite program (SDP), the influence of system noise is rigorously analyzed, and the proposed online algorithm is shown to achieve sublinear convergence. The proposed method is further generalized to nonlinear IRL based on differential dynamic programming. The gradient of the loss function is directly obtained via a backward pass, which eliminates the need to repeatedly solve forward RL problems as in conventional bi-level IRL frameworks. Finally, the efficiency and effectiveness of the proposed algorithms are demonstrated by numerical examples.

Keywords
Inverse reinforcement learning; linear quadratic regulator; differential dynamic programming; online semidefinite programming.
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-372693 (URN)
Note

QC 20251204

Available from: 2025-11-12 Created: 2025-11-12 Last updated: 2025-12-04. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0009-0004-0091-0810
