Identifiability and Solvability in Inverse Linear Quadratic Optimal Control Problems
Li, Yibei. KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Optimization and Systems Theory. ORCID iD: 0000-0001-7287-1495
Wahlberg, Bo. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-1927-1690
Hu, Xiaoming. KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Optimization and Systems Theory. ORCID iD: 0000-0003-0177-1993
2021 (English). In: Journal of Systems Science and Complexity, ISSN 1009-6124, E-ISSN 1559-7067, Vol. 34, no. 5, p. 1840-1857. Article in journal (Refereed). Published.
Abstract [en]

In this paper, the inverse linear quadratic (LQ) problem over a finite time horizon is studied. Given output observations of a dynamic process, the goal is to recover the corresponding LQ cost function. First, by treating the inverse problem as an identification problem, its model structure is shown to be strictly globally identifiable under the assumption of system invertibility. Next, in the noiseless case, a necessary and sufficient condition is given for the existence of a positive semidefinite weighting matrix that solves the problem, and the unique solution is obtained with two proposed algorithms under a persistent excitation condition. Furthermore, a residual optimization problem is formulated to find a best-fit approximate cost function from sub-optimal observations. Finally, numerical simulations demonstrate the effectiveness of the proposed methods.
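
As a rough illustration of the residual idea in the abstract, the sketch below recovers a state weight from an observed trajectory. It is a toy, not the paper's algorithms: it assumes a scalar discrete-time system (the paper treats a more general setting), a known control weight `r`, zero terminal cost, and a plain grid search. The names `lq_gains`, `simulate`, and `fit_q` are hypothetical.

```python
import numpy as np

def lq_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for the scalar system x[t+1] = a*x[t] + b*u[t]
    with cost sum(q*x^2 + r*u^2) and zero terminal cost (assumed setup)."""
    p = 0.0  # terminal value P_N = 0
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)   # gain K_t computed from P_{t+1}
        p = q + a * p * a - a * p * b * k   # P_t from P_{t+1}
        gains.append(k)
    return gains[::-1]  # reorder so gains[t] is applied at time t

def simulate(a, b, q, r, x0, horizon):
    """Closed-loop optimal state trajectory under u[t] = -K_t * x[t]."""
    x = [x0]
    for k in lq_gains(a, b, q, r, horizon):
        x.append((a - b * k) * x[-1])
    return np.array(x)

def fit_q(a, b, r, observed, candidates):
    """Pick the candidate q whose optimal trajectory has the smallest
    squared residual against the observed trajectory."""
    x0, horizon = observed[0], len(observed) - 1
    residuals = [
        np.sum((simulate(a, b, q, r, x0, horizon) - observed) ** 2)
        for q in candidates
    ]
    return candidates[int(np.argmin(residuals))]

# Generate noiseless "observations" with a known weight, then recover it.
a, b, r = 1.2, 1.0, 1.0
observed = simulate(a, b, 2.0, r, x0=1.0, horizon=10)
candidates = np.linspace(0.5, 4.0, 36)  # grid with step 0.1
q_hat = fit_q(a, b, r, observed, candidates)
print(q_hat)
```

The control weight is held fixed here because scaling the whole cost function leaves the optimal control unchanged, so some normalization is needed before a weight can be recovered. With noisy or sub-optimal data the same residual would no longer vanish, and the minimizer would be a best-fit weight rather than an exact one.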

Place, publisher, year, edition, pages
Springer Nature, 2021. Vol. 34, no. 5, p. 1840-1857
Keywords [en]
Inverse optimal control, linear quadratic regulators, model identifiability
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-304782
DOI: 10.1007/s11424-021-1245-3
ISI: 000711413600013
Scopus ID: 2-s2.0-85117961728
OAI: oai:DiVA.org:kth-304782
DiVA, id: diva2:1612576
Note

QC 20211118

Available from: 2021-11-18. Created: 2021-11-18. Last updated: 2022-06-25. Bibliographically approved.
In thesis
1. Inverse and Forward Approaches for Optimal Control and Estimation in Agent-Based Systems
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation is concerned with three topics within the field of optimal control and estimation in dynamical agent-based systems, with potential applications that meet both engineering and societal needs. First, the inverse optimal control problem is studied. Given a dynamical system, the goal is to recover the underlying cost function from observations of the optimal state trajectories. Recovering such cost functions not only helps develop a better understanding of natural and societal phenomena, but also provides a criterion for designing optimal controllers in similar contexts. Second, we study the synthesis of collective emergence in multi-agent systems. The problem is cast in a game-theoretic framework that models the strategic interactions among self-oriented agents. In this thesis the specific topic of intrinsic formation control is addressed, in which designing individual cost functions that realize the desired optimal emergence is a critical issue. Finally, distributed coordination is also considered for societal systems, more specifically in mathematical finance. The credit scoring problem is studied by incorporating dynamical networked information.

Specifically, in Paper A and Paper C, the finite-horizon inverse optimal control problem is studied for continuous-time systems, with full or partial state observations. Although the infinite-horizon inverse linear quadratic problem is well studied, with numerous results, the finite-horizon case is still an open problem. To the best of our knowledge, our result is the first complete result on necessary and sufficient conditions for the solvability of such an inverse problem. The uniqueness of solutions is studied and the equivalence class of cost functions is derived. In addition, based on system invertibility, a well-posed inverse problem is formulated even for the case in which the optimal synthesis can only be partially observed. For suboptimal observations, residual optimization problems are solved to obtain a best-fit approximate cost function.
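
For orientation, the finite-horizon continuous-time LQ problem referred to above is commonly written as follows. The notation ($A$, $B$, $Q$, $R$, $F$, $T$) is assumed here for illustration and is not taken from the thesis, which may for instance normalize $R$ or omit the terminal term.

```latex
\[
\min_{u}\; x(T)^\top F\, x(T) + \int_0^T \big( x^\top Q\, x + u^\top R\, u \big)\, dt
\quad \text{s.t.} \quad \dot{x} = A x + B u, \quad x(0) = x_0 ,
\]
\[
u^*(t) = -R^{-1} B^\top P(t)\, x(t), \qquad
-\dot{P} = A^\top P + P A - P B R^{-1} B^\top P + Q, \quad P(T) = F .
\]
```

In the forward problem the weights are given and $P(t)$ and $u^*$ are computed from the Riccati differential equation. In the inverse problem the direction is reversed: the weights are sought from observations of trajectories generated by $u^*$.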

Paper B further studies the inverse optimal control problem in a stochastic set-up, where partial state observations of a discrete-time system are available under measurement noise. First, by formulating the problem as a system identification task with the exact initial states as model excitations, its identifiability is justified under the relative degree assumption, and statistical consistency is shown for the empirical estimation. Furthermore, for more practical scenarios in which the initial states are also imperfectly known, the problem is cast as maximum likelihood estimation and solved by the Expectation Maximization algorithm under Gaussian assumptions.

In Paper D, the intrinsic formation control problem of a multi-agent system is formulated as both finite- and infinite-horizon noncooperative differential games. The manifold of all equivalent configurations of the desired formation is studied by considering all orientations and agent permutations, whose convergence and stability are analyzed in both cases. The main novelty of our work lies in that the desired relative pattern is not predefined in the game, and is achieved intrinsically only via different choices of the communication topology of the multi-agent system without using formation errors in the controller, which can be hard to obtain in practice. Patterns of regular polyhedra and antipodal formations are achieved by Nash equilibria while inter-agent collisions are naturally avoided.

Paper E concerns the network-based credit scoring problem, and the advantages of incorporating network information are studied in two scenarios. First, when the score publishing is merely individual-dependent, an optimal Bayesian filter is designed for risk prediction, which serves as a reference for the lender in future financial decisions. Second, a recursive Bayes estimator is proposed to further improve the accuracy of score publishing by also incorporating the dynamical network topology. It is shown that under the proposed evolution framework, the designed biased estimator has a higher precision than any efficient estimator, and the mean square errors are strictly smaller than the Cramér-Rao lower bound for clients within a certain range of scores.

Abstract [sv]

This thesis addresses three topics within optimal control theory and estimation for agent-based dynamical systems, with both engineering and societal applications. First, the inverse optimal control problem is studied. There, the goal is, given a dynamical system, to recover the underlying cost function from observations of optimal trajectories. This provides not only a better understanding of natural and societal phenomena, but also criteria for designing optimal controllers for other, similar problems. Furthermore, we study the appearance of emergence in multi-agent systems. The problem is treated within a game-theoretic framework that models strategic interactions among self-oriented agents. This thesis addresses formation in particular (intrinsic formation control), where a particular problem is to determine individual cost functions that achieve optimal emergence. Finally, distributed coordination of societal, specifically financial, systems is also addressed. The credit scoring problem is studied by incorporating information from a dynamical networked system.

The inverse optimal control problem for continuous-time systems with a finite time horizon is studied in particular in Paper A and Paper C, with either full or partial state observations. Although the linear quadratic problem with an infinite time horizon has been studied extensively, the corresponding finite-horizon problem is largely unsolved. To our knowledge, our result is the first complete one concerning necessary and sufficient conditions for such an inverse problem to be solvable. We treat the uniqueness of solutions and derive the equivalence class of cost functions. In addition, based on system invertibility, we formulate a well-posed problem even for the case where the optimal synthesis can only be partially observed. For non-optimal observations, we solve residual minimization problems to obtain the best estimate of the cost function.

Furthermore, Paper B treats the inverse optimal control problem in a stochastic model, where partial state observations are made with random measurement error. First, the problem is formulated as an identification problem in which the exact initial states are used to excite the model, whose identifiability is justified under an assumption on the relative degree of the system and whose statistical consistency is shown for empirical estimation. The problem is then adapted to maximum likelihood estimation to handle more practical scenarios with inexact initial states. It is then solved with an expectation-maximization algorithm under assumptions of Gaussian probability distributions.

In Paper D, the formation problem for multi-agent systems is formulated as a noncooperative differential game for both finite and infinite time horizons. The manifold of all equivalent configurations of the desired formations is studied by taking into account all permutations of orientations and agents, whose convergence and stability are analyzed in both cases. The novelty of our work lies above all in the fact that the desired formation is not defined in advance in the game, but is achieved solely through the choice of the system's communication topology, without using the formation error in the controller, which can otherwise be hard to obtain information about. Regular polyhedra and antipodal formations are achieved through Nash equilibria, while collisions between agents are naturally avoided.

Paper E concerns network-based credit scoring, and the benefit of network-based information is studied in two scenarios. First, when the credit score is only individual-dependent, an optimal Bayesian filter is designed for risk prediction, which serves as a reference for the lender in future financial decisions. Then, a recursive Bayesian estimator is proposed to further improve the credit scoring by also taking the dynamical network topology into account. We show that the proposed biased estimator has higher precision than any efficient estimator within the proposed evolution framework, and that the mean square errors are strictly smaller than the Cramér-Rao lower bound for clients within a certain range of credit scores.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022. p. 196
Series
TRITA-SCI-FOU ; 2022;19
Keywords
Autonomous systems; inverse optimal control; system identification; nonlinear systems; formation control; differential game; credit scoring
National Category
Computational Mathematics
Research subject
Applied and Computational Mathematics, Optimization and Systems Theory
Identifiers
urn:nbn:se:kth:diva-311742 (URN)
978-91-8040-233-0 (ISBN)
Public defence
2022-06-02, F3, Lindstedtsvägen 26, KTH, Stockholm, 14:00 (English)
Note

QC 220504

Available from: 2022-05-04. Created: 2022-05-03. Last updated: 2023-01-02. Bibliographically approved.


Authority records

Li, Yibei; Wahlberg, Bo; Hu, Xiaoming
