Efficient and Trustworthy Artificial Intelligence for Critical Robotic Systems
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-4943-2501
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Critical robotic systems are systems whose functioning is critical both to accomplishing a given mission and to preventing the endangerment of life and the surrounding environment. These critical aspects can be formally captured by convergence, in the sense that the system's state goes to a desired region of the state space, and safety, in the sense that the system's state avoids unsafe regions of the state space. Data-driven control policies, found through e.g. imitation learning or reinforcement learning, can outperform model-based methods in achieving convergence and safety efficiently; however, they often do so only by encouraging these properties rather than guaranteeing them, and can therefore be difficult to trust. Model-based control policies, on the other hand, are often well suited to admitting formal guarantees of convergence and safety, and are therefore often easier to trust. The main question asked in this thesis is: how can we compose data-driven and model-based control policies to encourage efficiency while, at the same time, formally guaranteeing convergence and safety?

We answer this question with behaviour trees, a framework to represent hybrid control systems in a modular way. We present the first formal definition of behaviour trees as a hybrid system and present the conditions under which the execution of any behaviour tree as a hybrid control system will formally guarantee convergence and safety. Moreover, we present the conditions under which such formal guarantees can be maintained when including unguaranteed data-driven control policies, such as those coming from imitation learning or reinforcement learning. We also present an approach to synthesise such data-driven control policies in such a way that they encourage convergence and safety by adapting to unforeseen events. Alongside the above, we also explore an ancillary aspect of robot autonomy by improving the efficiency of simultaneous localisation and mapping through imitation learning. Lastly, we validate the advantages of behaviour trees' modularity in a real-world autonomous underwater vehicle's control system, and argue that this modularity contributes to efficiency, in terms of ease of use, and trust, in terms of facilitating human understanding.
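As a rough illustration of how a behaviour tree switches between low-level control policies, the following is a minimal Python sketch. It is not the thesis's formal definition: the node classes, the scalar system with dynamics x' = u, the safe region |x| <= 2, and the goal region |x| < 0.05 are all assumptions chosen only to keep the example small.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


@dataclass
class Condition:
    """Leaf that tests a predicate on the state and issues no control."""
    predicate: Callable[[float], bool]

    def tick(self, x: float) -> Tuple[Status, float]:
        return (Status.SUCCESS if self.predicate(x) else Status.FAILURE), 0.0


@dataclass
class Action:
    """Leaf that applies a low-level policy until its own goal test holds."""
    policy: Callable[[float], float]
    done: Callable[[float], bool]

    def tick(self, x: float) -> Tuple[Status, float]:
        if self.done(x):
            return Status.SUCCESS, 0.0
        return Status.RUNNING, self.policy(x)


@dataclass
class Sequence:
    """Ticks children in order; stops at the first child that does not succeed."""
    children: List[object]

    def tick(self, x: float) -> Tuple[Status, float]:
        for child in self.children:
            status, u = child.tick(x)
            if status is not Status.SUCCESS:
                return status, u
        return Status.SUCCESS, 0.0


@dataclass
class Fallback:
    """Ticks children in order; stops at the first child that does not fail."""
    children: List[object]

    def tick(self, x: float) -> Tuple[Status, float]:
        for child in self.children:
            status, u = child.tick(x)
            if status is not Status.FAILURE:
                return status, u
        return Status.FAILURE, 0.0


def run(tree, x: float, dt: float = 0.01, max_steps: int = 5000):
    """Euler-integrate the assumed dynamics x' = u under the tree's control."""
    status = Status.RUNNING
    for _ in range(max_steps):
        status, u = tree.tick(x)
        if status is not Status.RUNNING:
            break
        x += dt * u
    return x, status


# Hypothetical design: go to the goal while inside the safe region,
# otherwise fall back to a recovery policy that re-enters the safe region.
tree = Fallback([
    Sequence([
        Condition(lambda x: abs(x) <= 2.0),
        Action(policy=lambda x: -x, done=lambda x: abs(x) < 0.05),
    ]),
    Action(policy=lambda x: -1.0 if x > 0 else 1.0,
           done=lambda x: abs(x) <= 2.0),
])

final_x, final_status = run(tree, x=3.0)
```

Starting outside the safe region, the recovery action runs first; once the condition holds, the tree hands control to the goal-seeking action. The modularity discussed in the abstract shows up here in that either leaf policy can be swapped without touching the rest of the tree.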

Abstract [sv, translated to English]

Critical robotic systems are systems whose functioning is either critical to the completion of a task, or critical in the sense that a mistake can seriously harm people or the environment. These critical aspects are formally captured by convergence, in the sense that the system's state goes to a desired region of the state space, and safety, in the sense that the system's state avoids unsafe regions of the state space. Data-driven control policies, found through e.g. imitation learning or reinforcement learning, can outperform model-based methods in efficiently achieving convergence and safety; however, they often do so only by increasing the chances of efficient and safe behaviour, without giving any guarantees, and can therefore be difficult to trust. Model-based control policies, on the other hand, are often well suited to enabling formal guarantees of convergence and safety, so they are often easier to trust. The main question asked in this thesis is: how can we combine data-driven and model-based control policies to improve efficiency while formally guaranteeing convergence and safety?

We answer this question with behaviour trees, a framework for representing hybrid control systems in a modular way. We present the first formal definition of behaviour trees as a hybrid system and present the conditions under which the execution of a behaviour tree as a hybrid control system will formally guarantee convergence and safety. Moreover, we present the conditions under which such formal guarantees can be maintained when including unverified data-driven control policies, for example those coming from imitation learning or reinforcement learning. We also present an approach to synthesising such data-driven control policies in such a way that they support convergence and safety by adapting to unforeseen events. Alongside the above, we also explore an important sub-function of robot autonomy by improving the efficiency of simultaneous localisation and mapping through imitation learning. Finally, we validate the advantages of behaviour trees' modularity in a real autonomous underwater vehicle's control system, and find that this modularity contributes to efficiency, in terms of ease of use, and trust, in terms of facilitating human understanding.

Place, publisher, year, edition, pages
Stockholm: Kungliga Tekniska högskolan, 2022, p. 41
Series
TRITA-EECS-AVL ; 2022:68
Keywords [en]
behaviour trees, hybrid dynamical systems, formal guarantees, optimal control, machine learning, autonomy
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-321151
ISBN: 978-91-8040-396-2 (print)
OAI: oai:DiVA.org:kth-321151
DiVA, id: diva2:1709018
Public defence
2022-11-29, F3, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Funder
Swedish Foundation for Strategic Research, IRC15-0046
Note

QC 20221107

Available from: 2022-11-07 Created: 2022-11-07 Last updated: 2022-12-23 Bibliographically approved
List of papers
1. Continuous-Time Behavior Trees as Discontinuous Dynamical Systems
2022 (English) In: IEEE Control Systems Letters, E-ISSN 2475-1456, Vol. 6, p. 1891-1896. Article in journal (Refereed), Published
Abstract [en]

Behavior trees represent a hierarchical and modular way of combining several low-level control policies into a high-level task-switching policy. Hybrid dynamical systems can also be seen in terms of task switching between different policies, and therefore several comparisons between behavior trees and hybrid dynamical systems have been made, but only informally, and only in discrete time. A formal continuous-time formulation of behavior trees has been lacking. Additionally, convergence analyses of specific classes of behavior tree designs have been made, but not for general designs. In this letter, we provide the first continuous-time formulation of behavior trees, show that they can be seen as discontinuous dynamical systems (a subclass of hybrid dynamical systems), which enables the application of existence and uniqueness results to behavior trees, and finally, provide sufficient conditions under which such systems will converge to a desired region of the state space for general designs. With these results, a large body of results on continuous-time dynamical systems can be brought to use when designing behavior tree controllers.
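One common way to write a discontinuous dynamical system of the kind described above is as a piecewise-defined vector field over a partition of the state space. The notation below (state $x$, regions $\Omega_i$, per-region vector fields $f_i$, and the number of regions $N$) is an assumption made for illustration and is not taken from the letter itself.

```latex
\dot{x}(t) = f\bigl(x(t)\bigr), \qquad
f(x) = f_i(x) \quad \text{for } x \in \Omega_i, \quad i = 1, \dots, N,
\qquad
\bigcup_{i=1}^{N} \Omega_i = X, \qquad
\Omega_i \cap \Omega_j = \emptyset \quad (i \neq j).
```

In this reading, each $f_i$ would be the closed-loop dynamics under one low-level policy and each $\Omega_i$ the set of states in which the tree selects that policy. Because $f$ can jump across region boundaries, solutions are usually understood in a generalised sense (for example, in the sense of Filippov), which is where existence and uniqueness results for discontinuous systems become relevant.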

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Keywords
Convergence, Dynamical systems, Tools, Task analysis, Service robots, Metadata, Control theory, Autonomous systems, behavior trees, stability of hybrid systems, switched systems
National Category
Robotics and automation; Control Engineering
Identifiers
urn:nbn:se:kth:diva-306851 (URN)
10.1109/LCSYS.2021.3134453 (DOI)
000733213300016 ()
2-s2.0-85121363464 (Scopus ID)
Note

QC 20220104

Available from: 2022-01-04 Created: 2022-01-04 Last updated: 2025-02-05 Bibliographically approved
2. Adding Neural Network Controllers to Behavior Trees without Destroying Performance Guarantees
2022 (English) In: The 61st IEEE Conference on Decision and Control (CDC 2022) / [ed] IEEE, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we show how Behavior Trees that have performance guarantees, in terms of safety and goal convergence, can be extended with components that were designed using machine learning, without destroying those performance guarantees.

Machine learning approaches such as reinforcement learning or learning from demonstration can be very appealing to AI designers that want efficient and realistic behaviors in their agents. However, those algorithms seldom provide guarantees for solving the given task in all different situations while keeping the agent safe. Instead, such guarantees are often easier to find for manually designed model-based approaches. In this paper we exploit the modularity of behavior trees to extend a given design with an efficient, but possibly unreliable, machine learning component in a way that preserves the guarantees. The approach is illustrated with an inverted pendulum example.
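The general idea of guarding an unreliable learned component with a model-based one can be sketched in a few lines of Python. This is a toy illustration, not the paper's construction: the scalar unstable system x' = x + u, the stand-in `learned_policy`, the `model_based_policy`, and the trusted region |x| < 0.8 are all hypothetical.

```python
from typing import Callable

Policy = Callable[[float], float]


def learned_policy(x: float) -> float:
    """Stand-in for a learned controller: fast near the origin, harmful far away."""
    return -3.0 * x if abs(x) < 1.0 else x


def model_based_policy(x: float) -> float:
    """Assumed stabilising controller: gives x' = -x everywhere."""
    return -2.0 * x


def guarded(learned: Policy, fallback: Policy,
            trusted: Callable[[float], bool]) -> Policy:
    """Use the learned policy only where it is trusted, else the fallback."""
    def policy(x: float) -> float:
        return learned(x) if trusted(x) else fallback(x)
    return policy


def simulate(policy: Policy, x0: float, dt: float = 1e-3,
             steps: int = 10_000) -> float:
    """Euler-integrate the assumed dynamics x' = x + u for steps * dt seconds."""
    x = x0
    for _ in range(steps):
        x += dt * (x + policy(x))
    return x


safe_policy = guarded(learned_policy, model_based_policy,
                      trusted=lambda x: abs(x) < 0.8)

x_guarded = simulate(safe_policy, x0=1.5)       # converges towards 0
x_unguarded = simulate(learned_policy, x0=1.5)  # grows without bound
```

The guard restricts the learned policy to a region in which it is assumed to behave well, so the fallback's convergence argument still applies outside that region while the faster learned behaviour is used inside it.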

Keywords
Autonomous systems, behavior trees, stability of hybrid systems, switched systems
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-320828 (URN)
Conference
The 61st IEEE Conference on Decision and Control (CDC 2022)
Funder
Swedish Foundation for Strategic Research, IRC15-0046
Note

QC 20221108

Available from: 2022-11-01 Created: 2022-11-01 Last updated: 2022-11-08 Bibliographically approved
3. An Extended Convergence Result for Behavior Tree Controllers
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Behavior trees (BTs) are an optimally modular framework for assembling hierarchical hybrid control policies from a set of low-level control policies using a tree structure. Many robotic tasks are naturally decomposed into a hierarchy of control tasks, and modularity is a well-known tool for handling complexity; therefore, behavior trees have garnered widespread usage in the robotics community. In this paper, we study the convergence of BTs, in the sense of reaching a desired part of the state space. Earlier results on BT convergence were often tailored to specific families of BTs, created using different design principles. The results of this paper generalize the earlier results, and also include new cases of cyclic switching not covered in the literature.

Keywords
Behavior-Based Systems, Robot Safety, Control Architectures and Programming
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-320830 (URN)
Funder
Swedish Foundation for Strategic Research, IRC15-0046
Note

QC 20221108

Available from: 2022-11-01 Created: 2022-11-01 Last updated: 2022-11-08 Bibliographically approved
4. Improving the Modularity of AUV Control Systems using Behaviour Trees
2018 (English) In: AUV 2018 - 2018 IEEE/OES Autonomous Underwater Vehicle Workshop, Proceedings, Institute of Electrical and Electronics Engineers Inc., 2018. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we show how behaviour trees (BTs) can be used to design modular, versatile, and robust control architectures for mission-critical systems. In particular, we show this in the context of autonomous underwater vehicles (AUVs). Robustness, in terms of system safety, is important since manual recovery of AUVs is often extremely difficult. Furthermore, versatility is important to be able to execute many different kinds of missions. Finally, modularity is needed to achieve a combination of robustness and versatility, as the complexity of a versatile system needs to be encapsulated in modules in order to create a simple overall structure that enables robustness analysis. The proposed design is illustrated using a typical AUV mission.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2018
Keywords
artificial intelligence, autonomous underwater vehicles, behaviour trees, robotic planning, Autonomous vehicles, Forestry, Intelligent robots, Robot programming, Robust control, Robustness (control systems), Control architecture, Mission critical systems, Of autonomous underwater vehicles, Robustness analysis, System safety, Versatile system
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-262469 (URN)
10.1109/AUV.2018.8729810 (DOI)
000492901600108 ()
2-s2.0-85068333136 (Scopus ID)
9781728102535 (ISBN)
Conference
2018 IEEE/OES Autonomous Underwater Vehicle Workshop, AUV 2018, 6 November 2018 through 9 November 2018, Porto, Portugal
Note

QC 20191017

Available from: 2019-10-17 Created: 2019-10-17 Last updated: 2025-02-09 Bibliographically approved
5. PointNetKL: Deep Inference for GICP Covariance Estimation in Bathymetric SLAM
2020 (English) In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 5, no 3, p. 4078-4085. Article in journal (Refereed), Published
Abstract [en]

Registration methods for point clouds have become a key component of many SLAM systems on autonomous vehicles. However, an accurate estimate of the uncertainty of such registration is a key requirement for the consistent fusion of this kind of measurement in a SLAM filter. This estimate, which is normally given as a covariance in the transformation computed between point cloud reference frames, has been modelled following different approaches, among which the most accurate is considered to be the Monte Carlo method. However, a Monte Carlo approximation is cumbersome to use inside a time-critical application such as online SLAM. Efforts have been made to estimate this covariance via machine learning, using carefully designed features to abstract the raw point clouds. However, the performance of this approach is sensitive to the features chosen. We argue that it is possible to learn the features along with the covariance by working with the raw data, and thus we propose a new approach based on PointNet. In this work, we train this network using the KL divergence between the learned uncertainty distribution and one computed by the Monte Carlo method as the loss. We test the performance of the general model by applying it to our target use-case of SLAM with an autonomous underwater vehicle (AUV), restricted to the 2-dimensional registration of 3D bathymetric point clouds.
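For readers unfamiliar with a KL-divergence loss between uncertainty distributions, the following Python sketch evaluates the standard closed-form KL divergence between two Gaussians. Several things here are assumptions rather than details from the paper: that both distributions are Gaussian, that they are 2-dimensional, the direction of the divergence (KL(p || q)), and the function names. The 2x2 restriction is only there to keep the example dependency-free.

```python
import math
from typing import Sequence

Vec2 = Sequence[float]
Mat2 = Sequence[Sequence[float]]


def det2(m: Mat2) -> float:
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m: Mat2) -> list:
    """Inverse of a 2x2 matrix; requires a positive determinant."""
    d = det2(m)
    if d <= 0.0:
        raise ValueError("covariance must have a positive determinant")
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def kl_gaussian_2d(mu_p: Vec2, cov_p: Mat2, mu_q: Vec2, cov_q: Mat2) -> float:
    """KL(p || q) for p = N(mu_p, cov_p) and q = N(mu_q, cov_q) in 2D."""
    inv_q = inv2(cov_q)
    # trace(inv_q @ cov_p)
    trace = sum(inv_q[i][j] * cov_p[j][i] for i in range(2) for j in range(2))
    # Mahalanobis term (mu_q - mu_p)^T inv_q (mu_q - mu_p)
    d = [mu_q[0] - mu_p[0], mu_q[1] - mu_p[1]]
    maha = sum(d[i] * inv_q[i][j] * d[j] for i in range(2) for j in range(2))
    log_det_ratio = math.log(det2(cov_q) / det2(cov_p))
    return 0.5 * (trace + maha - 2.0 + log_det_ratio)


identity = [[1.0, 0.0], [0.0, 1.0]]
wide = [[4.0, 0.0], [0.0, 4.0]]

kl_same = kl_gaussian_2d([0.0, 0.0], identity, [0.0, 0.0], identity)
kl_wide = kl_gaussian_2d([0.0, 0.0], identity, [0.0, 0.0], wide)
kl_shift = kl_gaussian_2d([0.0, 0.0], identity, [1.0, 0.0], identity)
```

In a training setting, one argument would hold the covariance predicted by the network and the other the Monte Carlo reference, and the returned scalar would be minimised; which distribution plays which role is a design choice that this sketch does not settle.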

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
Keywords
SLAM, novel deep learning methods, marine robotics, simultaneous localization and mapping, robot learning, unmanned underwater vehicles
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-276602 (URN)
10.1109/LRA.2020.2988180 (DOI)
000536185200002 ()
2-s2.0-85084935128 (Scopus ID)
Note

QC 20200623

Available from: 2020-06-23 Created: 2020-06-23 Last updated: 2025-02-07 Bibliographically approved
6. Learning Dynamic-Objective Policies from a Class of Optimal Trajectories
2020 (English) In: Proceedings of the IEEE Conference on Decision and Control, Institute of Electrical and Electronics Engineers Inc., 2020, p. 597-602. Conference paper, Published paper (Refereed)
Abstract [en]

Optimal state-feedback controllers, capable of changing between different objective functions, are advantageous to systems in which unexpected situations may arise. However, synthesising such controllers, even for a single objective, is a demanding process. In this paper, we present a novel and straightforward approach to synthesising these policies through a combination of trajectory optimisation, homotopy continuation, and imitation learning. We use numerical continuation to efficiently generate optimal demonstrations across several objectives and boundary conditions, and use these to train our policies. Additionally, we demonstrate the ability of our policies to effectively learn families of optimal state-feedback controllers, which can be used to change objective functions online. We illustrate this approach across two trajectory optimisation problems, an inverted pendulum swingup and a spacecraft orbit transfer, and show that the synthesised policies, when evaluated in simulation, produce trajectories that are near-optimal. These results indicate the benefit of trajectory optimisation and homotopy continuation to the synthesis of controllers in dynamic-objective contexts.
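The mechanics of homotopy (numerical) continuation can be shown on a much smaller problem than trajectory optimisation. The Python sketch below is only a scalar root-finding toy under stated assumptions: the "easy" and "hard" residuals, the linear blend between them, and all function names are hypothetical and stand in for the paper's families of optimal control problems. The point it illustrates is the warm start: each solve begins from the solution of the previous, slightly easier problem.

```python
from typing import List, Tuple


def easy_residual(x: float) -> float:
    """Easy problem with known root x = 1."""
    return x - 1.0


def hard_residual(x: float) -> float:
    """Target problem; its real root is x = 2."""
    return x ** 3 + x - 10.0


def homotopy(x: float, lam: float) -> float:
    """Blend of the two problems: lam = 0 is easy, lam = 1 is the target."""
    return (1.0 - lam) * easy_residual(x) + lam * hard_residual(x)


def homotopy_dx(x: float, lam: float) -> float:
    """Derivative of the blended residual with respect to x."""
    return (1.0 - lam) + lam * (3.0 * x ** 2 + 1.0)


def newton(x: float, lam: float, tol: float = 1e-12,
           max_iter: int = 50) -> float:
    """Newton's method on the blended residual, started from x."""
    for _ in range(max_iter):
        step = homotopy(x, lam) / homotopy_dx(x, lam)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


def continuation(n_steps: int = 10, x0: float = 0.0) -> List[Tuple[float, float]]:
    """Sweep lam from 0 to 1, warm-starting each solve from the previous root."""
    path = []
    x = x0
    for k in range(n_steps + 1):
        lam = k / n_steps
        x = newton(x, lam)
        path.append((lam, x))
    return path


path = continuation()
```

Every `(lam, x)` pair along the path is itself a solved problem, which is loosely analogous to how a continuation sweep can yield a whole set of demonstrations for imitation learning rather than a single solution.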

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2020
Keywords
Deep Learning, Homotopy, Online Planning, Optimal Control, Aerodynamics, Feedback control, Orbits, State feedback, Homotopy continuation, Imitation learning, Inverted pendulum, Numerical continuation, Objective functions, Optimal state feedback, Optimal trajectories, Trajectory optimisation, Controllers
National Category
Control Engineering; Robotics and automation
Identifiers
urn:nbn:se:kth:diva-301192 (URN)
10.1109/CDC42340.2020.9303931 (DOI)
000717663400077 ()
2-s2.0-85099880092 (Scopus ID)
Conference
59th IEEE Conference on Decision and Control, CDC 2020, 14 December 2020 through 18 December 2020
Funder
Swedish Foundation for Strategic Research
Note

QC 20230307

Available from: 2021-09-08 Created: 2021-09-08 Last updated: 2025-02-05 Bibliographically approved

Open Access in DiVA

Kappa (1241 kB), 514 downloads
File information
File name: FULLTEXT04.pdf
File size: 1241 kB
Checksum (SHA-512): 3adcf1e8fa95c7132ce9393b1add6e8a5ba1bd1e0a651b1dd95ba08cecffda38082dc11d5e0a335d7fed0f3f3dab6a6a49f730a7ea821aba58d526d74899fb9b
Type: fulltext
Mimetype: application/pdf

Authority records

Sprague, Christopher
