Abdul Khader, Shahbaz (ORCID iD: orcid.org/0000-0003-0443-7982)
Publications (7 of 7)
Abdul Khader, S. (2021). Data-Driven Methods for Contact-Rich Manipulation: Control Stability and Data-Efficiency. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomous robots are expected to have a greater presence in the homes and workplaces of human beings. Unlike their industrial counterparts, autonomous robots have to deal with a great deal of uncertainty and lack of structure in their environment. A remarkable aspect of performing manipulation in such a scenario is the possibility of physical contact between the robot and the environment. Therefore, not unlike human manipulation, robotic manipulation has to manage contacts, both expected and unexpected, that are often characterized by complex interaction dynamics.

Skill learning has emerged as a promising approach for robots to acquire rich motion generation capabilities. In skill learning, data-driven methods are used to learn reactive control policies that map states to actions. Such an approach is appealing because a sufficiently expressive policy can almost instantaneously generate appropriate control actions without the need for computationally expensive search operations. Although reinforcement learning (RL) is a natural framework for skill learning, its practical application is limited for a number of reasons. Arguably, the two main reasons are the lack of guaranteed control stability and poor data-efficiency. While control stability is necessary for ensuring safety and predictability, data-efficiency is required for achieving realistic training times. In this thesis, solutions are sought for these two issues in the context of contact-rich manipulation.

First, this thesis addresses the problem of control stability. Despite unknown interaction dynamics during contact, skill learning with a stability guarantee is formulated as a model-free RL problem. The thesis proposes multiple solutions for parameterizing stability-aware policies. Some policy parameterizations are partly or almost wholly deep neural networks. This is followed by policy search solutions that preserve stability during random exploration, if required. In one case, a novel evolution strategies-based policy search method is introduced. It is shown, with the help of real robot experiments, that Lyapunov stability is both possible and beneficial for RL-based skill learning.

Second, this thesis addresses the issue of data-efficiency. Although data-efficiency is targeted by formulating skill learning as a model-based RL problem, only the model learning part is addressed. In addition to benefiting from the data-efficiency and uncertainty representation of the Gaussian process, this thesis further investigates the benefits of adopting the structure of hybrid automata for learning forward dynamics models. The method also includes an algorithm for predicting long-term trajectory distributions that can represent discontinuities and multiple modes. The proposed method is shown to be more data-efficient than some state-of-the-art methods. 

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2021. p. 63
Series
TRITA-EECS-AVL ; 49
Keywords
Robotic, Skill Learning, Reinforcement Learning, Contact-Rich Manipulation
National Category
Robotics and automation
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-299799 (URN); 978-91-7873-937-0 (ISBN)
Public defence
2021-09-17, https://kth-se.zoom.us/j/68651867110, F3, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20210823

Available from: 2021-08-18. Created: 2021-08-17. Last updated: 2025-02-09. Bibliographically approved
Abdul Khader, S., Yin, H., Falco, P. & Kragic, D. (2021). Learning deep energy shaping policies for stability-guaranteed manipulation. IEEE Robotics and Automation Letters, 6(4), 8583-8590
2021 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 6, no 4, p. 8583-8590. Article in journal (Refereed). Published
Abstract [en]

Deep reinforcement learning (DRL) has been successfully used to solve various robotic manipulation tasks. However, most existing works do not address the issue of control stability. This is in sharp contrast to the control theory community, where the well-established norm is to prove stability whenever a control law is synthesized. What makes traditional stability analysis difficult for DRL is the uninterpretable nature of neural network policies together with the unknown system dynamics. In this work, stability is obtained by deriving an interpretable deep policy structure based on the energy shaping control of Lagrangian systems. Then, stability during physical interaction with an unknown environment is established based on passivity. The result is a stability-guaranteeing DRL method in a model-free framework that is general enough for contact-rich manipulation tasks. With an experiment on a peg-in-hole task, we demonstrate, to the best of our knowledge, the first DRL with a stability guarantee on a real robotic manipulator.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Keywords
Machine learning for robot control, reinforcement learning, Agricultural robots, Control theory, Industrial manipulators, Manipulators, Robotics, System stability, Control stability, Energy shaping control, Physical interactions, Robotic manipulation, Robotic manipulators, Stability analysis, Unconditional stability, Unknown environments, Deep learning
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-311752 (URN); 10.1109/LRA.2021.3111962 (DOI); 000701239400004 (); 2-s2.0-85115187899 (Scopus ID)
Note

QC 20220504

Available from: 2022-05-04. Created: 2022-05-04. Last updated: 2025-02-09. Bibliographically approved
Abdul Khader, S., Yin, H., Falco, P. & Kragic, D. (2021). Learning Stable Normalizing-Flow Control for Robotic Manipulation. In: 2021 IEEE International Conference on Robotics and Automation (ICRA 2021). Paper presented at IEEE International Conference on Robotics and Automation (ICRA), May 30-June 5, 2021, Xi'an, People's Republic of China (pp. 1644-1650). Institute of Electrical and Electronics Engineers (IEEE)
2021 (English). In: 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 1644-1650. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement Learning (RL) of robotic manipulation skills, despite its impressive successes, stands to benefit from incorporating domain knowledge from control theory. One of the most important properties of interest is control stability. Ideally, one would like to achieve stability guarantees while staying within the framework of state-of-the-art deep RL algorithms. Such a solution does not exist in general, especially one that scales to complex manipulation tasks. We contribute towards closing this gap by introducing a normalizing-flow control structure that can be deployed in any of the latest deep RL algorithms. While stable exploration is not guaranteed, our method is designed to ultimately produce deterministic controllers with provable stability. In addition to demonstrating our method on challenging contact-rich manipulation tasks, we also show that it is possible to achieve considerable exploration efficiency (reduced state space coverage and actuation efforts) without losing learning efficiency.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
IEEE International Conference on Robotics and Automation ICRA, ISSN 1050-4729
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-311640 (URN); 10.1109/ICRA48506.2021.9562071 (DOI); 000765738801085 (); 2-s2.0-85125487703 (Scopus ID)
Conference
IEEE International Conference on Robotics and Automation (ICRA), May 30-June 5, 2021, Xi'an, People's Republic of China
Note

Part of proceedings: ISBN 978-1-7281-9077-8

QC 20220502

Available from: 2022-05-02. Created: 2022-05-02. Last updated: 2025-02-09. Bibliographically approved
Khader, S. A., Yin, H., Falco, P. & Kragic, D. (2021). Stability-Guaranteed Reinforcement Learning for Contact-Rich Manipulation. IEEE Robotics and Automation Letters, 6(1), 1-8
2021 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 6, no 1, p. 1-8. Article in journal (Refereed). Published
Abstract [en]

Reinforcement learning (RL) has had its fair share of success in contact-rich manipulation tasks, but it still lags behind in benefiting from advances in robot control theory such as impedance control and stability guarantees. Recently, the concept of variable impedance control (VIC) was adopted into RL with encouraging results. However, the more important issue of stability remains unaddressed. To clarify the challenge in stable RL, we introduce the term all-the-time-stability, which unambiguously means that every possible rollout should be stability certified. Our contribution is a model-free RL method that not only adopts VIC but also achieves all-the-time-stability. Building on a recently proposed stable VIC controller as the policy parameterization, we introduce a novel policy search algorithm that is inspired by the Cross-Entropy Method and inherently guarantees stability. Our experimental studies confirm the feasibility and usefulness of the stability guarantee and also feature, to the best of our knowledge, the first successful application of RL with all-the-time-stability on the benchmark problem of peg-in-hole.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Keywords
Reinforcement learning, compliance and impedance control, compliant assembly
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-285758 (URN); 10.1109/LRA.2020.3028529 (DOI); 000577867400001 (); 2-s2.0-85092014420 (Scopus ID)
Note

QC 20201112

Available from: 2020-11-12. Created: 2020-11-12. Last updated: 2025-02-07. Bibliographically approved
Abdul Khader, S., Yin, H., Falco, P. & Kragic, D. (2020). Data-Efficient Model Learning and Prediction for Contact-Rich Manipulation Tasks. IEEE Robotics and Automation Letters, 5(3), 4321-4328
2020 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 5, no 3, p. 4321-4328. Article in journal (Refereed). Published
Abstract [en]

In this letter, we investigate learning forward dynamics models and multi-step prediction of state variables (long-term prediction) for contact-rich manipulation. The problems are formulated in the context of model-based reinforcement learning (MBRL). We focus on two aspects, discontinuous dynamics and data-efficiency, both of which are important in the identified scope and pose significant challenges to state-of-the-art methods. We contribute to closing this gap by proposing a method that explicitly adopts a specific hybrid structure for the model while leveraging the uncertainty representation and data-efficiency of Gaussian processes. Our experiments on an illustrative moving block task and a 7-DOF robot demonstrate a clear advantage over popular baselines in low-data regimes.

Place, publisher, year, edition, pages
IEEE, 2020
Keywords
Model learning for control, contact modeling, reinforcement learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-278482 (URN); 10.1109/LRA.2020.2996067 (DOI); 000543200000006 (); 2-s2.0-85085742764 (Scopus ID)
Note

QC 20200713

Available from: 2020-07-13. Created: 2020-07-13. Last updated: 2024-01-17. Bibliographically approved
Abdul Khader, S., Yin, H., Falco, P. & Kragic, D. Learning Deep Neural Policies with Stability Guarantees.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Deep reinforcement learning (DRL) has been successfully used to solve various robotic manipulation tasks. However, most existing works do not address the issue of control stability. This is in sharp contrast to the control theory community, where the well-established norm is to prove stability whenever a control law is synthesized. What makes traditional stability analysis difficult for DRL is the uninterpretable nature of neural network policies together with the unknown system dynamics. In this work, unconditional stability is obtained by deriving an interpretable deep policy structure based on the energy shaping control of Lagrangian systems. Then, stability during physical interaction with an unknown environment is established based on passivity. The result is a stability-guaranteeing DRL method in a model-free framework that is general enough for contact-rich manipulation tasks. With an experiment on a peg-in-hole task, we demonstrate, to the best of our knowledge, the first DRL with a stability guarantee on a real robotic manipulator.

Keywords
Robotics, Reinforcement Learning, Robot Control, Robotic Manipulation
National Category
Robotics and automation
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-299798 (URN)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20210818

Available from: 2021-08-17. Created: 2021-08-17. Last updated: 2025-02-09. Bibliographically approved
Abdul Khader, S., Yin, H., Falco, P. & Kragic, D. Learning Stable Normalizing-Flow Control for Robotic Manipulation.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Reinforcement Learning (RL) of robotic manipulation skills, despite its impressive successes, stands to benefit from incorporating domain knowledge from control theory. One of the most important properties of interest is control stability. Ideally, one would like to achieve stability guarantees while staying within the framework of state-of-the-art deep RL algorithms. Such a solution does not exist in general, especially one that scales to complex manipulation tasks. We contribute towards closing this gap by introducing a normalizing-flow control structure that can be deployed in any of the latest deep RL algorithms. While stable exploration is not guaranteed, our method is designed to ultimately produce deterministic controllers with provable stability. In addition to demonstrating our method on challenging contact-rich manipulation tasks, we also show that it is possible to achieve considerable exploration efficiency (reduced state space coverage and actuation efforts) without losing learning efficiency.

Keywords
Robotics, Reinforcement Learning, Robot Control, Robotic Manipulation
National Category
Robotics and automation
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-299797 (URN)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20210818

Available from: 2021-08-17. Created: 2021-08-17. Last updated: 2025-02-09. Bibliographically approved