Latent Planning via Expansive Tree Search
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-1772-7930
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-1114-6040
Number of Authors: 2
2022 (English)
In: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022, Neural Information Processing Systems Foundation, 2022
Conference paper, Published paper (Refereed)
Abstract [en]

Planning enables autonomous agents to solve complex decision-making problems by evaluating predictions of the future. However, classical planning algorithms often become infeasible in real-world settings where state spaces are high-dimensional and transition dynamics are unknown. The idea behind latent planning is to simplify the decision-making task by mapping it to a lower-dimensional embedding space. Common latent planning strategies are based on trajectory optimization techniques such as shooting or collocation, which are prone to failure in long-horizon and highly non-convex settings. In this work, we study long-horizon goal-reaching scenarios from visual inputs and formulate latent planning as an explorative tree search. Inspired by classical sampling-based motion planning algorithms, we design a method which iteratively grows and optimizes a tree representation of visited areas of the latent space. To encourage fast exploration, the sampling of new states is biased towards sparsely represented regions within the estimated data support. Our method, called Expansive Latent Space Trees (ELAST), relies on self-supervised training via contrastive learning to obtain (a) a latent state representation and (b) a latent transition density model. We embed ELAST into a model-predictive control scheme and demonstrate significant performance improvements compared to existing baselines on challenging visual control tasks in simulation, including the navigation of a deformable object.
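
The exploration idea in the abstract (grow a tree, and bias expansion towards sparsely represented regions inside the data support) can be illustrated with a toy sketch in the spirit of classical expansive-space-tree planners. This is not the ELAST implementation. Everything here is an assumption made for illustration: the "latent space" is a plain 2D plane, the learned transition density model is replaced by a fixed-length random step, the estimated data support is replaced by the unit square (the hypothetical `in_support`), and `expansive_tree_search` with all of its parameters is an invented name.

```python
import math
import random


def in_support(point, low=0.0, high=1.0):
    """Hypothetical stand-in for a learned data-support estimate: the unit square."""
    return all(low <= c <= high for c in point)


def expansive_tree_search(start, goal, steps=3000, step_size=0.05,
                          radius=0.1, goal_tol=0.1, seed=0):
    """Grow a tree from `start`; return a list of states ending near `goal`, or None."""
    start, goal = tuple(start), tuple(goal)
    if math.dist(start, goal) <= goal_tol:
        return [start]
    rng = random.Random(seed)
    nodes = [start]
    parent = [None]   # parent[i] is the index of node i's parent
    counts = [1]      # counts[i]: tree nodes within `radius` of node i (itself included)
    for _ in range(steps):
        # Pick the node to expand with probability inversely proportional to
        # its local node count, so sparsely covered regions are favoured.
        weights = [1.0 / c for c in counts]
        idx = rng.choices(range(len(nodes)), weights=weights, k=1)[0]
        # Toy transition (assumption): a fixed-length step in a random direction.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        new = (nodes[idx][0] + step_size * math.cos(angle),
               nodes[idx][1] + step_size * math.sin(angle))
        if not in_support(new):
            continue  # reject samples outside the assumed support
        # Update neighbour counts incrementally instead of recomputing them.
        new_count = 1
        for j, other in enumerate(nodes):
            if math.dist(new, other) <= radius:
                counts[j] += 1
                new_count += 1
        nodes.append(new)
        parent.append(idx)
        counts.append(new_count)
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


if __name__ == "__main__":
    path = expansive_tree_search(start=(0.1, 0.1), goal=(0.9, 0.9))
    print("no path found" if path is None else f"path with {len(path)} states")
```

The inverse-count weighting is one simple way to favour sparse regions. In a model-predictive control loop, a planner like this would be re-run from each newly observed state and only the first step of the returned path executed.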

Place, publisher, year, edition, pages
Neural Information Processing Systems Foundation, 2022.
Series
Advances in Neural Information Processing Systems, ISSN 1049-5258 ; 35
National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-331664
Scopus ID: 2-s2.0-85163176952
OAI: oai:DiVA.org:kth-331664
DiVA, id: diva2:1782390
Conference
36th Conference on Neural Information Processing Systems, NeurIPS 2022, New Orleans, United States of America, Nov 28 2022 - Dec 9 2022
Note

Part of ISBN 9781713871088

QC 20230712

Available from: 2023-07-13 Created: 2023-07-13 Last updated: 2025-02-05 Bibliographically approved
In thesis
1. Synergies between Policy Learning and Sampling-based Planning
2024 (English)
Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Synergier mellan policyinlärning och sampling-baserad planering
Abstract [en]

Recent advances in artificial intelligence and machine learning have significantly impacted the field of robotics and led to the interdisciplinary study of robot learning. These developments have the potential to revolutionize the automation of tasks in various industries by reducing the reliance on human workers. However, fully autonomous, learning-based robotic systems are still mainly limited to controlled environments. Ideally, we are looking for methods that enable autonomous acquisition of robotic skills for any temporally extended setting with potentially complex sensor observations. Classical sampling-based planning algorithms used in robot motion planning compute feasible paths between robot states over long time horizons and even in geometrically complex environments. This thesis investigates the possibility of combining learning-based methods with these classical approaches to solve challenging problems in robot manipulation, e.g. the manipulation of deformable objects. The core idea is to leverage the best of both worlds and achieve long-horizon control through planning, while using learning to obtain useful environment models from potentially high-dimensional and complex observation data. The presented frameworks rely on recent machine learning techniques such as contrastive representation learning, generative modeling and reinforcement learning. Finally, we outline the potential, challenges and limitations of this type of approach and highlight future directions.

Abstract [sv] (translated into English)

Recent advances in artificial intelligence and machine learning have had a significant impact on the field of robotics and have led to the interdisciplinary study of robot learning. These developments have the potential to revolutionize automation in various industries by reducing the dependence on human workers. However, fully autonomous, learning-based robotic systems are still mainly limited to controlled environments. Ideally, we are looking for methods that enable the autonomous acquisition of robotic skills for situations with long time horizons and potentially complex sensor observations. Classical sampling-based planning algorithms used in robot motion planning compute feasible paths between robot states over long time horizons, even in geometrically complex environments. In this work, we investigate the possibility of combining learning-based approaches with these classical approaches to solve challenging problems in robot manipulation, e.g. the handling of deformable objects. The core idea is to leverage the best of both worlds and achieve long-horizon control through planning, while using learning to obtain useful environment models from potentially high-dimensional and complex observation data. The presented frameworks rely on recent machine learning techniques such as contrastive representation learning, generative modeling and reinforcement learning. Finally, we outline the potential, challenges and limitations of this type of approach and highlight future directions.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2024. p. ix, 54
Series
TRITA-EECS-AVL ; 2024:6
Keywords
Machine Learning, Robotics, Reinforcement Learning, Motion Planning, Robotic Manipulation
National Category
Computer graphics and computer vision
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-341911 (URN)
978-91-8040-803-5 (ISBN)
Public defence
2024-01-30, https://kth-se.zoom.us/j/63888939859, F3 (Flodis), Lindstedtsvägen 26 & 28, Stockholm, 15:00 (English)
Note

QC 20240108

Available from: 2024-01-08 Created: 2024-01-05 Last updated: 2025-02-07 Bibliographically approved

Authority records

Gieselmann, Robert; Pokorny, Florian T.
