kth.se Publications
Publications (10 of 14)
Medbouhi, A. A., Marchetti, G. L., Polianskii, V., Kravberg, A., Poklukar, P., Varava, A. & Kragic, D. (2024). Hyperbolic Delaunay Geometric Alignment. In: Bifet, A., Krilavicius, T., Davis, J., Kull, M., Ntoutsi, E., Zliobaite, I. (Eds.), Machine Learning and Knowledge Discovery in Databases: Research Track, Part III, ECML PKDD 2024. Paper presented at the Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), SEP 09-13, 2024, Vilnius, Lithuania (pp. 111-126). Springer Nature
2024 (English) In: Machine Learning and Knowledge Discovery in Databases: Research Track, Part III, ECML PKDD 2024 / [ed] Bifet, A., Krilavicius, T., Davis, J., Kull, M., Ntoutsi, E., Zliobaite, I., Springer Nature, 2024, p. 111-126. Conference paper, Published paper (Refereed)
Abstract [en]

Hyperbolic machine learning is an emerging field aimed at representing data with a hierarchical structure. However, there is a lack of tools for evaluating and analysing the resulting hyperbolic data representations. To this end, we propose Hyperbolic Delaunay Geometric Alignment (HyperDGA), a similarity score for comparing datasets in a hyperbolic space. The core idea is to count the edges of the hyperbolic Delaunay graph that connect datapoints across the given sets. We provide an empirical investigation on synthetic and real-life biological data and demonstrate that HyperDGA outperforms the hyperbolic version of classical distances between sets. Furthermore, we showcase the potential of HyperDGA for evaluating latent representations inferred by a Hyperbolic Variational Auto-Encoder.
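The cross-edge counting idea can be illustrated with a minimal sketch. To keep it self-contained, the hyperbolic Delaunay graph is replaced here by a Euclidean Gabriel graph (a subgraph of the Delaunay graph that is easy to compute directly), so this only demonstrates the counting principle, not the authors' HyperDGA implementation; `gabriel_edges` and `cross_edge_score` are hypothetical helper names.

```python
import numpy as np

def gabriel_edges(points):
    """Edges of the Gabriel graph (a subgraph of the Delaunay graph):
    (i, j) is an edge iff no third point lies strictly inside the circle
    whose diameter is the segment between points i and j."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            mid = (points[i] + points[j]) / 2.0
            r2 = np.sum((points[i] - points[j]) ** 2) / 4.0
            blocked = any(
                np.sum((points[k] - mid) ** 2) < r2
                for k in range(n) if k != i and k != j
            )
            if not blocked:
                edges.append((i, j))
    return edges

def cross_edge_score(X, Y):
    """Fraction of graph edges that connect a point of X to a point of Y.
    Interleaved sets yield a high fraction; well-separated sets a low one."""
    pts = np.vstack([X, Y])
    n_x = len(X)
    edges = gabriel_edges(pts)
    cross = sum(1 for i, j in edges if (i < n_x) != (j < n_x))
    return cross / len(edges)
```

Two samples drawn from the same distribution share many cross edges, while a distant cluster connects through only a few bridging edges, so its score is much lower.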

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Artificial Intelligence, ISSN 2945-9133 ; 14943
Keywords
Hyperbolic Geometry, Hierarchical Data, Evaluation
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-355149 (URN), 10.1007/978-3-031-70352-2_7 (DOI), 001308375900007 ()
Conference
Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), SEP 09-13, 2024, Vilnius, Lithuania
Note

Part of ISBN: 978-3-031-70351-5, 978-3-031-70352-2

QC 20241025

Available from: 2024-10-25. Created: 2024-10-25. Last updated: 2024-10-25. Bibliographically approved.
Lippi, M., Poklukar, P., Welle, M. C., Varava, A., Yin, H., Marino, A. & Kragic, D. (2023). Enabling Visual Action Planning for Object Manipulation Through Latent Space Roadmap. IEEE Transactions on Robotics, 39(1), 57-75
2023 (English) In: IEEE Transactions on Robotics, ISSN 1552-3098, E-ISSN 1941-0468, Vol. 39, no. 1, p. 57-75. Article in journal (Refereed), Published
Abstract [en]

In this article, we present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, focusing on manipulation of deformable objects. We propose a latent space roadmap (LSR) for task planning: a graph-based structure that globally captures the system dynamics in a low-dimensional latent space. Our framework consists of three parts. First, a mapping module (MM) maps observations, given in the form of images, into a structured latent space, extracting the respective states, and also generates observations from latent states. Second, the LSR builds and connects clusters containing similar states in order to find latent plans between the start and goal states extracted by the MM. Third, an action proposal module complements the latent plan found by the LSR with the corresponding actions. We present a thorough investigation of our framework on simulated box stacking and rope/box manipulation tasks, and on a folding task executed on a real robot.
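The roadmap idea (cluster similar latent states into nodes, connect clusters linked by observed transitions, then search the graph for a plan) can be sketched in a few lines. This is a toy illustration under simplifying assumptions (greedy radius clustering, breadth-first search), not the paper's actual LSR algorithm; `build_lsr` and `plan` are hypothetical names.

```python
import numpy as np
from collections import deque

def build_lsr(latents, transitions, eps):
    """Toy latent-space roadmap: greedily assign each latent code to the
    first cluster centre within radius eps (or start a new cluster), then
    add a graph edge between two clusters whenever an observed transition
    connects a member of one to a member of the other."""
    centers, assign = [], []
    for z in latents:
        for c, mu in enumerate(centers):
            if np.linalg.norm(np.asarray(z, float) - mu) <= eps:
                assign.append(c)
                break
        else:
            centers.append(np.asarray(z, float))
            assign.append(len(centers) - 1)
    edges = {(min(assign[a], assign[b]), max(assign[a], assign[b]))
             for a, b in transitions if assign[a] != assign[b]}
    return centers, assign, edges

def plan(edges, n_clusters, start, goal):
    """Breadth-first search over the roadmap; returns a list of cluster
    indices from start to goal, or None if no path exists."""
    adj = {c: [] for c in range(n_clusters)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    prev, queue = {start: None}, deque([start])
    while queue:
        c = queue.popleft()
        if c == goal:
            path = []
            while c is not None:
                path.append(c)
                c = prev[c]
            return path[::-1]
        for nb in adj[c]:
            if nb not in prev:
                prev[nb] = c
                queue.append(nb)
    return None
```

In the paper the clusters come from a learned latent space and each edge carries a proposed action; here the graph search alone is shown.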

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Deep Learning in Robotics and Automation, Latent Space Planning, Manipulation Planning, Visual Learning, Deep learning, Graphic methods, Job analysis, Planning, Robot programming, Action planning, Deep learning in robotic and automation, Heuristics algorithm, Roadmap, Space planning, Stackings, Task analysis, Heuristic algorithms
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-326180 (URN), 10.1109/TRO.2022.3188163 (DOI), 000829072000001 (), 2-s2.0-85135223386 (Scopus ID)
Note

QC 20230502

Available from: 2023-05-02. Created: 2023-05-02. Last updated: 2025-02-09. Bibliographically approved.
Lippi, M., Welle, M. C., Poklukar, P., Marino, A. & Kragic, D. (2022). Augment-Connect-Explore: a Paradigm for Visual Action Planning with Data Scarcity. In: 2022 IEEE/RSJ international conference on intelligent robots and systems (IROS): . Paper presented at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), OCT 23-27, 2022, Kyoto, JAPAN (pp. 754-761). Institute of Electrical and Electronics Engineers (IEEE)
2022 (English) In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 754-761. Conference paper, Published paper (Refereed)
Abstract [en]

Visual action planning particularly excels in applications where the state of the system cannot be computed explicitly, such as manipulation of deformable objects, as it enables planning directly from raw images. Even though the field has been significantly accelerated by deep learning techniques, a crucial requirement for their success is the availability of a large amount of data. In this work, we propose the Augment-Connect-Explore (ACE) paradigm to enable visual action planning in cases of data scarcity. We build upon the Latent Space Roadmap (LSR) framework, which performs planning with a graph built in a low-dimensional latent space. In particular, ACE is used to i) Augment the available training dataset by autonomously creating new pairs of datapoints, ii) create new unobserved Connections among representations of states in the latent graph, and iii) Explore new regions of the latent space in a targeted manner. We validate the proposed approach on both a simulated box-stacking task and a real-world folding task, showing its applicability to rigid and deformable object manipulation, respectively.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Series
IEEE International Conference on Intelligent Robots and Systems, ISSN 2153-0858
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-325007 (URN), 10.1109/IROS47612.2022.9982199 (DOI), 000908368200076 (), 2-s2.0-85146343603 (Scopus ID)
Conference
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), OCT 23-27, 2022, Kyoto, JAPAN
Note

QC 20230324

Available from: 2023-03-24. Created: 2023-03-24. Last updated: 2025-02-09. Bibliographically approved.
Poklukar, P., Polianskii, V., Varava, A., Pokorny, F. T. & Kragic, D. (2022). Delaunay Component Analysis for Evaluation of Data Representations. In: Proceedings 10th International Conference on Learning Representations, ICLR 2022: . Paper presented at 10th International Conference on Learning Representations, ICLR 2022, Apr 25-29, 2022 (online). International Conference on Learning Representations, ICLR
2022 (English) In: Proceedings of the 10th International Conference on Learning Representations, ICLR 2022, International Conference on Learning Representations, ICLR, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Advanced representation learning techniques require reliable and general evaluation methods. Recently, several algorithms based on the common idea of geometric and topological analysis of a manifold approximated from the learned data representations have been proposed. In this work, we introduce Delaunay Component Analysis (DCA), an evaluation algorithm which approximates the data manifold using a more suitable neighbourhood graph, the Delaunay graph. This provides a reliable manifold estimation even for challenging geometric arrangements of representations, such as clusters with varying shape and density or outliers, which is where existing methods often fail. Furthermore, we exploit the nature of Delaunay graphs and introduce a framework for assessing the quality of individual novel data representations. We experimentally validate the proposed DCA method on representations obtained from neural networks trained with a contrastive objective, from supervised models, and from generative models, and demonstrate various use cases of our extended single-point evaluation framework.

Place, publisher, year, edition, pages
International Conference on Learning Representations, ICLR, 2022
Keywords
Representation Learning, Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-312715 (URN), 2-s2.0-85124640294 (Scopus ID)
Conference
10th International Conference on Learning Representations, ICLR 2022, Apr 25-29, 2022 (online)
Note

QC 20220614

Available from: 2022-05-20. Created: 2022-05-20. Last updated: 2023-09-07. Bibliographically approved.
Poklukar, P., Vasco, M., Yin, H., Melo, F. S., Paiva, A. & Kragic, D. (2022). Geometric Multimodal Contrastive Representation Learning. In: Proceedings of the 39th International Conference on Machine Learning, ICML 2022: . Paper presented at 39th International Conference on Machine Learning, ICML 2022, Baltimore, United States of America, Jul 17 2022 - Jul 23 2022 (pp. 17782-17800). ML Research Press
2022 (English) In: Proceedings of the 39th International Conference on Machine Learning, ICML 2022, ML Research Press, 2022, p. 17782-17800. Conference paper, Published paper (Refereed)
Abstract [en]

Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method consisting of two main components: i) a two-level architecture, with modality-specific base encoders that process an arbitrary number of modalities into intermediate representations of fixed dimensionality, and a shared projection head that maps the intermediate representations to a latent representation space; ii) a multimodal contrastive loss function that encourages geometric alignment of the learned representations. We experimentally demonstrate that GMC representations are semantically rich and achieve state-of-the-art performance with missing modality information on three different learning problems, including prediction and reinforcement learning tasks.
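The alignment objective can be illustrated with a minimal numpy sketch of an InfoNCE-style contrastive loss between modality-specific embeddings and joint embeddings. This shows the general geometric-alignment principle, not the exact GMC loss from the paper; `gmc_loss` is a hypothetical name.

```python
import numpy as np

def gmc_loss(z_mod, z_joint, temperature=0.5):
    """InfoNCE-style alignment loss between modality embeddings z_mod and
    joint embeddings z_joint, both of shape (batch, dim). Row i of z_mod
    is the positive pair of row i of z_joint; all other rows are negatives."""
    z_mod = z_mod / np.linalg.norm(z_mod, axis=1, keepdims=True)
    z_joint = z_joint / np.linalg.norm(z_joint, axis=1, keepdims=True)
    logits = z_mod @ z_joint.T / temperature           # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                # matched pairs lie on the diagonal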

Place, publisher, year, edition, pages
ML Research Press, 2022
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-333348 (URN), 000900064907043 (), 2-s2.0-85153911322 (Scopus ID)
Conference
39th International Conference on Machine Learning, ICML 2022, Baltimore, United States of America, Jul 17 2022 - Jul 23 2022
Note

QC 20230801

Available from: 2023-08-01. Created: 2023-08-01. Last updated: 2023-08-14. Bibliographically approved.
Poklukar, P., Miguel, V., Yin, H., Melo, F. S., Paiva, A. & Kragic, D. (2022). GMC - Geometric Multimodal Contrastive Representation Learning. In: : . Paper presented at International Conference on Machine Learning.
2022 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method comprising two main components: i) a two-level architecture, with modality-specific base encoders that process an arbitrary number of modalities into intermediate representations of fixed dimensionality, and a shared projection head that maps the intermediate representations to a latent representation space; ii) a multimodal contrastive loss function that encourages geometric alignment of the learned representations. We experimentally demonstrate that GMC representations are semantically rich and achieve state-of-the-art performance with missing modality information on three different learning problems, including prediction and reinforcement learning tasks.

Keywords
Representation Learning, Machine Learning, Multimodal, Contrastive Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-312719 (URN)
Conference
International Conference on Machine Learning
Note

QC 20220614

Available from: 2022-05-20. Created: 2022-05-20. Last updated: 2022-06-25. Bibliographically approved.
Poklukar, P. (2022). Learning and Evaluating the Geometric Structure of Representation Spaces. (Doctoral dissertation). KTH Royal Institute of Technology
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Efficient representations of observed input data have been shown to significantly accelerate the performance of subsequent learning tasks in numerous domains. To obtain such representations automatically, we need to design both i) models that identify useful patterns in the input data and encode them into structured low-dimensional representations, and ii) evaluation measures that accurately assess the quality of the resulting representations. In this thesis, we present work that addresses both these requirements, with an extensive focus on requirement ii), since the evaluation of representations has been largely unexplored in machine learning research. We begin with an overview of representation learning techniques and of the different structures that can be imposed on representation spaces, thus first addressing i). In this regard, we present a representation learning model that identifies useful patterns in multimodal data, and describe an approach that promotes a structure on the representation space that is favourable for performing a robotics task. We then thoroughly study the problem of assessing the quality of learned representations and review the pitfalls of current practices. With this, we motivate evaluation based on analysing the geometric properties of representations, and present two novel evaluation algorithms that constitute the core of this thesis. Finally, we present an application of the proposed evaluation algorithms to the comparison of large input graphs.

Abstract [sv]

Effektiva representationer av observerat input-data har visat sig ge en signifikant ökning av prestandan för träningsproblem i ett flertal områden. För att på ett automatiskt sätt få fram sådana representationer behöver vi både i) modeller som kan identifiera användbara mönster i input-datat och koda dessa till strukturerade lågdimensionella representationer, samt ii) utvärderingsmått som på ett tillförlitligt sätt mäter kvaliteten av dessa representationer. I denna avhandling presenterar vi arbete som hanterar båda dessa krav, där fokus ligger på ii) eftersom utvärdering av representationer har varit ett i stort sett outforskat ämne i litteraturen för maskininlärning. Vi börjar med en översikt av representationsinlärningstekniker och typer av strukturer som man kan förelägga på representationsrymden, vilket tillhör i). I detta avseende presenterar vi en modell för representationsinlärning som identifierar användbara mönster från multimodal data, samt beskriver en metod som framhäver struktur på representationsrymden som gör sig väl passande för en robotikuppgift. Vi studerar sedan genomgående problemet med att avgöra kvaliteten av dessa inlärda representationer och ger en översikt av vanliga fallgropar som finns med nuvarande metoder. Vi motiverar med detta utvärderingen baserat på representationernas geometriska egenskaper och presenterar två nya utvärderingsalgoritmer vilka huvuddelen av avhandlingen består av. Slutligen så presenterar vi ett praktiskt användningsområde av algoritmerna för att jämföra stora inputgrafer.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022. p. 54
Series
TRITA-EECS-AVL ; 2022:33
Keywords
Representation Learning, Machine Learning, Generative Models
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-312723 (URN), 978-91-8040-228-6 (ISBN)
Public defence
2022-06-13, https://kth-se.zoom.us/j/65953366981, F3, Lindstedtsvägen 26, Stockholm, 15:00 (English)
Note

QC 20220523

Available from: 2022-05-23. Created: 2022-05-20. Last updated: 2022-06-25. Bibliographically approved.
Ghadirzadeh, A., Poklukar, P., Arndt, K., Finn, C., Kyrki, V., Kragic, D. & Björkman, M. (2022). Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models. Journal of Machine Learning Research, 23
2022 (English) In: Journal of Machine Learning Research, ISSN 1532-4435, E-ISSN 1533-7928, Vol. 23. Article in journal (Refereed), Published
Abstract [en]

We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable, such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem, as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluating generative models, which allows us to predict the performance of RL policy training prior to the actual training on a physical robot. We experimentally determine the characteristics of generative models that have the most influence on the performance of the final policy training on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that, compared to two state-of-the-art RL methods, GenRL is the only method that can safely and efficiently solve the robotics tasks.
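The two-part factorisation of the policy described above can be sketched in a few lines of numpy. This is a toy illustration of the decomposition only, not the paper's implementation; `make_genrl_policy` and the stand-in sub-policy and decoder below are hypothetical.

```python
import numpy as np

def make_genrl_policy(subpolicy, decoder, rng):
    """Two-stage policy in the spirit of GenRL: the sub-policy maps a state
    to a Gaussian (mean, log-std) over a low-dimensional action latent; a
    pre-trained generative decoder maps a latent sample to a full sequence
    of motor actions."""
    def policy(state):
        mu, log_std = subpolicy(state)
        # reparameterised sample of the action latent variable
        latent = mu + np.exp(log_std) * rng.standard_normal(mu.shape)
        return decoder(latent)
    return policy
```

Because exploration happens in the latent space and the decoder only emits sequences it was trained to generate, the sampled motor actions stay within the set of valid behaviours.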

Place, publisher, year, edition, pages
Microtome Publishing, 2022
Keywords
deep generative models, policy search, reinforcement learning, representation learning, robot learning
National Category
Robotics and automation; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-330021 (URN), 10.5555/3586589.3586763 (DOI), 001003314000001 (), 2-s2.0-85147734399 (Scopus ID)
Note

QC 20230628

Available from: 2023-06-29. Created: 2023-06-29. Last updated: 2025-02-05. Bibliographically approved.
Ghadirzadeh, A., Chen, X., Poklukar, P., Finn, C., Björkman, M. & Kragic, D. (2021). Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Paper presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 27 September - 1 October 2021, Prague, Czech Republic (online) (pp. 1274-1280). Institute of Electrical and Electronics Engineers (IEEE)
2021 (English) In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 1274-1280. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform. A policy trained with expensive data is rendered useless after even a minor change to the robot hardware. In this paper, we address the challenging problem of adapting a policy, trained to perform a task, to a novel robotic hardware platform given only a few demonstrations of robot motion trajectories on the target robot. We formulate it as a few-shot meta-learning problem, where the goal is to find a meta-model that captures the common structure shared across different robotic platforms, such that data-efficient adaptation can be performed. We achieve such adaptation by introducing a learning framework consisting of a probabilistic gradient-based meta-learning algorithm that models the uncertainty arising from the few-shot setting with a low-dimensional latent variable. We experimentally evaluate our framework on a simulated reaching task and a real-robot picking task, using 400 simulated robots generated by varying the physical parameters of an existing set of robotic platforms. Our results show that the proposed method can successfully adapt a trained policy to different robotic platforms with novel physical parameters, and demonstrate the superiority of our meta-learning algorithm over state-of-the-art methods on the introduced few-shot policy adaptation problem.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
IEEE International Conference on Intelligent Robots and Systems, ISSN 2153-0858
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-310042 (URN), 10.1109/IROS51168.2021.9636628 (DOI), 000755125501008 (), 2-s2.0-85124371197 (Scopus ID)
Conference
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 27 September - 1 October 2021, Prague, Czech Republic (online)
Note

QC 20220322

Part of proceedings: ISBN 978-1-6654-1714-3

Available from: 2022-03-22. Created: 2022-03-22. Last updated: 2025-02-09. Bibliographically approved.
Poklukar, P., Varava, A. & Kragic, D. (2021). GeomCA: Geometric Evaluation of Data Representations. In: Proceedings of Machine Learning Research: Proceedings of the 38th International Conference on Machine Learning. Paper presented at 38th International Conference on Machine Learning, ICML 2021, Virtual Online, 18-24 July 2021 (pp. 8588-8598). ML Research Press
2021 (English) In: Proceedings of Machine Learning Research: Proceedings of the 38th International Conference on Machine Learning, ML Research Press, 2021, p. 8588-8598. Conference paper, Published paper (Refereed)
Abstract [en]

Evaluating the quality of learned representations without relying on a downstream task remains one of the challenges in representation learning. In this work, we present the Geometric Component Analysis (GeomCA) algorithm, which evaluates representation spaces based on their geometric and topological properties. GeomCA can be applied to representations of any dimension, independently of the model that generated them. We demonstrate its applicability by analysing representations obtained from a variety of scenarios, such as contrastive learning models, generative models and supervised learning models.

Place, publisher, year, edition, pages
ML Research Press, 2021
Keywords
Embedding and Representation learning, Algorithms Evaluation, Generative Models, GeomCA
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-308496 (URN), 2-s2.0-85124639338 (Scopus ID)
Conference
38th International Conference on Machine Learning, ICML 2021, Virtual Online, 18-24 July 2021
Note

QC 20220215

Available from: 2022-02-08. Created: 2022-02-08. Last updated: 2024-07-12. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0001-6920-5109