Learning Geometric Representations of Objects via Interaction
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Collaborative Autonomous Systems. ORCID iD: 0000-0001-8938-9363
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Collaborative Autonomous Systems.
University of Copenhagen, Copenhagen, Denmark.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-0900-1523
2023 (English). In: Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2023, Proceedings, Springer Nature, 2023, p. 629-644. Conference paper, Published paper (Refereed)
Abstract [en]

We address the problem of learning representations from observations of a scene involving an agent and an external object the agent interacts with. To this end, we propose a representation learning framework that extracts the location in physical space of both the agent and the object from unstructured observations of arbitrary nature. Our framework relies on the actions performed by the agent as the only source of supervision, while assuming that the object is displaced by the agent via unknown dynamics. We provide a theoretical foundation and formally prove that an ideal learner is guaranteed to infer an isometric representation, disentangling the agent from the object and correctly extracting their locations. We empirically evaluate our framework on a variety of scenarios, showing that it outperforms vision-based approaches such as a state-of-the-art keypoint extractor. Moreover, we demonstrate how the extracted representations enable the agent to solve downstream tasks efficiently via reinforcement learning.

Place, publisher, year, edition, pages
Springer Nature, 2023. p. 629-644
Keywords [en]
Equivariance, Interaction, Representation Learning
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-339271
DOI: 10.1007/978-3-031-43421-1_37
ISI: 001156141200037
Scopus ID: 2-s2.0-85174436596
OAI: oai:DiVA.org:kth-339271
DiVA, id: diva2:1809749
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2023, Turin, Italy, Sep 18 2023 - Sep 22 2023
Note

Part of ISBN 9783031434204

QC 20231106

Available from: 2023-11-06. Created: 2023-11-06. Last updated: 2025-02-01. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Reichlin, Alfredo; Marchetti, Giovanni Luca; Varava, Anastasiia; Kragic, Danica
