kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Raising Body Ownership in End-to-End Visuomotor Policy Learning via Robot-Centric Pooling
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
Aalto University, Intelligent Robotics Group, Department of Electrical Engineering and Automation (EEA), Espoo, Finland.
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS.ORCID iD: 0000-0003-2965-2953
2024 (English)In: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024, Institute of Electrical and Electronics Engineers (IEEE) , 2024, p. 7514-7520Conference paper, Published paper (Refereed)
Abstract [en]

We present Robot-centric Pooling (RcP), a novel pooling method designed to enhance end-to-end visuomo-tor policies by enabling differentiation between the robots and similar entities or their surroundings. Given an image-proprioception pair, RcP guides the aggregation of image features by highlighting image regions correlating with the robot's proprioceptive states, thereby extracting robot-centric image representations for policy learning. Leveraging contrastive learning techniques, RcP integrates seamlessly with existing visuomotor policy learning frameworks and is trained jointly with the policy using the same dataset, requiring no extra data collection involving self-distractors. We evaluate the proposed method with reaching tasks in both simulated and real-world settings. The results demonstrate that RcP significantly enhances the policies' robustness against various unseen distractors, including self-distractors, positioned at different locations. Additionally, the inherent robot-centric characteristic of RcP enables the learnt policy to be far more resilient to aggressive pixel shifts compared to the baselines. Code available at: https://github.com/Zheyu-Zhuang/RcP

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE) , 2024. p. 7514-7520
National Category
Computer graphics and computer vision Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-359872DOI: 10.1109/IROS58592.2024.10802462ISI: 001433985300002Scopus ID: 2-s2.0-85216503351OAI: oai:DiVA.org:kth-359872DiVA, id: diva2:1937181
Conference
2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024, Abu Dhabi, United Arab Emirates, Oct 14 2024 - Oct 18 2024
Note

Part of ISBN 9798350377705]

QC 20250213

Available from: 2025-02-12 Created: 2025-02-12 Last updated: 2025-04-30Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records

Zhuang, ZheyuKragic, Danica

Search in DiVA

By author/editor
Zhuang, ZheyuKragic, Danica
By organisation
Robotics, Perception and Learning, RPLCentre for Autonomous Systems, CAS
Computer graphics and computer visionComputer Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 23 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf