Geometric Multimodal Contrastive Representation Learning
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS. ORCID iD: 0000-0001-6920-5109
INESC-ID & Instituto Superior Técnico, University of Lisbon, Portugal.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-3599-440x
INESC-ID & Instituto Superior Técnico, University of Lisbon, Portugal.
Number of Authors: 6
2022 (English)
In: Proceedings of the 39th International Conference on Machine Learning, ICML 2022, ML Research Press, 2022, p. 17782-17800
Conference paper, Published paper (Refereed)
Abstract [en]

Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method with two main components: i) a two-level architecture made up of modality-specific base encoders, which map an arbitrary number of modalities to intermediate representations of fixed dimensionality, and a shared projection head, which maps the intermediate representations to a latent representation space; ii) a multimodal contrastive loss function that encourages the geometric alignment of the learned representations. We experimentally demonstrate that GMC representations are semantically rich and achieve state-of-the-art performance with missing modality information on three different learning problems, including prediction and reinforcement learning tasks.
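
The two components named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single-layer encoders, the tanh nonlinearity, the InfoNCE-style loss, the temperature value, and every name below (`encode`, `contrastive_loss`, `shared_head`, the "image" and "audio" modalities and their sizes) are assumptions chosen for illustration, and the abstract does not specify the exact form of the loss. The sketch shows inputs of different sizes being mapped to a fixed-size intermediate vector, passed through one shared head, and scored with a loss that is low when matching pairs are aligned in the latent space.

```python
import math
import random


def linear(weights, x):
    """Apply a dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def encode(x, base_encoder, shared_head):
    """Level 1: modality-specific encoder to a fixed-size intermediate vector.
    Level 2: shared projection head to a unit-norm latent vector."""
    hidden = [math.tanh(h) for h in linear(base_encoder, x)]
    return l2_normalize(linear(shared_head, hidden))


def contrastive_loss(za, zb, temperature=0.1):
    """InfoNCE-style loss (an assumed form): za[i] should be most similar
    to zb[i] and dissimilar to every zb[j] with j != i."""
    total = 0.0
    for i, anchor in enumerate(za):
        logits = [sum(a * b for a, b in zip(anchor, other)) / temperature
                  for other in zb]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(za)


rng = random.Random(0)
INTERMEDIATE_DIM, LATENT_DIM, BATCH = 8, 4, 3
INPUT_DIMS = {"image": 16, "audio": 5}  # modalities with different input sizes

# One base encoder per modality, one projection head shared by all of them.
encoders = {name: random_matrix(INTERMEDIATE_DIM, dim, rng)
            for name, dim in INPUT_DIMS.items()}
shared_head = random_matrix(LATENT_DIM, INTERMEDIATE_DIM, rng)

batch = {name: [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(BATCH)]
         for name, dim in INPUT_DIMS.items()}
latents = {name: [encode(x, encoders[name], shared_head) for x in xs]
           for name, xs in batch.items()}
loss = contrastive_loss(latents["image"], latents["audio"])
```

Because every modality ends up in the same latent space, a representation can still be computed when only one modality is present at test time, which is the robustness property the abstract emphasises.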

Place, publisher, year, edition, pages
ML Research Press, 2022. p. 17782-17800
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-333348
ISI: 000900064907043
Scopus ID: 2-s2.0-85153911322
OAI: oai:DiVA.org:kth-333348
DiVA, id: diva2:1784975
Conference
39th International Conference on Machine Learning, ICML 2022, Baltimore, United States of America, Jul 17 2022 - Jul 23 2022
Note

QC 20230801

Available from: 2023-08-01 Created: 2023-08-01 Last updated: 2023-08-14
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Poklukar, Petra; Yin, Hang; Kragic, Danica
