The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. Apple. ORCID iD: 0000-0002-0862-1333
Apple.
Apple.
Apple.
2023 (English). In: Proceedings of the 40th International Conference on Machine Learning, ICML 2023, ML Research Press, 2023, p. 29143-29160. Conference paper, Published paper (Refereed)
Abstract [en]

The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients.
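As a rough illustration of the kind of bound the abstract describes, the sketch below checks numerically that an entropy term plus a reconstruction term lower-bounds the MI on a small discrete example. It uses the standard variational form H(Z2) + E[log q(Z2 | Z1)]; that this matches the paper's exact ER formulation is an assumption, since the record does not state the bound. The function names, the 4x4 toy distribution, and the random predictor q are hypothetical and chosen only for illustration.

```python
import numpy as np


def mutual_information(joint):
    """Exact MI (in nats) of a discrete joint distribution p(z1, z2)."""
    p1 = joint.sum(axis=1, keepdims=True)  # marginal of Z1, shape (n, 1)
    p2 = joint.sum(axis=0, keepdims=True)  # marginal of Z2, shape (1, m)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (p1 @ p2)[mask])))


def er_bound(joint, q_cond):
    """Entropy plus reconstruction: H(Z2) + E_p[log q(Z2 | Z1)].

    q_cond[i, j] is an arbitrary predictor q(z2 = j | z1 = i); this form is
    the standard variational bound, assumed here to mirror the ER bound.
    """
    p2 = joint.sum(axis=0)
    entropy = -float(np.sum(p2[p2 > 0] * np.log(p2[p2 > 0])))
    mask = joint > 0
    reconstruction = float(np.sum(joint[mask] * np.log(q_cond[mask])))
    return entropy + reconstruction


rng = np.random.default_rng(0)

# Toy joint distribution over two discrete "views" (hypothetical data).
joint = rng.random((4, 4))
joint /= joint.sum()

# The true conditional p(z2 | z1) and an arbitrary, imperfect predictor q.
true_cond = joint / joint.sum(axis=1, keepdims=True)
q_rand = rng.random((4, 4)) + 0.1
q_rand /= q_rand.sum(axis=1, keepdims=True)

mi = mutual_information(joint)
bound_loose = er_bound(joint, q_rand)     # never exceeds the MI
bound_tight = er_bound(joint, true_cond)  # equals the MI

print(f"MI = {mi:.4f}, loose bound = {bound_loose:.4f}, tight bound = {bound_tight:.4f}")
```

The gap between the MI and the bound is the expected KL divergence between the true conditional and q, so the bound is tight exactly when the predictor recovers p(z2 | z1). Under this reading, improving the reconstruction term tightens the bound, while the entropy term must be kept from shrinking for the bound to stay informative.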

Place, publisher, year, edition, pages
ML Research Press, 2023. p. 29143-29160
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-350170
Scopus ID: 2-s2.0-85174395730
OAI: oai:DiVA.org:kth-350170
DiVA, id: diva2:1883223
Conference
40th International Conference on Machine Learning, ICML 2023, Honolulu, United States of America, Jul 23 2023 - Jul 29 2023
Note

QC 20240709

Available from: 2024-07-09. Created: 2024-07-09. Last updated: 2025-02-07. Bibliographically approved.

Authority records

Rodríguez Gálvez, Borja
