On Label Noise in Image Classification: An Aleatoric Uncertainty Perspective
Englesson, Erik. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-4535-2520
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Deep neural networks and large-scale datasets have revolutionized the field of machine learning. However, these large networks are susceptible to overfitting to label noise, resulting in degraded generalization. In response, this thesis examines the problem closely from both an empirical and a theoretical perspective. We empirically analyse the input smoothness of networks as they overfit to label noise, and we theoretically explore the connection to aleatoric uncertainty. These analyses improve our understanding of the problem and have led to our novel methods aimed at enhancing robustness against label noise in classification.
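
For readers unfamiliar with the setting, the sketch below shows how the kind of synthetic label noise studied in this line of work is commonly simulated: a fraction of training labels is flipped uniformly at random to another class. This is an illustrative example, not code from the thesis; the function name and parameters are made up.

```python
import numpy as np

def inject_symmetric_label_noise(labels, noise_rate, num_classes, seed=0):
    """Flip a `noise_rate` fraction of labels to a different class chosen uniformly at random."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip = rng.random(len(labels)) < noise_rate
    offsets = rng.integers(1, num_classes, size=flip.sum())  # shift by 1..num_classes-1
    labels[flip] = (labels[flip] + offsets) % num_classes    # never maps back to the true class
    return labels

# Example: 40% symmetric noise on labels 0..9.
noisy = inject_symmetric_label_noise(np.repeat(np.arange(10), 5), noise_rate=0.4, num_classes=10)
```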

Abstract [sv]

Djupa neurala nätverk och storskaliga dataset har revolutionerat maskininlärningsområdet. Dock är dessa stora nätverk känsliga för överanpassning till felmarkerade etiketter, vilket leder till försämrad generalisering. Som svar på detta undersöker avhandlingen noggrant problemet både från en empirisk och teoretisk synvinkel. Vi analyserar empiriskt nätverkens känslighet för små ändringar i indatan när de överanpassar till felmarkerade etiketter, och vi utforskar teoretiskt kopplingen till aleatorisk osäkerhet. Dessa analyser förbättrar vår förståelse av problemet och har lett till våra nya metoder med syfte att vara robusta mot felmarkerade etiketter i klassificering.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024, p. xi, 68
Publication channel
978-91-8040-925-4
Series
TRITA-EECS-AVL ; 2024:45
Keywords [en]
Label noise, aleatoric uncertainty, noisy labels, robustness
Keywords [sv]
etikettbrus, osäkerhet, felmarkerade etiketter, robusthet
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-346453, ISBN: 978-91-8040-925-4 (print), OAI: oai:DiVA.org:kth-346453, DiVA id: diva2:1858223
Public defence
2024-06-03, https://kth-se.zoom.us/w/61097277235, F3 (Flodis), Lindstedsvägen 26 & 28, Stockholm, 09:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20240516

Available from: 2024-05-16. Created: 2024-05-16. Last updated: 2025-12-03. Bibliographically approved
List of papers
1. Deep Double Descent via Smooth Interpolation
2023 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2023, no. 4. Article in journal (Refereed). Published
Abstract [en]

The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the ground-truth signal, thus preserving generalization ability. At present, a precise characterization of the relationship between interpolation and generalization for deep networks is missing. In this work, we quantify the sharpness of fit of the training data interpolated by neural network functions, by studying the loss landscape with respect to the input variable locally around each training point, over volumes around cleanly- and noisily-labelled training samples, as we systematically increase the number of model parameters and training epochs. Our findings show that loss sharpness in the input space follows both model- and epoch-wise double descent, with worse peaks observed around noisy labels. While small interpolating models sharply fit both clean and noisy data, large interpolating models express a smooth loss landscape, where noisy targets are predicted over large volumes around training data points, in contrast to existing intuition.
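
As a rough illustration of the kind of measurement described above (not the paper's actual code), one can estimate input-space loss sharpness at a training point by comparing the loss at the point with the average loss over random perturbations in a small neighbourhood. The function and parameter names below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def input_loss_sharpness(model, x, y, radius=0.1, num_samples=32):
    """Average increase in the per-example loss over random perturbations of the input x."""
    model.eval()
    with torch.no_grad():
        base = F.cross_entropy(model(x.unsqueeze(0)), y.view(1))
        perturbed = []
        for _ in range(num_samples):
            delta = radius * torch.randn_like(x)  # isotropic Gaussian perturbation
            perturbed.append(F.cross_entropy(model((x + delta).unsqueeze(0)), y.view(1)))
        return (torch.stack(perturbed).mean() - base).item()
```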

Place, publisher, year, edition, pages
Transactions on Machine Learning Research (TMLR), 2023
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-346450 (URN), 2-s2.0-86000152632 (Scopus ID)
Note

QC 20250320

Available from: 2024-05-15. Created: 2024-05-15. Last updated: 2025-03-20. Bibliographically approved
2. Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels
2021 (English). In: Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021), NIPS, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Prior works have found it beneficial to combine provably noise-robust loss functions, e.g. mean absolute error (MAE), with standard categorical loss functions, e.g. cross-entropy (CE), to improve their learnability. Here, we propose to use the Jensen-Shannon divergence as a noise-robust loss function and show that it interestingly interpolates between CE and MAE with a controllable mixing parameter. Furthermore, we make the crucial observation that CE exhibits lower consistency around noisy data points. Based on this observation, we adopt a generalized version of the Jensen-Shannon divergence for multiple distributions to encourage consistency around data points. Using this loss function, we show state-of-the-art results on both synthetic (CIFAR) and real-world (e.g. WebVision) noise with varying noise rates.
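
A minimal sketch of the two-distribution case of such a loss is given below: the Jensen-Shannon divergence between the one-hot label and the model's prediction, with a mixing weight pi. The paper's generalized multi-distribution form (over several augmented predictions) and its normalization are not reproduced here; names and defaults are illustrative.

```python
import torch
import torch.nn.functional as F

def js_loss(logits, targets, pi=0.5, num_classes=10, eps=1e-12):
    """Jensen-Shannon divergence between the one-hot label and the predicted distribution."""
    p = F.one_hot(targets, num_classes).float()   # label distribution
    q = F.softmax(logits, dim=1)                  # predicted distribution
    m = pi * p + (1.0 - pi) * q                   # mixture distribution
    def kl(a, b):
        return (a * (a.clamp_min(eps).log() - b.clamp_min(eps).log())).sum(dim=1)
    return (pi * kl(p, m) + (1.0 - pi) * kl(q, m)).mean()
```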

Place, publisher, year, edition, pages
NIPS, 2021
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-305931 (URN), 000922928202022, 2-s2.0-85124784124 (Scopus ID)
Conference
35th Conference on Neural Information Processing Systems, NeurIPS 2021, Virtual, Online, Dec 6 2021 - Dec 14 2021
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of proceedings ISBN 978-171384539-3 

QC 20211220

Available from: 2021-12-09. Created: 2021-12-09. Last updated: 2025-02-07. Bibliographically approved
3. Efficient Evaluation-Time Uncertainty Estimation by Improved Distillation
2019 (English). Conference paper, Poster (with or without abstract) (Refereed)
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-260511 (URN)
Conference
International Conference on Machine Learning (ICML) Workshops, 2019 Workshop on Uncertainty and Robustness in Deep Learning
Note

QC 20191001

Available from: 2019-09-30. Created: 2019-09-30. Last updated: 2025-02-07. Bibliographically approved
4. Logistic-Normal Likelihoods for Heteroscedastic Label Noise
2023 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2023, no. 8. Article in journal (Refereed). Published
Abstract [en]

A natural way of estimating heteroscedastic label noise in regression is to model the observed (potentially noisy) target as a sample from a normal distribution, whose parameters can be learned by minimizing the negative log-likelihood. This formulation has desirable loss attenuation properties, as it reduces the contribution of high-error examples. Intuitively, this behavior can improve robustness against label noise by reducing overfitting. We propose an extension of this simple and probabilistic approach to classification that has the same desirable loss attenuation properties. Furthermore, we discuss and address some practical challenges of this extension. We evaluate the effectiveness of the method by measuring its robustness against label noise in classification. We perform enlightening experiments exploring the inner workings of the method, including sensitivity to hyperparameters, ablation studies, and other insightful analyses.
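
The regression starting point described above can be sketched as follows: the network predicts a mean and a log-variance per example and is trained with the Gaussian negative log-likelihood, so examples with large residuals can be down-weighted by inflating the predicted variance. This is an illustrative sketch only; the paper's logistic-normal extension to classification is not shown.

```python
import torch

def gaussian_nll(mean, log_var, target):
    """0.5 * (log sigma^2 + (y - mu)^2 / sigma^2), averaged over the batch."""
    return 0.5 * (log_var + (target - mean) ** 2 / log_var.exp()).mean()

# Loss attenuation: the same residual contributes less when the predicted variance is larger.
mu, y = torch.tensor([0.0]), torch.tensor([5.0])
print(gaussian_nll(mu, torch.tensor([0.0]), y))  # small predicted variance -> large loss
print(gaussian_nll(mu, torch.tensor([3.0]), y))  # large predicted variance -> attenuated loss
```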

Place, publisher, year, edition, pages
Transactions on Machine Learning Research (TMLR), 2023
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-346451 (URN), 2-s2.0-86000109470 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20250325

Available from: 2024-05-15. Created: 2024-05-15. Last updated: 2025-03-25. Bibliographically approved
5. Robust Classification via Regression for Learning with Noisy Labels
2024 (English). In: Proceedings of ICLR 2024 - The Twelfth International Conference on Learning Representations, 2024. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Deep neural networks and large-scale datasets have revolutionized the field of machine learning. However, these large networks are susceptible to overfitting to label noise, resulting in reduced generalization. To address this challenge, two promising approaches have emerged: i) loss reweighting, which reduces the influence of noisy examples on the training loss, and ii) label correction that replaces noisy labels with estimated true labels. These directions have been pursued separately or combined as independent methods, lacking a unified approach. In this work, we present a unified method that seamlessly combines loss reweighting and label correction to enhance robustness against label noise in classification tasks. Specifically, by leveraging ideas from compositional data analysis in statistics, we frame the problem as a regression task, where loss reweighting and label correction can naturally be achieved with a shifted Gaussian label noise model. Our unified approach achieves strong performance compared to recent baselines on several noisy labelled datasets. We believe this work is a promising step towards robust deep learning in the presence of label noise. Our code is available at: https://github.com/ErikEnglesson/SGN.
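
As a hypothetical illustration of the regression framing (not the paper's actual method; see the linked repository for that), one can smooth the one-hot label so it contains no zeros, map it to an unconstrained vector with a centred log-ratio transform, and fit it with a squared-error (Gaussian) loss. The transform choice, smoothing value, and function names below are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def clr_target(targets, num_classes, smoothing=0.1):
    """Centred log-ratio transform of a label-smoothed one-hot vector."""
    p = F.one_hot(targets, num_classes).float()
    p = (1.0 - smoothing) * p + smoothing / num_classes   # remove zeros before taking logs
    log_p = p.log()
    return log_p - log_p.mean(dim=1, keepdim=True)        # unconstrained regression target

def classification_as_regression_loss(outputs, targets, num_classes=10):
    """Squared-error (Gaussian) loss against the transformed label."""
    return F.mse_loss(outputs, clr_target(targets, num_classes))
```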

Keywords
label noise, noisy labels, robustness, Gaussian noise, classification, log-ratio transform, compositional data analysis
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-346452 (URN)
Conference
ICLR 2024 - The Twelfth International Conference on Learning Representations, Messe Wien Exhibition and Congress Center, Vienna, Austria, May 7-11, 2024
Note

QC 20240515

Available from: 2024-05-15. Created: 2024-05-15. Last updated: 2024-05-16. Bibliographically approved

Open Access in DiVA

Fulltext: SUMMARY01.pdf (2863 kB, application/pdf)

Authority records

Englesson, Erik
