Predictive Uncertainty Estimates in Batch Normalized Neural Networks
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.

Alternative title
Prediktiva osäkerhetsestimat i neurala nätverk tränade med batch-normalisering (Swedish)
Abstract [en]

Recent developments in Bayesian Learning have made the Bayesian view of parameter estimation applicable to a wider range of models, including Neural Networks. In particular, advancements in Approximate Inference have enabled the development of a number of techniques for performing approximate Bayesian Learning. One recent addition to these models is Monte Carlo Dropout (MCDO), a technique that relies only on Neural Networks being trained with Dropout and L2 weight regularization. This technique provides a practical approach to Bayesian Learning, enabling the estimation of valuable predictive distributions from many models already in use today. In recent years, however, Batch Normalization has become the go-to method for speeding up training and improving generalization. This thesis shows that the MCDO technique can be applied to Neural Networks trained with Batch Normalization through a procedure referred to in this work as Monte Carlo Batch Normalization (MCBN). A quantitative evaluation of the quality of the predictive distributions of different models was performed on nine regression datasets. With no batch size optimization, MCBN is shown to outperform an identical model with constant predictive variance on seven datasets at the 0.05 significance level. Optimizing batch sizes for the remaining datasets resulted in MCBN outperforming the comparative model in one further case. An equivalent evaluation for MCDO showed that MCBN and MCDO yield similar results, suggesting that there is potential in adapting the MCDO technique to the more modern Neural Network architecture provided by Batch Normalization.

Abstract [sv]

Recent advances in Bayesian modeling have made it possible to apply a Bayesian view of parameter estimation to a wider range of models, including neural networks. In particular, advances in approximation techniques have enabled the development of several techniques for approximate Bayesian modeling. A recently proposed addition to these models is Monte Carlo Dropout (MCDO), a technique that only requires neural networks to be trained with dropout and L2 weight regularization. This technique offers a practical approach to Bayesian modeling, enabling the estimation of valuable predictive distributions from many models already in use today. In recent years, however, batch normalization has become established as the standard technique for reducing training time and improving generalization. This thesis shows that the MCDO technique can be adapted to neural networks trained with batch normalization through a technique referred to in this work as Monte Carlo Batch Normalization (MCBN). A quantitative evaluation of the quality of the estimated predictive distributions of different models was performed on nine regression datasets. Without batch size optimization, MCBN showed better results than an identical model with constant variance on seven datasets at the 0.05 significance level. Optimizing the batch size for the remaining datasets resulted in MCBN outperforming the comparative model in one further case. An equivalent evaluation for MCDO showed that MCBN and MCDO yield similar results, which suggests that there is potential in adapting the MCDO technique to the more modern neural network architecture provided by batch normalization.

Place, publisher, year, edition, pages
2019, p. 108
Series
TRITA-EECS-EX ; 2019:844
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-271214
OAI: oai:DiVA.org:kth-271214
DiVA id: diva2:1416025
Educational program
Master of Science - Machine Learning
Available from: 2020-03-20. Created: 2020-03-20. Last updated: 2020-03-20. Bibliographically approved.

Open Access in DiVA

fulltext (2223 kB), 2 downloads

File information
File name: FULLTEXT01.pdf
File size: 2223 kB
Checksum (SHA-512): 6c2ff83a2a9f38f31af36c98a3bf7d815c1f67782fd5723f43ca61609b2ceb6ce956853599a1370f8473b4462a1e4ba662e1cf0eb5c922f6e6998149a0f80b5c
Type: fulltext
Mimetype: application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Computer and Information Sciences

