kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Neonatal Sepsis Detection With Random Forest Classification for Heavily Imbalanced Data
KTH, School of Electrical Engineering and Computer Science (EECS).
2022 (English)Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

Neonatal sepsis is associated with most cases ofmortality in the neonatal intensive care unit. Major challengesin detecting sepsis using suitable biomarkers has lead people tolook for alternative approaches in the form of Machine Learningtechniques. In this project, Random Forest classification wasperformed on a sepsis data set provided by Karolinska Hospital.We particularly focused on tackling class imbalance in the datausing sampling and cost-sensitive techniques. We compare theclassification performances of Random Forests in six differentsetups; four using oversampling and undersampling techniques;one using cost-sensitive learning and one basic Random Forest.The performance with the oversampling techniques were betterand could identify more sepsis patients than the other setups.The overall performances were also good, making the methodspotentially useful in practice.

Abstract [sv]

Neonatal sepsis är orsaken till majoriteten av mortaliteten i neonatal intensivvården. Svårigheten i att detektera sepsis med hjälp av biomarkörer har lett många att leta efter alternativa metoder. Maskininlärningstekniker är en sådan alternativ metod som har i senaste tider ökat i användning inom vård och andra sektorer. I detta project användes Random Forest klassifikations algoritmen på en sepsis datamängd given av Karolinska Sjukhuset. Vi fokuserade på att hantera klassimbalansen i datan genom att använda olika provtagningsoch kostnadskänsliga metoder. Vi jämförde klassificeringsprestanda för Random Forest med sex olika inställningar; fyra av de använde provtagingsmetoderna; en av de använde en kostnadskänslig metod och en var en vanlig Random Forest. Det visade sig att modellens prestanda ökade som mest med översamplings metoderna. Den generella klassificeringsprestandan var också bra, vilket gör Random Forests tillsammans med ingsmetoderna potentiellt användbar i praktiken.

Place, publisher, year, edition, pages
2022. , p. 615-619
Series
TRITA-EECS-EX ; 2022:176
Keywords [en]
Random Forest, Neonatal Sepsis, Imbalanced Classification, Cost-sensitive, SMOTE, ADASYN, CNN, Tomek- Links
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-323733OAI: oai:DiVA.org:kth-323733DiVA, id: diva2:1736058
Supervisors
Examiners
Projects
Kandidatexjobb i elektroteknik 2022, KTH, StockholmAvailable from: 2023-02-10 Created: 2023-02-10

Open Access in DiVA

fulltext(146281 kB)309 downloads
File information
File name FULLTEXT01.pdfFile size 146281 kBChecksum SHA-512
6ef8ac5f57bfa731be6b63752e35a8ad576eee3fa90434c6241186b62dff01f689b0a22454b6d600dd2dfa06cfe9b879bd9a8193673765e6a41ac9b51260faf0
Type fulltextMimetype application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 309 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 232 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf