Machine learning and Neural networks in Fake news detection: A mapping study
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
2022 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Maskininlärning och neurala nätverk inom fake news-detektion : En kartläggning (Swedish)
Abstract [en]

Fake news, or information disorder, is a societal problem that could be partially remedied by automatic detection tools. Although the research field is still young, many such tools have been proposed in the academic literature. This systematic mapping study gives an overview of current research in Natural Language Processing-based fake news detection that uses Machine Learning and Neural Network classification algorithms, with regard to which classification algorithms have been studied and which datasets have been used. Furthermore, we attempt to give a generalised description of the performance (measured in F-score and accuracy) of the most commonly occurring classification algorithms. From a corpus of 124 research articles and other scientific texts we identify 63 different datasets, mainly written in English, and 116 different classification algorithms. The seven most commonly occurring algorithms (Random Forest, Logistic Regression, Support Vector Machine, Decision Tree, Long Short-Term Memory, K-Nearest Neighbors, Convolutional Neural Network) together make up almost 50% of all algorithm occurrences in the article corpus. For each of these seven, the ten occurrences with the best performance are listed. The six most common datasets (ISOT, FakeNewsNet, Patwa 2021, LIAR, Bisaillon, and UTK-MLC) together make up 44% of all dataset occurrences. Apart from English, the represented languages were mainly Chinese (Mandarin), Portuguese, Indonesian, Bangla, and Albanian.

Abstract [sv] (translated from Swedish)

Various types of disinformation (so-called fake news) are a problem for today's society. One of several possible partial solutions to the problem is automated fake news detection. Although this research field is relatively new, a multitude of models for automated fake news detection have been proposed. This systematic mapping study aims to give an overview of current research in Natural Language Processing-based automated fake news detection using classification algorithms from both machine learning and neural networks. The overview covers which classification algorithms and which datasets occur in the research. Furthermore, we attempt to give a general description of the performance of the most commonly occurring classification algorithms, measured in accuracy and F-score. The mapping covers a collection of 124 articles and other scientific texts, from which we identified 63 datasets and 116 different classification algorithms. The seven most commonly occurring algorithms (Random Forest, Logistic Regression, Support Vector Machine, Decision Tree, Long Short-Term Memory, K-Nearest Neighbors, Convolutional Neural Network) together account for 49% of all occurrences in the article collection. We extracted all reported performance results for these seven algorithms and listed the ten best results for each of them. The six most commonly occurring datasets (ISOT, FakeNewsNet, Patwa 2021, LIAR, Bisaillon, and UTK-MLC) together account for 44% of all occurrences. English was by a wide margin the most common language in the datasets; other languages that occurred were Chinese (Mandarin), Portuguese, Indonesian, Bangla, and Albanian.

Place, publisher, year, edition, pages
2022, p. 84
Series
TRITA-EECS-EX ; 2022:205
Keywords [en]
Fake news, Information disorder, Detection, Disinformation, Natural Language Processing, Machine Learning, Neural Networks
Keywords [sv]
Fake news, Desinformation, Detektion, Natural Language Processing, Maskininlärning, Neurala Nätverk
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-318413
OAI: oai:DiVA.org:kth-318413
DiVA, id: diva2:1697557
Subject / course
Computer Science
Educational program
Bachelor of Science in Engineering - Computer Engineering
Supervisors
Examiners
Available from: 2022-09-22
Created: 2022-09-21
Last updated: 2022-09-22
Bibliographically approved

Open Access in DiVA

fulltext (960 kB), 2122 downloads
File information
File name: FULLTEXT01.pdf
File size: 960 kB
Checksum (SHA-512): 26e82274e1bdb6dbe04fc8c78c3ac934a8fa3ae9b6e6ffc5f526cf14be249dd9cbbe118306429c6cd2f1252ed15abc0fb514a1ca7ca9961aeb54cf4d40a91236
Type: fulltext
Mimetype: application/pdf
