Authorship Attribution with Neural Networks: A study of the effects of sample size
KTH, School of Computer Science and Communication (CSC).
2015 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Författarigenkänning med Neurala Nätverk (Swedish)
Abstract [en]

Authorship attribution is a classification problem resting on the assumption that each author has a unique, quantifiable writing style, so that the author of a given text can be determined from style markers in that text. Earlier authorship attribution work focused on books and essays, but the focus has since shifted to short electronic texts such as emails and tweets, which raises the issue of sample size. In this thesis we examine how sample size affects classification accuracy by training a neural network on progressively smaller samples. With samples of 4500 words each, we achieve an accuracy of 98%. The accuracy drops to 70% for samples of 250 words each. These results underline the importance of scalable style markers.

Place, publisher, year, edition, pages
2015.
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:kth:diva-173491
OAI: oai:DiVA.org:kth-173491
DiVA: diva2:853249
Available from: 2017-12-28 Created: 2015-09-11 Last updated: 2018-01-11
Bibliographically approved

Open Access in DiVA

No full text
