Style Mining of Electronic Messages for Multiple Authorship Discrimination: First Results
Department of Computer Science, Illinois Institute of Technology. (IIT Linguistic Cognition Laboratory)
Department of Computer Science, Illinois Institute of Technology. (IIT Linguistic Cognition Laboratory)
Department of Computer Science, Illinois Institute of Technology. (IIT Linguistic Cognition Laboratory)
2003 (English). In: KDD '03 Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining / [ed] Getoor et al., New York, NY, USA: ACM, 2003, 475-480 p. Conference paper, Published paper (Refereed)
Abstract [en]

This paper considers the use of computational stylistics for performing authorship attribution of electronic messages, addressing categorization problems with as many as 20 different classes (authors). Effective stylistic characterization of text is potentially useful for a variety of tasks, as language style contains cues regarding the authorship, purpose, and mood of the text, all of which would be useful adjuncts to information retrieval or knowledge-management tasks. We focus here on the problem of determining the author of an anonymous message, based only on the message text. Several multiclass variants of the Winnow algorithm were applied to a vector representation of the message texts to learn models for discriminating different authors. We present results comparing the classification accuracy of the different approaches. The results show that stylistic models can be accurately learned to determine an author's identity.
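The abstract names Winnow but does not say which multiclass variants were used or how messages were turned into vectors. The following is only a minimal sketch of one common construction: a one-vs-rest set of positive Winnow classifiers over binary features, with multiplicative promotion and demotion. The class name `OneVsRestWinnow`, the feature indices, the author labels, the threshold choice, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


class OneVsRestWinnow:
    """Hypothetical sketch: one positive Winnow classifier per author.

    Each message is given as the list of indices of its active (present)
    binary style features. This is not the paper's implementation.
    """

    def __init__(self, n_features: int, authors: Sequence[str], alpha: float = 2.0):
        self.alpha = alpha                    # promotion/demotion factor (> 1)
        self.theta = float(n_features)        # threshold; an assumed, common choice
        # All weights start at 1.0, as in the classic Winnow formulation.
        self.weights: Dict[str, List[float]] = {
            author: [1.0] * n_features for author in authors
        }

    def score(self, author: str, active: Sequence[int]) -> float:
        """Sum of the author's weights over the active features."""
        w = self.weights[author]
        return sum(w[i] for i in active)

    def predict(self, active: Sequence[int]) -> str:
        """Return the author whose classifier scores highest."""
        return max(self.weights, key=lambda a: self.score(a, active))

    def update(self, active: Sequence[int], true_author: str) -> None:
        """Mistake-driven multiplicative update for every per-author classifier."""
        for author, w in self.weights.items():
            fires = self.score(author, active) >= self.theta
            if author == true_author and not fires:
                # False negative: promote the weights of the active features.
                for i in active:
                    w[i] *= self.alpha
            elif author != true_author and fires:
                # False positive: demote the weights of the active features.
                for i in active:
                    w[i] /= self.alpha


def train(model: OneVsRestWinnow,
          data: Sequence[Tuple[List[int], str]],
          epochs: int = 5) -> None:
    """Run several passes over the labelled messages."""
    for _ in range(epochs):
        for active, author in data:
            model.update(active, author)


# Toy data (invented): six binary features, feature 4 is shared by both authors.
toy_data = [
    ([0, 1, 4], "alice"),
    ([2, 3, 4], "bob"),
]

model = OneVsRestWinnow(n_features=6, authors=["alice", "bob"])
train(model, toy_data)

print(model.predict([0, 1, 4]))
print(model.predict([2, 3, 4]))
```

Updates happen only on mistakes and are multiplicative, which is the property that makes Winnow attractive when the feature space is large and most features are irrelevant to a given author.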

Place, publisher, year, edition, pages
New York, NY, USA: ACM, 2003. 475-480 p.
Keyword [en]
authorship attribution, computational stylistics, electronic communication, text categorization, text mining
National Category
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-38215
DOI: 10.1145/956750.956805
ISBN: 1-58113-737-0 (print)
OAI: oai:DiVA.org:kth-38215
DiVA: diva2:436263
Conference
The Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Available from: 2011-08-22. Created: 2011-08-22. Last updated: 2011-08-24. Bibliographically approved.

By author/editor
Saric, Marin