Inferring the location of authors from words in their texts
KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS. ORCID iD: 0000-0003-4042-4919
Stockholms universitet.
Stockholms universitet.
2015 (English). In: Proceedings of the 20th Nordic Conference of Computational Linguistics, Linköping University Electronic Press, 2015. Conference paper (Refereed)
Abstract [en]

For the purposes of computational dialectology or other geographically bound text analysis tasks, texts must be annotated with their location or that of their authors. Many texts are locatable, but most have no explicit annotation of place. This paper describes a series of experiments to determine how positionally annotated microblog posts can be used to learn location-indicating words, which can then be used to locate blog texts and their authors. A Gaussian distribution is used to model the locational qualities of words. We introduce the notion of placeness to describe how locational words are.
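As a rough illustration of modelling a word's location with a Gaussian, the sketch below fits an axis-aligned 2D Gaussian to the coordinates of posts containing a word and derives a spread-based score. This is not the paper's formulation: the function names, the `placeness` formula, and the sample coordinates are all hypothetical, and latitude and longitude are treated as planar values for simplicity.

```python
import math
from statistics import fmean, pvariance


def fit_word_gaussian(coords):
    """Fit an axis-aligned 2D Gaussian to (lat, lon) pairs observed for one word.

    Returns (mean, variance), each a (lat, lon) tuple.
    """
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    mean = (fmean(lats), fmean(lons))
    var = (pvariance(lats), pvariance(lons))
    return mean, var


def placeness(var, eps=1e-6):
    """Hypothetical score: higher when a word's occurrences cluster tightly."""
    return 1.0 / (math.sqrt(var[0] + var[1]) + eps)


# Made-up coordinates: one word seen in a small area, one seen across a wide area.
local_coords = [(59.3, 18.0), (59.4, 18.1), (59.2, 17.9)]
spread_coords = [(55.6, 13.0), (59.3, 18.0), (65.6, 22.1)]

local_mean, local_var = fit_word_gaussian(local_coords)
spread_mean, spread_var = fit_word_gaussian(spread_coords)
```

Under these assumptions, the tightly clustered word receives the higher score, which is the intuition behind ranking words by how locational they are.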

We find that modelling word distributions to account for several locations, and thus several Gaussian distributions per word, defining a filter which picks out words with high placeness based on their local distributional context, and aggregating locational information in a centroid for each text gives the most useful results. The results are applied to data in the Swedish language.
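The filtering and centroid steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `locate_text`, the `min_placeness` threshold, the score weighting, and the toy `word_models` table are hypothetical, and each word is given a single location even though the abstract describes several Gaussians per word.

```python
def locate_text(tokens, word_models, min_placeness=1.0):
    """Estimate a text's location as a weighted centroid of its locational words.

    word_models maps a word to ((lat, lon), placeness_score). Words without a
    model, or with a score below min_placeness, are ignored.
    Returns (lat, lon), or None if no word passes the filter.
    """
    total = lat_sum = lon_sum = 0.0
    for tok in tokens:
        model = word_models.get(tok)
        if model is None:
            continue
        (lat, lon), score = model
        if score < min_placeness:
            continue  # filtered out: not locational enough
        lat_sum += score * lat
        lon_sum += score * lon
        total += score
    if total == 0.0:
        return None
    return lat_sum / total, lon_sum / total


# Toy models with made-up locations and scores.
word_models = {
    "stockholm": ((59.3, 18.1), 8.0),
    "göteborg": ((57.7, 12.0), 8.0),
    "och": ((62.0, 15.0), 0.2),  # common word, low score, filtered out
}

estimate = locate_text(["jag", "bor", "i", "stockholm", "och", "göteborg"], word_models)
no_estimate = locate_text(["jag", "och"], word_models)
```

Weighting by score is one possible design choice; an unweighted mean over the filtered words would be an equally simple alternative.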

Place, publisher, year, edition, pages
Linköping University Electronic Press, 2015.
Linköping Electronic Conference Proceedings, ISSN 1650-3740 ; 109
National Category
General Language Studies and Linguistics
Research subject
Information and Communication Technology
URN: urn:nbn:se:kth:diva-169619
ISBN: 978-91-7519-098-3
OAI: diva2:823404
NoDaLiDa, May 11–13, 2015, in Vilnius, Lithuania
SINUS (Spridning av innovationer i nutida svenska)
Swedish Research Council

Qc 20150618

Available from: 2015-06-18. Created: 2015-06-18. Last updated: 2015-06-18. Bibliographically approved

By author/editor
Karlgren, Jussi