Tracking Geographical Locations using a Geo-Aware Topic Model for Analyzing Social Media Data
KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS. FOI Swedish Defence Research Agency, Sweden.
KTH, School of Computer Science and Communication (CSC). Google, Inc., United States.
KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS. FOI Swedish Defence Research Agency, Sweden. ORCID iD: 0000-0002-2677-9759
2017 (English)
In: Decision Support Systems, ISSN 0167-9236, E-ISSN 1873-5797, Vol. 99, no SI, p. 18-29
Article in journal (Refereed) Published
Abstract [en]

Tracking how discussion topics evolve in social media, and where these topics are discussed geographically over time, has the potential to provide useful information for many different purposes. In crisis management, knowing a specific topic's current geographical location could provide vital information about where resources should be allocated, or even which ones. This paper describes an attempt to track online discussions geographically over time. A distributed geo-aware streaming latent Dirichlet allocation model was developed for the purpose of recognizing topics' locations in unstructured text. To evaluate the model, it was implemented and used for automatic discovery and geographical tracking of election topics during parts of the 2016 American presidential primary elections. The results showed that the detected locations correlated with the actual election locations, and that the model provides better geolocation classification than a keyword-based approach.

Place, publisher, year, edition, pages
Elsevier, 2017. Vol. 99, no SI, p. 18-29
Keywords [en]
Social media, Topic modeling, Geo-awareness, Trend analysis, Latent Dirichlet allocation, Streaming media
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-210570
DOI: 10.1016/j.dss.2017.05.006
ISI: 000405162500003
Scopus ID: 2-s2.0-85020801622
OAI: oai:DiVA.org:kth-210570
DiVA, id: diva2:1118834
Funder
EU, FP7, Seventh Framework Programme, 312649
Note

QC 20170704

Available from: 2017-07-02 Created: 2017-07-02 Last updated: 2024-05-14 Bibliographically approved
In thesis
1. Toward automated veracity assessment of data from open sources using features and indicators
2024 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation hypothesizes that the key to automated veracity assessment of data from open sources is the careful estimation and extraction of relevant features and indicators. These features and indicators provide added value to a quantifiable veracity assessment, either directly or indirectly. The importance and usefulness of a veracity assessment largely depend on the specific situation and reason for which it is being conducted. Factors such as the recipient of the veracity assessment, the scope of the assessment, and the metrics used to measure accuracy and performance, all play a role in determining the value and perceived quality of the assessment.

Five peer-reviewed publications (two journal articles, two conference articles, and one workshop article) are included in this compilation thesis.

The main contributions of the work presented in this dissertation are: i) a compilation of challenges with manual methods of veracity assessment, ii) a road map for addressing the identified challenges, iii) identification of the state-of-the-art and gap analysis of veracity assessment of open-source data, iv) exploration of indicators such as topic geo-location tracking over time and stance classification, and v) evaluation of various feature types, model transferability, and style obfuscation attacks and the impact on accuracy for automated veracity assessment of a type of deception: fake reviews.

Abstract [sv] (English translation)

This dissertation hypothesizes that the key to automated veracity assessment of data from open sources lies in the careful selection and estimation of relevant features and indicators. These features and indicators add direct or indirect value to a quantifiable veracity assessment. The importance and usefulness of a veracity assessment depend largely on the specific context and the reason it is carried out. Factors such as the recipient of the veracity assessment, the scope of the assessment, and the metrics used to measure accuracy and performance all play a part in determining the value and perceived quality of the assessment.

Five peer-reviewed publications are included in this compilation thesis: two journal articles, two conference articles, and one workshop article.

The main contributions of the work presented in this dissertation are: i) a compilation of challenges related to manual methods of veracity assessment, ii) a plan for addressing the identified challenges, iii) identification of the research front and a gap analysis of veracity assessment of data from open sources, iv) a study of indicators such as the geolocation of topics and their tracking over time, as well as classification of individuals' reactions in social media posts, and v) an evaluation of feature types that affect the accuracy of automated veracity assessment applied to one type of deception: fake reviews.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2024. p. 71
Series
TRITA-EECS-AVL ; 2024:47
Keywords
Veracity assessment, natural language processing, machine learning, open-source data, Trovärdighetsbedömning, naturlig språkbehandling, maskininlärning, data från öppna källor
National Category
Software Engineering
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-346353 (URN)
978-91-8040-927-8 (ISBN)
Public defence
2024-06-03, https://kth-se.zoom.us/j/63226866138, Sal C, Kistagången 16, Stockholm, 13:30 (English)
Opponent
Supervisors
Note

QC 20240514

Available from: 2024-05-14 Created: 2024-05-13 Last updated: 2024-05-21 Bibliographically approved

Open Access in DiVA

fulltext (4764 kB), 400 downloads
File information
File name: FULLTEXT01.pdf
File size: 4764 kB
Checksum (SHA-512): 02c368f23d015dfcaa2887b0b23e42cab9e694aa79f9751dc382671a9fc527ff9e3b5e7463396ebd42eadfa0e50bf27dd1c0dcafa76e7bd75ad08d7516a00069
Type: fulltext
Mimetype: application/pdf

Authority records

García Lozano, Marianela; Brynielsson, Joel

Search in DiVA

By author/editor
García Lozano, Marianela; Schreiber, Jonah; Brynielsson, Joel