García Lozano, Marianela (ORCID iD: orcid.org/0000-0002-0408-1421)
Publications (7 of 7)
García Lozano, M. (2024). Toward automated veracity assessment of data from open sources using features and indicators. (Doctoral dissertation). Stockholm, Sweden: KTH Royal Institute of Technology
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation hypothesizes that the key to automated veracity assessment of data from open sources is the careful estimation and extraction of relevant features and indicators. These features and indicators provide added value to a quantifiable veracity assessment, either directly or indirectly. The importance and usefulness of a veracity assessment largely depend on the specific situation and the reason for which it is being conducted. Factors such as the recipient of the veracity assessment, the scope of the assessment, and the metrics used to measure accuracy and performance all play a role in determining the value and perceived quality of the assessment.

Five peer-reviewed publications are included in this compilation thesis: two journal articles, two conference articles, and one workshop article.

The main contributions of the work presented in this dissertation are: i) a compilation of challenges with manual methods of veracity assessment, ii) a road map for addressing the identified challenges, iii) identification of the state of the art and a gap analysis of veracity assessment of open-source data, iv) exploration of indicators such as topic geo-location tracking over time and stance classification, and v) evaluation of various feature types, model transferability, and style obfuscation attacks, and their impact on accuracy for automated veracity assessment of one type of deception: fake reviews.

Abstract [sv, translated to English]

This dissertation hypothesizes that the key to automated veracity assessment of data from open sources lies in the careful selection and estimation of relevant features and indicators. These features and indicators add direct or indirect value to a quantifiable veracity assessment. The importance and usefulness of a veracity assessment depend largely on the specific context and the reason it is carried out. Factors such as the recipient of the veracity assessment, the scope of the assessment, and the measures used to gauge accuracy and performance all help determine the value and perceived quality of the assessment.

Five peer-reviewed publications are included in this compilation thesis: two journal articles, two conference articles, and one workshop article.

The main contributions of the work presented in this dissertation are: i) a compilation of challenges related to manual methods of veracity assessment, ii) a plan for addressing the identified challenges, iii) identification of the research front and a gap analysis of veracity assessment of data from open sources, iv) a study of indicators such as geolocation of topics and tracking them over time, as well as classification of individuals' reactions in social media posts, and v) an evaluation of feature types that affect the accuracy of automated veracity assessment applied to one type of deception: fake reviews.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2024. p. 71
Series
TRITA-EECS-AVL ; 2024:47
Keywords
Veracity assessment, natural language processing, machine learning, open-source data
National Category
Software Engineering
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-346353 (URN), 978-91-8040-927-8 (ISBN)
Public defence
2024-06-03, https://kth-se.zoom.us/j/63226866138, Sal C, Kistagången 16, Stockholm, 13:30 (English)
Note

QC 20240514

Available from: 2024-05-14. Created: 2024-05-13. Last updated: 2024-05-21. Bibliographically approved.
Hansen, P., García Lozano, M., Kamrani, F. & Brynielsson, J. (2023). Real-time estimation of heart rate in situations characterized by dynamic illumination using remote photoplethysmography. In: Proceedings: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023. Paper presented at 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023, Vancouver, Canada, Jun 18 2023 - Jun 22 2023 (pp. 6094-6103). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: Proceedings: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 6094-6103. Conference paper, Published paper (Refereed)
Abstract [en]

Remote photoplethysmography (rPPG) is a technique that aims to remotely estimate the heart rate of an individual using an RGB camera. Although several studies use the rPPG methodology, it is usually applied in a controlled laboratory environment, where both the camera and the subject are static and the illumination is ideal for the task. However, applying rPPG in a real-life scenario is much more demanding, since dynamic illumination issues arise. The work presented in this paper introduces a framework to estimate the heart rate of an individual in real time using an RGB camera in a situation characterized by dynamic illumination. Such situations occur, for example, when either the camera or the subject is moving, and/or the face visibility is limited. The framework uses a face detection program to extract regions of interest on an individual's face. These regions are combined and constitute the input to a convolutional neural network, which is trained to estimate the heart rate in real time. The method is evaluated on three publicly available datasets and on an in-house dataset, collected specifically for this study, which includes motion and dynamic illumination. The method shows good performance on all four datasets, outperforming other methods.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-337848 (URN), 10.1109/CVPRW59228.2023.00649 (DOI), 2-s2.0-85170820700 (Scopus ID)
Conference
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023, Vancouver, Canada, Jun 18 2023 - Jun 22 2023
Note

Part of ISBN 9798350302493

QC 20231010

Available from: 2023-10-10. Created: 2023-10-10. Last updated: 2023-10-10. Bibliographically approved.
Garcia Lozano, M., Brynielsson, J., Franke, U., Rosell, M., Tjörnhammar, E., Varga, S. & Vlassov, V. (2020). Veracity assessment of online data. Decision Support Systems, 129, Article ID 113132.
2020 (English). In: Decision Support Systems, ISSN 0167-9236, E-ISSN 1873-5797, Vol. 129, article id 113132. Article in journal (Refereed), Published
Abstract [en]

Fake news, malicious rumors, fabricated reviews, and generated images and videos are today spread at an unprecedented rate, making manual assessment of data veracity for decision-making purposes a daunting task. Hence, it is urgent to explore possibilities for automatic veracity assessment. In this work we review the literature in search of methods and techniques representing the state of the art in computerized veracity assessment. We study what others have done within the area of veracity assessment, especially work targeting social media and open source data, to understand research trends and determine needs for future research. The most common veracity assessment method among the studied set of papers is text analysis using supervised learning. Regarding machine learning methods, much has happened in the last couple of years owing to advances in deep learning. However, very few papers make use of these advances. Also, the papers in general tend to have a narrow scope, as they focus on solving a small task with only one type of data from one main source. The overall veracity assessment problem is complex, requiring a combination of data sources, data types, indicators, and methods. Only a few papers take on such a broad scope, which demonstrates the relative immaturity of the veracity assessment domain.

Place, publisher, year, edition, pages
Elsevier, 2020
Keywords
Veracity assessment, Credibility, Data quality, Online data, Social media, Fake news
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-268789 (URN), 10.1016/j.dss.2019.113132 (DOI), 000510956500001 (), 2-s2.0-85076227196 (Scopus ID)
Note

QC 20200224

Available from: 2020-02-24. Created: 2020-02-24. Last updated: 2024-05-14. Bibliographically approved.
García Lozano, M. & Fernquist, J. (2019). Identifying deceptive reviews: Feature exploration, model transferability and classification attack. In: Proceedings of the 2019 European Intelligence and Security Informatics Conference, EISIC 2019. Paper presented at 2019 European Intelligence and Security Informatics Conference, EISIC 2019, 26 November 2019 through 27 November 2019 (pp. 109-116). Institute of Electrical and Electronics Engineers Inc.
2019 (English). In: Proceedings of the 2019 European Intelligence and Security Informatics Conference, EISIC 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 109-116. Conference paper, Published paper (Refereed)
Abstract [en]

The temptation to influence and sway public opinion most certainly increases with the growth of open online forums where anyone can anonymously express their views and opinions. Since online review sites are a popular venue for opinion-influencing attacks, there is a need to automatically identify deceptive posts. The main focus of this work is the automatic identification of deceptive reviews, both positively and negatively biased. With this objective, we build an SVM-based classification model for deceptive reviews and explore the performance impact of using different feature types (TF-IDF, word2vec, PCFG). Moreover, we study the transferability of trained classification models applied to review data sets for other types of products, and the classifier's robustness, i.e., the impact on accuracy, against attacks by stylometry obfuscation through machine translation. Our findings show that i) we achieve an accuracy of over 90% using different feature types, ii) the trained classification models do not perform well when applied to other data sets containing reviews of different products, and iii) machine translation only slightly impacts the results and cannot be used as a viable attack method.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019
Keywords
Classification, Deceptive, Fake, PCFG, SVM, Word2vec, Automation, Computational linguistics, Computer aided language translation, Social aspects, Support vector machines, Attack methods, Classification models, Feature types, Machine translations, Model transferabilities, Online reviews, Performance impact, Public opinions, Classification (of information)
National Category
Computer Sciences; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-285414 (URN), 10.1109/EISIC49498.2019.9108852 (DOI), 2-s2.0-85087087979 (Scopus ID)
Conference
2019 European Intelligence and Security Informatics Conference, EISIC 2019, 26 November 2019 through 27 November 2019
Note

QC 20201130

Part of ISBN 9781728167350

Available from: 2020-11-30. Created: 2020-11-30. Last updated: 2025-02-01. Bibliographically approved.
García Lozano, M., Lilja, H., Tjörnhammar, E. & Karasalo, M. (2017). Mama Edha at SemEval-2017 Task 8: Stance Classification with CNN and Rules. In: ACL 2017 - 11th International Workshop on Semantic Evaluations, SemEval 2017, Proceedings of the Workshop. Paper presented at 11th International Workshop on Semantic Evaluations, SemEval 2017, co-located with the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, Aug 3 2017 - Aug 4 2017 (pp. 481-485). Association for Computational Linguistics (ACL)
2017 (English). In: ACL 2017 - 11th International Workshop on Semantic Evaluations, SemEval 2017, Proceedings of the Workshop, Association for Computational Linguistics (ACL), 2017, p. 481-485. Conference paper, Published paper (Refereed)
Abstract [en]

For the competition SemEval-2017 we investigated the possibility of performing stance classification (support, deny, query or comment) for messages in Twitter conversation threads related to rumours. Stance classification is interesting since it can provide a basis for rumour veracity assessment. Our ensemble classification approach of combining convolutional neural networks with both automatic rule mining and manually written rules achieved a final accuracy of 74.9% on the competition's test data set for Task 8A. To improve classification we also experimented with data relabeling and using the grammatical structure of the tweet contents for classification.

Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2017
National Category
Computer Sciences; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-332056 (URN), 2-s2.0-85097656375 (Scopus ID)
Conference
11th International Workshop on Semantic Evaluations, SemEval 2017, co-located with the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, Aug 3 2017 - Aug 4 2017
Note

Part of ISBN 9781945626555

QC 20230719

Available from: 2023-07-19. Created: 2023-07-19. Last updated: 2025-02-01. Bibliographically approved.
García Lozano, M., Schreiber, J. & Brynielsson, J. (2017). Tracking Geographical Locations using a Geo-Aware Topic Model for Analyzing Social Media Data. Decision Support Systems, 99(SI), 18-29
2017 (English). In: Decision Support Systems, ISSN 0167-9236, E-ISSN 1873-5797, Vol. 99, no SI, p. 18-29. Article in journal (Refereed), Published
Abstract [en]

Tracking how discussion topics evolve in social media and where these topics are discussed geographically over time has the potential to provide useful information for many different purposes. In crisis management, knowing a specific topic’s current geographical location could provide vital information about where resources should be allocated, or even which ones. This paper describes an attempt to track online discussions geographically over time. A distributed geo-aware streaming latent Dirichlet allocation model was developed for the purpose of recognizing topics’ locations in unstructured text. To evaluate the model, it was implemented and used for automatic discovery and geographical tracking of election topics during parts of the 2016 American presidential primary elections. The results showed that the locations correlated with the actual election locations, and that the model provides better geolocation classification than a keyword-based approach.

Place, publisher, year, edition, pages
Elsevier, 2017
Keywords
Social media, Topic modeling, Geo-awareness, Trend analysis, Latent Dirichlet allocation, Streaming media
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-210570 (URN), 10.1016/j.dss.2017.05.006 (DOI), 000405162500003 (), 2-s2.0-85020801622 (Scopus ID)
Funder
EU, FP7, Seventh Framework Programme, 312649
Note

QC 20170704

Available from: 2017-07-02. Created: 2017-07-02. Last updated: 2024-05-14. Bibliographically approved.
Garcia Lozano, M., Franke, U., Rosell, M. & Vlassov, V. (2015). Towards Automatic Veracity Assessment of Open Source Information. In: 2015 IEEE International Congress on Big Data (BigData Congress). Paper presented at Big Data (BigData Congress), 2015 IEEE International Congress on, New York, USA, June 27 - July 2, 2015 (pp. 199-206). IEEE Computer Society
2015 (English). In: 2015 IEEE International Congress on Big Data (BigData Congress), IEEE Computer Society, 2015, p. 199-206. Conference paper, Published paper (Refereed)
Abstract [en]

Intelligence analysis is dependent on veracity assessment of Open Source Information (OSINF), which includes assessment of the reliability of sources and the credibility of information. Traditionally, OSINF veracity assessment is done manually by intelligence analysts, but the large volumes, high velocity, and variety of the data make it infeasible to continue doing so and call for automation. Based on meetings, interviews, and questionnaires with military personnel, and on analysis of related work and the state of the art, we identify the challenges and propose an approach and a corresponding framework for automated veracity assessment of OSINF. The framework provides a basis for new tools that will give intelligence analysts the ability to automatically or semi-automatically assess the veracity of larger amounts of data in a shorter amount of time. Instead of spending their time working with irrelevant, ambiguous, contradicting, biased, or plain wrong data, they can spend more time on analysis.

Place, publisher, year, edition, pages
IEEE Computer Society, 2015
Keywords
Big Data, public domain software, OSINF, automatic data veracity assessment, intelligence analysis, open source information, Automation, Interviews, Probabilistic logic, Reliability, Semantics, Twitter, NATO STANAG 2511, data veracity, reliability and credibility, trust, veracity assessment
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-179402 (URN), 10.1109/BigDataCongress.2015.36 (DOI), 000380443700026 (), 2-s2.0-84959501022 (Scopus ID), 978-1-4673-7277-0 (ISBN)
Conference
Big Data (BigData Congress), 2015 IEEE International Congress on, New York, USA, June 27 - July 2, 2015.
Note

QC 20151217

Available from: 2015-12-16. Created: 2015-12-16. Last updated: 2024-05-14. Bibliographically approved.