Graph-based Analytics for Decentralized Online Social Networks
Soliman, Amira
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-0264-8762
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Decentralized Online Social Networks (DOSNs) have been introduced as a privacy-preserving alternative to existing online social networks. DOSNs remove the dependency on a centralized provider and operate as distributed information management platforms. Current efforts to provide DOSNs focus mainly on designing the building blocks required for managing the distributed network and supporting social services (e.g., search and content delivery). However, there is a lack of reliable techniques for enabling complex analytical services (e.g., spam detection and identity validation) that comply with the decentralization requirements of DOSNs. In particular, there is a need for decentralized data analytic techniques and machine learning (ML) algorithms that can successfully run on top of DOSNs.

In this thesis, we enable decentralized analytics for DOSNs through a set of novel algorithms. Our algorithms allow decentralized analytics to work effectively on top of a fully decentralized topology, where the data is fully distributed and nodes have access only to their local knowledge. Furthermore, our algorithms and methods extract and exploit the latent patterns in social user interaction networks and combine them effectively with the shared content, yielding significant improvements for complex analytic tasks. We argue that community identification is at the core of the learning and analytical services provided for DOSNs. We show that knowledge of community structures and information dissemination patterns, embedded in the topology of social networks, has the potential to greatly enhance data analytic insights and improve results. At the heart of this thesis lies a community detection technique that extracts communities in a completely decentralized manner. In particular, we show that multiple complex analytic tasks, such as spam detection and identity validation, can be successfully tackled by harvesting information from the social network structure. This is achieved by using a decentralized community detection algorithm, which acts as the main building block for the community-aware learning paradigm that we lay out in this thesis. To the best of our knowledge, this thesis represents the first attempt to bring complex analytical services, which require decentralized iterative computation over distributed data, to the domain of DOSNs. The experimental evaluation of our proposed algorithms using real-world datasets confirms the ability of our solutions to generate efficient ML models in a massively parallel and highly scalable manner.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2018. p. 41
Series
TRITA-EECS-AVL ; 2018:4
Keywords [en]
Decentralized Community Detection, Community-aware Learning, Spam Detection, Identity Validation
National Category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
URN: urn:nbn:se:kth:diva-222228
ISBN: 978-91-7729-666-9 (print)
OAI: oai:DiVA.org:kth-222228
DiVA, id: diva2:1179959
Public defence
2018-03-09, sal C, Electrum building, Kistagången 16, STOCKHOLM, 09:00 (English)
Opponent
Supervisors
Note

QC 20180205

Available from: 2018-02-05 Created: 2018-02-02 Last updated: 2022-06-26 Bibliographically approved
List of papers
1. Stad: Stateful Diffusion for Linear Time Community Detection
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Community detection is one of the preeminent topics in network analysis. Communities in real-world networks vary in their characteristics, such as their internal cohesion and size. Despite the large variety of community detection methods proposed so far, most existing approaches fall into the category of global approaches. Specifically, these global approaches adapt their detection model to approximate the global structure of the whole network, instead of performing the approximation at the community level. Global techniques tune their parameters to a "one size fits all" model, so they are quite successful at extracting communities in homogeneous cases but suffer when community sizes are heterogeneously distributed.

In this paper, we present a stateful diffusion approach (Stad) for community detection. Stad boosts diffusion with a conductance-based function that acts as a tuning parameter to control the diffusion speed. In contrast to existing diffusion mechanisms, which operate with a global and fixed speed, Stad introduces stateful diffusion to treat every community individually. In particular, Stad controls the diffusion speed at the node level, such that each node independently determines the diffusion speed associated with every possible community membership. Thus, Stad is able to extract communities more accurately in heterogeneous cases by dropping the "one size fits all" model. Furthermore, Stad employs a vertex-centric approach that is fully decentralized and highly scalable, and requires no global knowledge. As a result, Stad can be successfully applied in distributed environments, such as large-scale graph processing or decentralized machine learning. Results on both real-world and synthetic datasets show that Stad outperforms state-of-the-art techniques, not only in handling the community size scale issue but also in accuracy, which is twice that achieved by the state-of-the-art techniques.
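
To make the notion of conductance concrete, the following minimal Python sketch computes the standard graph conductance of a node set: the number of edges leaving the set divided by the smaller of the two edge volumes. This is only the textbook definition; the abstract does not specify how Stad's conductance-based function is built from it, and the `conductance` helper, the `Graph` alias, and the toy graph are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set

# Undirected graph as adjacency sets (illustrative representation).
Graph = Dict[Hashable, Set[Hashable]]


def conductance(graph: Graph, community: Set[Hashable]) -> float:
    """Cut edges divided by min(volume inside, volume outside)."""
    cut = 0      # edges with exactly one endpoint in the community
    vol_in = 0   # sum of degrees of nodes inside the community
    vol_out = 0  # sum of degrees of nodes outside the community
    for node, neighbours in graph.items():
        if node in community:
            vol_in += len(neighbours)
            cut += sum(1 for n in neighbours if n not in community)
        else:
            vol_out += len(neighbours)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Toy graph: two triangles joined by a single bridge edge (2-3).
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(toy, {0, 1, 2}))  # one triangle: a well-separated set
print(conductance(toy, {0, 1}))     # part of a triangle: a poorly separated set
```

A lower value indicates a better-separated set, which is why conductance is a natural quantity for a node to consult when deciding how quickly to diffuse a given community label.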

National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-222283 (URN)
Note

QC 20180205

Available from: 2018-02-05 Created: 2018-02-05 Last updated: 2022-06-26 Bibliographically approved
2. Adagraph: Adaptive graph-based algorithms for spam detection in social networks
2017 (English). In: 5th International Conference on Networked Systems, NETYS 2017, Springer Verlag, 2017, p. 338-354. Conference paper, Published paper (Refereed)
Abstract [en]

In past years, researchers have developed approaches to detect spam in Online Social Networks (OSNs), such as URL blacklisting, spam traps, and even crowdsourcing for manual classification. Although previous work has shown the effectiveness of using statistical learning to detect spam, existing work employs supervised schemes that require labeled training data. In addition to the heavy training cost, it is difficult to obtain a comprehensive source of ground truth for measurement. In contrast to existing work, in this paper we present AdaGraph, a novel graph-based approach for spam detection. AdaGraph is unsupervised, hence it reduces the need for labeled training data and the associated training cost. In particular, AdaGraph effectively detects spam in large-scale OSNs by analyzing user behaviors using a graph clustering technique. Moreover, AdaGraph continuously updates the detected communities to keep up with users' dynamic interactions and activities. Extensive experiments using Twitter datasets show that AdaGraph detects spam with 92.3% accuracy. Furthermore, the false positive rate of AdaGraph is less than 0.3%, which is less than half of the rate achieved by state-of-the-art approaches.
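
For readers less familiar with the two metrics quoted above, this short Python sketch shows how accuracy and false positive rate are conventionally derived from confusion-matrix counts. The counts are hypothetical and do not reproduce the paper's experiments; the function names are assumptions for illustration.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all accounts (spam and legitimate) that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of legitimate accounts that were wrongly flagged as spam."""
    return fp / (fp + tn)


# Hypothetical counts, chosen only to illustrate the arithmetic.
tp, tn, fp, fn = 90, 900, 3, 7
print(accuracy(tp, tn, fp, fn))
print(false_positive_rate(fp, tn))
```

The two figures answer different questions: accuracy summarizes all decisions, while the false positive rate isolates the cost borne by legitimate users.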

Place, publisher, year, edition, pages
Springer Verlag, 2017
Keywords
Community detection, Distributed systems, Evolving graphs algorithms, Social networks, Unsupervised spam detection, Behavioral research, Graphic methods, Evolving graphs, Graph-based algorithms, Labeled training data, Online social networks (OSNs), Spam detection, State-of-the-art approach, Social networking (online)
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-216569 (URN)
10.1007/978-3-319-59647-1_25 (DOI)
2-s2.0-85019722417 (Scopus ID)
9783319596464 (ISBN)
Conference
17 May 2017 through 19 May 2017
Note

QC 20171101

Available from: 2017-11-01 Created: 2017-11-01 Last updated: 2022-06-26 Bibliographically approved
3. DLSAS: Distributed Large-Scale Anti-Spam Framework for Decentralized Online Social Networks
2016 (English). In: 2016 IEEE 2nd International Conference on Collaboration and Internet Computing (IEEE CIC), IEEE Press, 2016, p. 363-372. Conference paper, Published paper (Refereed)
Abstract [en]

In the last decade, researchers and the open source community have proposed various Decentralized Online Social Networks (DOSNs) that remove the dependency on centralized online social network providers in order to preserve user privacy. However, transitioning from a centralized to a decentralized environment creates a new set of problems, such as adversarial manipulation. In this paper, we present DLSAS, a novel unsupervised and decentralized anti-spam framework for DOSNs. DLSAS provides decentralized spam detection that is resilient to adversarial attacks. DLSAS typifies massively parallel frameworks and exploits fully decentralized learning and cooperative approaches. Furthermore, DLSAS provides a novel defense mechanism that prevents malicious nodes from participating in the system: it creates a validation overlay to assess the credibility of the information exchanged among the participating nodes and excludes misbehaving nodes from the system. Extensive experiments using Twitter datasets confirm not only DLSAS's capability to detect spam with higher accuracy than state-of-the-art approaches, but also its robustness against different adversarial attacks.

Place, publisher, year, edition, pages
IEEE Press, 2016
Keywords
Decentralized Online Social Networks, Spam Detection, Distributed Systems, System Integrity and Robustness.
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-213462 (URN)
10.1109/CIC.2016.055 (DOI)
000393501100041 ()
2-s2.0-85013213049 (Scopus ID)
978-1-5090-4607-2 (ISBN)
Conference
2nd IEEE International Conference on Collaboration and Internet Computing (IEEE CIC), NOV 01-03, 2016, Pittsburgh, PA
Note

QC 20170314

Available from: 2017-08-31 Created: 2017-08-31 Last updated: 2024-03-15 Bibliographically approved
4. DIVa: Decentralized Identity Validation for Social Networks
2015 (English). In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2015), Association for Computing Machinery (ACM), 2015, p. 383-391. Conference paper, Published paper (Refereed)
Abstract [en]

Online Social Networks (OSNs) use a lightweight process to identify their users so as to facilitate fast adoption. However, this convenience comes at the price of exposing legitimate users to various threats created by fake accounts. Therefore, there is a crucial need to empower users with tools that help them assign a level of trust to whomever they interact with. To address this issue, in this paper we introduce a novel model, DIVa, that leverages mining techniques to find correlations among user profile attributes. These correlations are discovered not from the user population as a whole, but from individual communities, where the correlations are more pronounced. DIVa exploits a decentralized learning approach and ensures privacy preservation, as each node in the OSN independently processes its local data and is required to know only its direct neighbors. Extensive experiments using real-world OSN datasets show that DIVa is able to extract fine-grained community-aware correlations among profile attributes, with average improvements of up to 50% over the global approach.
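
The claim that correlations are more pronounced inside communities than across the whole population can be illustrated with a small Python sketch. It uses rule confidence, a common measure of how strongly one attribute value implies another; the abstract does not state that DIVa uses this exact measure, and the `confidence` helper, attribute names, and toy profiles are all hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, str]
Item = Tuple[str, str]  # (attribute, value)


def confidence(profiles: List[Profile], antecedent: Item, consequent: Item) -> float:
    """Fraction of profiles matching `antecedent` that also match `consequent`."""
    a_key, a_val = antecedent
    c_key, c_val = consequent
    matching = [p for p in profiles if p.get(a_key) == a_val]
    if not matching:
        return 0.0
    return sum(1 for p in matching if p.get(c_key) == c_val) / len(matching)


# Hypothetical toy data: two communities with different attribute patterns.
communities = {
    "A": [{"employer": "uni", "city": "north"}] * 4,
    "B": [{"employer": "uni", "city": "south"}] * 1
         + [{"employer": "lab", "city": "south"}] * 3,
}
rule = (("employer", "uni"), ("city", "north"))

everyone = [p for members in communities.values() for p in members]
global_conf = confidence(everyone, *rule)
local_conf = {name: confidence(members, *rule) for name, members in communities.items()}
print(global_conf)  # blended signal over the whole population
print(local_conf)   # sharper, community-specific signals
```

In this toy setting the rule holds for every matching profile in community A and for none in community B, while the population-wide figure blurs the two, which is the effect the community-aware approach is meant to avoid.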

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2015
Keywords
Community-aware Identity Validation, Ensemble Learning, Privacy-preserving Learning, Decentralized Online Social Networks
National Category
Communication Studies
Identifiers
urn:nbn:se:kth:diva-185412 (URN)
10.1145/2808797.2808861 (DOI)
000371793500054 ()
2-s2.0-84962492143 (Scopus ID)
978-1-4503-3854-7 (ISBN)
Conference
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), AUG 25-28, 2015, Paris, FRANCE
Note

QC 20160421

Available from: 2016-04-21 Created: 2016-04-18 Last updated: 2024-03-18 Bibliographically approved
5. CADIVa: Cooperative and Adaptive Decentralized Identity Validation Model for Social Networks
2016 (English). In: Social Network Analysis and Mining, ISSN 1869-5450, E-ISSN 1869-5469, Vol. 6, no 1, article id UNSP 36. Article in journal (Refereed). Published
Abstract [en]

Online social networks (OSNs) have successfully changed the way people interact. Online interactions among people span geographical boundaries and interweave with different human life activities. However, current OSN identification schemes lack guarantees on quantifying the trustworthiness of the online identities of the users joining them. Therefore, driven by the need to empower users with an identity validation scheme, we introduce a novel model, cooperative and adaptive decentralized identity validation (CADIVa), that allows OSN users to assign trust levels to whomever they interact with. CADIVa uses an association rule mining approach to extract the identity correlations among profile attributes in every individual community in a social network. CADIVa is a fully decentralized and adaptive model that exploits decentralized learning and cooperative approaches not only to preserve users' privacy, but also to increase system reliability and make the system resilient to mono-failure. CADIVa follows the ensemble learning paradigm to preserve users' privacy and employs gossip protocols to achieve efficient and low-overhead communication. We provide two different implementation scenarios of CADIVa. Results confirm CADIVa's ability to provide fine-grained community-aware identity validation, with average improvements of up to 36% and 50% compared to the semi-centralized and global approaches, respectively.
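
As a rough illustration of why gossip protocols suit this setting, the Python sketch below implements generic pairwise gossip averaging: peers repeatedly exchange values with one randomly chosen partner and both keep the mean, so every node approaches the global average without any coordinator. The abstract does not describe CADIVa's actual protocol; the `gossip_average` function, the fully connected peer selection, and the sample values are assumptions.

```python
import random
from typing import List


def gossip_average(values: List[float], rounds: int, seed: int = 0) -> List[float]:
    """Pairwise averaging: each node in turn averages its value with one random peer."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    state = list(values)
    n = len(state)
    if n < 2:
        return state  # nothing to exchange with
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # shift so the chosen peer is never node i itself
            mean = (state[i] + state[j]) / 2.0
            state[i] = state[j] = mean  # both peers adopt the pairwise mean
    return state


# Hypothetical local estimates held by four nodes.
local_estimates = [0.0, 10.0, 20.0, 30.0]
converged = gossip_average(local_estimates, rounds=50)
print(converged)  # every entry ends up close to the global mean of the inputs
```

Each exchange involves only two nodes and preserves the sum of all values, which is what keeps the communication overhead low while still letting local knowledge spread through the network.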

Place, publisher, year, edition, pages
Springer, 2016
Keywords
Identity validation, Online social networks, Distributed systems, Privacy preservation, Decentralized online social networks
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-193150 (URN)
10.1007/s13278-016-0343-z (DOI)
000381220500036 ()
2-s2.0-84976332626 (Scopus ID)
Projects
iSocial
Note

QC 20161003

Available from: 2016-10-03 Created: 2016-09-30 Last updated: 2024-03-15 Bibliographically approved

Open Access in DiVA

fulltext (1280 kB), 1785 downloads
File information
File name: FULLTEXT01.pdf
File size: 1280 kB
Checksum (SHA-512): 10e8c3f530a7fc6895a64e2b7ff8c9deb3edd7f87494d083361dacb4b9610df3bc9cc198110b0c4463b9537ed2a2803f875b7bd3e2b2386b3751a2e7383e3a17
Type: fulltext
Mimetype: application/pdf

Authority records

Soliman, Amira

Total: 1790 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 2941 hits