Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Stad: Stateful Diffusion for Linear Time Community Detection
KTH, School of Electrical Engineering and Computer Science (EECS), Software and Computer systems, SCS.ORCID iD: 0000-0002-0264-8762
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Community detection is one of the preeminent topics in network analysis. Communities in real-world networks vary in their characteristics, such as their internal cohesion and size. Despite a large variety of methods proposed to detect communities so far, most of existing approaches fall into the category of global approaches. Specifically, these global approaches adapt their detection model focusing on approximating the global structure of the whole network, instead of performing approximation at the communities level. Global techniques tune their parameters to "one size fits all" model, so they are quite successful with extracting communities in homogeneous cases but suffer in heterogeneous community size distributions.

In this paper, we present a stateful diffusion approach (Stad) for community detection that employs diffusion. Stad boosts diffusion with conductance-based function that acts like a tuning parameter to control the diffusion speed. In contrast to existing diffusion mechanisms which operate with global and fixed speed, Stad introduces stateful diffusion to treat every community individually. Particularly, Stad controls the diffusion speed at node level, such that each node determines the diffusion speed associated with every possible community membership independently. Thus, Stad is able to extract communities more accurately in heterogeneous cases by dropping "one size fits all" model. Furthermore, Stad employs a vertex-centric approach which is fully decentralized and highly scalable, and requires no global knowledge. So as, Stad can be successfully applied in distributed environments, such as large-scale graph processing or decentralized machine learning. The results with both real-world and synthetic datasets show that Stad outperforms the state-of-the-art techniques, not only in the community size scale issue but also by achieving higher accuracy that is twice the accuracy achieved by the state-of-the-art techniques.

National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-222283OAI: oai:DiVA.org:kth-222283DiVA, id: diva2:1180409
Note

QC 20180205

Available from: 2018-02-05 Created: 2018-02-05 Last updated: 2018-02-05Bibliographically approved
In thesis
1. Graph-based Analytics for Decentralized Online Social Networks
Open this publication in new window or tab >>Graph-based Analytics for Decentralized Online Social Networks
2018 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Decentralized Online Social Networks (DOSNs) have been introduced as a privacy preserving alternative to the existing online social networks.  DOSNs remove the dependency on a centralized provider and operate as distributed information management platforms. Current efforts of providing DOSNs are mainly focused on designing the required building blocks for managing the distributed network and supporting the social services (e.g., search, content delivery, etc.). However, there is a lack of reliable techniques for enabling complex analytical services (e.g., spam detection, identity validation, etc.) that comply with the decentralization requirements of DOSNs. In particular, there is a need for decentralized data analytic techniques and machine learning (ML) algorithms that can successfully run on top of DOSNs.

 

In this thesis, we empower decentralized analytics for DOSNs through a set of novel algorithms. Our algorithms allow decentralized analytics to effectively work on top of fully decentralized topology, when the data is fully distributed and nodes have access to their local knowledge only. Furthermore, our algorithms and methods are able to extract and exploit the latent patterns in the social user interaction networks and effectively combine them with the shared content, yielding significant improvements for the complex analytic tasks. We argue that, community identification is at the core of the learning and analytical services provided for DOSNs. We show in this thesis that knowledge on community structures and information dissemination patterns, embedded in the topology of social networks has a potential to greatly enhance data analytic insights and improve results. At the heart of this thesis lies a community detection technique that successfully extracts communities in a completely decentralized manner. In particular, we show that multiple complex analytic tasks, like spam detection and identity validation,  can be successfully tackled by harvesting the information from the social network structure. This is achieved by using decentralized community detection algorithm which acts as the main building block for the community-aware learning paradigm that we lay out in this thesis. To the best of our knowledge, this thesis represents the first attempt to bring complex analytical services, which require decentralized iterative computation over distributed data, to the domain of DOSNs. The experimental evaluation of our proposed algorithms using real-world datasets confirms the ability of our solutions to generate  efficient ML models in massively parallel and highly scalable manner.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2018. p. 41
Series
TRITA-EECS-AVL ; 2018:4
Keyword
Decentralized Community Detection, Community-aware Learning, Spam Detection, Identity Validation
National Category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-222228 (URN)978-91-7729-666-9 (ISBN)
Public defence
2018-03-09, sal C, Electrum building, Kistagången 16, STOCKHOLM, 09:00 (English)
Opponent
Supervisors
Note

QC 20180205

Available from: 2018-02-05 Created: 2018-02-02 Last updated: 2018-02-05Bibliographically approved

Open Access in DiVA

fulltext(4611 kB)7 downloads
File information
File name FULLTEXT01.pdfFile size 4611 kBChecksum SHA-512
f31217c8364e6753eb1456cabbfb0152258d7045f5dcda75e7912f800d85e1e34a3267614f1c2ef39d9b97b61fe15a66726c7089f1ab42088b01f1c417f90bae
Type fulltextMimetype application/pdf

Authority records BETA

Soliman, Amira

Search in DiVA

By author/editor
Soliman, Amira
By organisation
Software and Computer systems, SCS
Computer Systems

Search outside of DiVA

GoogleGoogle Scholar
Total: 7 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 23 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf