Community detection via random and adaptive sampling
KTH, School of Electrical Engineering (EES), Automatic Control. INRIA, France.
2014 (English). In: Journal of machine learning research, ISSN 1532-4435, E-ISSN 1533-7928, Vol. 35, 138-175 p. Article in journal (Refereed), Published
Abstract [en]

In this paper, we consider networks consisting of a finite number of non-overlapping communities. To extract these communities, the interaction between pairs of nodes may be sampled from a large available data set, which allows a given node pair to be sampled several times. When a node pair is sampled, the observed outcome is a binary random variable, equal to 1 if the nodes interact and to 0 otherwise. The outcome is more likely to be positive if the nodes belong to the same community. For a given budget of node pair samples or observations, we wish to jointly design a sampling strategy (the sequence of sampled node pairs) and a clustering algorithm that recover the hidden communities with the highest possible accuracy. We consider both non-adaptive and adaptive sampling strategies, and for both classes of strategies, we derive fundamental performance limits satisfied by any sampling and clustering algorithm. In particular, we provide necessary conditions for the existence of algorithms recovering the communities accurately as the network size grows large. We also devise simple algorithms that accurately reconstruct the communities when this is at all possible, hence proving that the proposed necessary conditions for accurate community detection are also sufficient. The classical problem of community detection in the stochastic block model can be seen as a particular instance of the problems considered here, but our framework covers more general scenarios where the sequence of sampled node pairs can be designed in an adaptive manner. The paper provides new results for the stochastic block model, and extends the analysis to the case of adaptive sampling.
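
The observation model described in the abstract can be illustrated with a minimal Python sketch. It covers only the non-adaptive case, with node pairs drawn uniformly at random under a fixed budget, and it is not the paper's algorithm. The function name `sample_pair_observations` and the parameters `p` and `q` (the probabilities of a positive outcome within and across communities) are assumptions introduced here for illustration; the abstract states only that the outcome is binary and more likely to be positive within a community.

```python
import random


def sample_pair_observations(labels, budget, p, q, seed=0):
    """Draw `budget` node-pair observations uniformly at random.

    labels[i] is the hidden community of node i. A sampled pair yields 1
    with probability p if both nodes share a community and with
    probability q otherwise (p and q are illustrative assumptions).
    The same pair may be drawn more than once.
    """
    rng = random.Random(seed)
    n = len(labels)
    observations = []
    for _ in range(budget):
        i, j = rng.sample(range(n), 2)  # two distinct nodes
        prob = p if labels[i] == labels[j] else q
        outcome = 1 if rng.random() < prob else 0
        observations.append((i, j, outcome))
    return observations


# Hypothetical example: two equal communities of 50 nodes each.
labels = [0] * 50 + [1] * 50
obs = sample_pair_observations(labels, budget=5000, p=0.6, q=0.1)
```

A clustering algorithm would then receive only `obs`, never `labels`, and attempt to recover the communities. An adaptive strategy would differ in that each new pair is chosen from the outcomes observed so far.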

Place, publisher, year, edition, pages
2014. Vol. 35, 138-175 p.
Keyword [en]
Algorithms, Budget control, Population dynamics, Stochastic models, Stochastic systems, Adaptive sampling strategies, Binary random variables, Classical problems, Community detection, Fundamental performance limits, Overlapping communities, Sampling strategies, Stochastic block models, Clustering algorithms
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:kth:diva-175660
Scopus ID: 2-s2.0-84939637729
OAI: oai:DiVA.org:kth-175660
DiVA: diva2:862370
Conference
13 June 2014 through 15 June 2014
Note

QC 20151021

Available from: 2015-10-21 Created: 2015-10-19 Last updated: 2017-12-01 Bibliographically approved

By author/editor
Proutiere, Alexandre