Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Streaming Graph Partitioning: An Experimental Study
KTH, School of Electrical Engineering and Computer Science (EECS), Software and Computer systems, SCS.ORCID iD: 0000-0001-6171-9586
Systems Group, ETH, Zurich, Switzerland.
KTH, School of Electrical Engineering and Computer Science (EECS), Software and Computer systems, SCS.
KTH, School of Electrical Engineering and Computer Science (EECS), Software and Computer systems, SCS.ORCID iD: 0000-0002-6779-7435
2018 (English)In: Proceedings of the VLDB Endowment, ISSN 2150-8097, E-ISSN 2150-8097, Vol. 11, no 11, p. 1590-1603Article in journal (Refereed) Published
Abstract [en]

Graph partitioning is an essential yet challenging task for massive graph analysis in distributed computing. Common graph partitioning methods scan the complete graph to obtain structural characteristics offline, before partitioning. However, the emerging need for low-latency, continuous graph analysis led to the development of online partitioning methods. Online methods ingest edges or vertices as a stream, making partitioning decisions on the fly based on partial knowledge of the graph. Prior studies have compared offline graph partitioning techniques across different systems. Yet, little effort has been put into investigating the characteristics of online graph partitioning strategies.

In this work, we describe and categorize online graph partitioning techniques based on their assumptions, objectives and costs. Furthermore, we employ an experimental comparison across different applications and datasets, using a unified distributed runtime based on Apache Flink. Our experimental results showcase that model-dependent online partitioning techniques such as low-cut algorithms offer better performance for communication-intensive applications such as bulk synchronous iterative algorithms, albeit higher partitioning costs. Otherwise, model-agnostic techniques trade off data locality for lower partitioning costs and balanced workloads which is beneficial when executing data-parallel single-pass graph algorithms.

Place, publisher, year, edition, pages
ACM Digital Library, 2018. Vol. 11, no 11, p. 1590-1603
Keywords [en]
graph partitioning, streaming graph processing, stream processing
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-236053DOI: 10.14778/3236187.3236208ISI: 000452537300022Scopus ID: 2-s2.0-85058876764OAI: oai:DiVA.org:kth-236053DiVA, id: diva2:1255661
Note

QC 20181015

Available from: 2018-10-14 Created: 2018-10-14 Last updated: 2019-01-04Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Search in DiVA

By author/editor
Abbas, ZainabCarbone, ParisVlassov, Vladimir
By organisation
Software and Computer systems, SCS
In the same journal
Proceedings of the VLDB Endowment
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 697 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf