kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
GCNSplit: Bounding the State of Streaming Graph Partitioning
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0001-6171-9586
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0002-8573-0090
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0002-9351-8508
Show others and affiliations
2022 (English)In: Proceedings of the 5th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2022 - In conjunction with the 2022 ACM SIGMOD/PODS Conference, Association for Computing Machinery, Inc , 2022, article id 3Conference paper, Published paper (Refereed)
Abstract [en]

This paper introduces GCNSplit, a streaming graph partitioning framework capable of handling unbounded streams with bounded state requirements. We frame partitioning as a classification problem and we employ an unsupervised model whose loss function minimizes edge-cuts. GCNSplit leverages an inductive graph convolutional network (GCN) to embed graph characteristics into a low-dimensional space and assign edges to partitions in an online manner. We evaluate GCNSplit with real-world graph datasets of various sizes and domains. Our results demonstrate that GCNSplit provides high-throughput, top-quality partitioning, and successfully leverages data parallelism. It achieves a throughput of 430K edges/s on a real-world graph of 1.6B edges using a bounded 147KB-sized model, contrary to the state-of-the-art HDRF algorithm that requires > 116GB in-memory state. With a well-balanced normalized load of 1.01, GCNSplit achieves a replication factor on par with HDRF, showcasing high partitioning quality while storing three orders of magnitude smaller partitioning state. Owing to the power of GCNs, we show that GCNSplit can generalize to entirely unseen graphs while outperforming the state-of-the-art stream partitioners in some cases.

Place, publisher, year, edition, pages
Association for Computing Machinery, Inc , 2022. article id 3
Keywords [en]
data streams, graph neural networks, graph partitioning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-317564DOI: 10.1145/3533702.3534920Scopus ID: 2-s2.0-85137089721OAI: oai:DiVA.org:kth-317564DiVA, id: diva2:1695576
Conference
5th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2022 - In conjunction with the 2022 ACM SIGMOD/PODS Conference, 17 June 2022, Philadelphia, USA
Note

QC 20220914

Part of proceedings: ISBN 978-145039377-5

Available from: 2022-09-14 Created: 2022-09-14 Last updated: 2022-09-14Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records

Zwolak, MichalAbbas, ZainabHorchidan, Sonia-FlorinaCarbone, Paris

Search in DiVA

By author/editor
Zwolak, MichalAbbas, ZainabHorchidan, Sonia-FlorinaCarbone, Paris
By organisation
School of Electrical Engineering and Computer Science (EECS)Software and Computer systems, SCS
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 122 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf