Change search
ReferencesLink to record
Permanent link

Direct link
Streaming Graph Partitioning: Degree Project in Distributed Computing at KTH Information and Communication Technology
KTH, School of Information and Communication Technology (ICT).
2016 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Graph partitioning is considered to be a standard solution to process huge graphs efficiently when processing them on a single machine becomes inefficient due to its limited computation power and storage space. In graph partitioning, the whole graph is divided among different computing nodes that process the graph in parallel. During the early stages of research done on graph partitioning, different offline partitioning methods were introduced; these methods create high computation cost as they process the whole graph prior to partitioning. Therefore, an online graph partitioning method called as streaming graph partitioning was introduced later to reduce the computation cost by assigning the edges or vertices on-the-fly to the computing nodes without processing the graph before partitioning. In our thesis, we presented an experimental study of different streaming graph partitioning methods that use two partitioning techniques: vertex partitioning and edge partitioning. Edge partitioning has proved good for partitioning highly skewed graphs. After implementing different partitioning methods, we have proposed a partitioning algorithm that uses degree information of the vertices. Furthermore, we measured the effect of different partitioning methods on the graph stream processing algorithms. Our results show that for vertex partitioning Fennel has performed better than Linear Greedy as it shows lower edge-cuts and better load balancing. Moreover, for edge partitioning, the Degree based partitioner has performed better than Least Cost Incremental and Least Cost Incremental Advanced in reducing the replication factor, but the Degree based partitioner does not do well in load balancing. In the end, we show that the custom partitioning methods, compared to default hash partitioning, save the memory space by reducing the size of aggregate states during execution of different graph processing algorithms on the resulting partitions. The Degree based partitioner performed well by reducing the size of aggregate states on average up to 50%. Other algorithms include: Fennel, Linear Greedy, Least Cost Incremental and Least Cost Incremental Advanced, they reduced the size of aggregate states on average up to 21%, 10%, 27% and 48%.

Place, publisher, year, edition, pages
2016. , 71 p.
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-190895OAI: diva2:953624
Subject / course
Computer Science
Educational program
Master of Science - Distributed Computing
Available from: 2016-08-18 Created: 2016-08-18 Last updated: 2016-08-24Bibliographically approved

Open Access in DiVA

fulltext(2005 kB)41 downloads
File information
File name FULLTEXT01.pdfFile size 2005 kBChecksum SHA-512
Type fulltextMimetype application/pdf

By organisation
School of Information and Communication Technology (ICT)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 41 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 588 hits
ReferencesLink to record
Permanent link

Direct link