Asymmetry in Large-Scale Graph Analysis, Explained
2014 (English)In: Proceedings of the Second International Workshop on Graph Data ManagementExperience and Systems (GRADES 2014), June 22, 2014, Snowbird, Utah, USA., 2014Conference paper (Refereed)
Iterative computations are in the core of large-scale graph processing. In these applications, a set of parameters is continuously refined, until a fixed point is reached. Such fixed point iterations often exhibit non-uniform computational behavior, where changes propagate with different speeds throughout the parameter set, making them active or inactive during iterations. This asymmetrical behavior can lead to a many redundant computations, if not exploited. Many specialized graph processing systems and APIs exist that run iterative algorithms efficiently exploiting this asymmetry. However, their functionality is sometimes vaguely defined and due to their different programming models and terminology used, it is often challenging to derive equivalence between them. We describe an optimization framework for iterative graph processing, which utilizes dataset dependencies. We explain several optimization techniques that exploit asymmetrical behavior of graph algorithms. We formally specify the conditions under which, an algorithm can use a certain technique. We also design template execution plans, using a canonical set of dataflow operators and we evaluate them using real-world datasets and applications. Our experiments show that optimized plans can significantly reduce execution time, often by an order of magnitude. Based on our experiments, we identify a trade-off that can be easily captured and could serve as the basis for automatic optimization of large-scale graph-processing applications.
Place, publisher, year, edition, pages
Engineering and Technology Computer Systems
IdentifiersURN: urn:nbn:se:kth:diva-145335DOI: 10.11452621934.2621940ScopusID: 2-s2.0-84905640971ISBN: 978-145032982-8OAI: oai:DiVA.org:kth-145335DiVA: diva2:717733
2nd International Workshop on Graph Data Management Experiences and Systems, GRADES 2014 - Co-located with SIGMOD/PODS 2014; Snowbird, UT, United States, 22 June 2014 - 27 June 2014
QC 201503092014-05-162014-05-162015-03-09Bibliographically approved