PACMan: Coordinated memory caching for parallel jobs
2012 (English). In: Proceedings of NSDI 2012: 9th USENIX Symposium on Networked Systems Design and Implementation, USENIX Association, 2012, p. 267-280
Conference paper, Published paper (Refereed)
Abstract [en]
Data-intensive analytics on large clusters is important for modern Internet services. As machines in these clusters have large memories, in-memory caching of inputs is an effective way to speed up these analytics jobs. The key challenge, however, is that these jobs run multiple tasks in parallel, and a job is sped up only when the inputs of all such parallel tasks are cached. Indeed, a single task whose input is not cached can slow down the entire job. To meet this "all-or-nothing" property, we have built PACMan, a caching service that coordinates access to the distributed caches. This coordination is essential to improve job completion times and cluster efficiency. To this end, we have implemented two cache replacement policies on top of PACMan's coordinated infrastructure: LIFE, which minimizes average completion time by evicting large incomplete inputs, and LFU-F, which maximizes cluster efficiency by evicting less frequently accessed inputs. Evaluations on production workloads from Facebook and Microsoft Bing show that PACMan reduces average completion time of jobs by 56% and 51% (small interactive jobs improve by 77%), and improves efficiency of the cluster by 47% and 54%, respectively.
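The "all-or-nothing" property described in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: the function name `wave_completion_time` and the read times (10.0 for disk, 1.0 for memory) are assumptions chosen purely for illustration. It models one wave of tasks that all run in parallel, so the wave finishes only when its slowest task finishes.

```python
def wave_completion_time(cached, disk_time=10.0, memory_time=1.0):
    """Completion time of one wave of parallel tasks (toy model).

    `cached` holds one boolean per task: True if that task's input is in
    memory. The timings are hypothetical, not measurements from the paper.
    All tasks start together, so the wave ends when the slowest task ends.
    """
    return max(memory_time if hit else disk_time for hit in cached)


# Caching three of four inputs gives no speedup: the uncached task dominates.
none_cached = wave_completion_time([False, False, False, False])
mostly_cached = wave_completion_time([True, True, True, False])
all_cached = wave_completion_time([True, True, True, True])
```

Under these assumed timings, the wave is faster only in the last case, which is why a cache that makes eviction decisions per block, without coordinating across a job's inputs, can spend memory without improving completion time.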
Place, publisher, year, edition, pages
USENIX Association, 2012. p. 267-280
Keywords [en]
Systems analysis, All or nothings, Cache replacement policy, Caching services, Completion time, Distributed cache, Internet services, Large clusters, Production workloads, Efficiency
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:kth:diva-314740
Scopus ID: 2-s2.0-84919827070
OAI: oai:DiVA.org:kth-314740
DiVA, id: diva2:1675650
Conference
9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, 25 April 2012 through 27 April 2012, San Jose, California
Note
QC 20220623
Part of proceedings: ISBN 978-1-931971-92-8
2022-06-23, 2022-06-23, 2022-06-25
Bibliographically approved