Scaling YARN: A Distributed Resource Manager for Hadoop
KTH, School of Information and Communication Technology (ICT).
2014 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

In recent years, there has been a growing need for computer systems capable of handling unprecedented amounts of data. To this end, Hadoop HDFS and Hadoop YARN have become the de facto standard for meeting demanding storage requirements and for managing the applications that process this data. Although YARN is a major advance over its predecessor MapReduce in terms of scalability and fault tolerance, its Resource Manager component, which performs resource allocation, is a potential single point of failure and a performance bottleneck because of its centralized architecture. This thesis presents a novel architecture in which the Resource Manager runs on a distributed network of stateless commodity machines, with its state migrated to MySQL Cluster, a relational, write-scalable and highly available in-memory database. As a result, the Resource Manager becomes more scalable, since it can now run on multiple nodes, and more fault-tolerant, since arbitrary node failures no longer result in state loss. In this work we implemented the proposed architecture for the Resource Tracker service, which performs cluster node management for the Resource Manager. Experimental results validate the correctness of our proposal, demonstrate that it scales by adding stateless Resource Manager machines, and evaluate its performance in terms of request throughput, system resource utilization and database utilization.
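The central idea of the abstract, request handlers that keep no state of their own and read and write everything through a shared store, can be illustrated with a minimal sketch. This is not the thesis implementation: `SharedStateStore` is a hypothetical in-process stand-in for the external database (MySQL Cluster in the thesis), and the class and method names `StatelessResourceTracker`, `register_node` and `node_heartbeat` are assumptions chosen for illustration only.

```python
import threading


class SharedStateStore:
    """Hypothetical stand-in for the external database; not a MySQL Cluster client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = {}  # node_id -> number of heartbeats recorded

    def register(self, node_id):
        with self._lock:
            self._nodes.setdefault(node_id, 0)

    def record_heartbeat(self, node_id):
        with self._lock:
            if node_id not in self._nodes:
                return False  # unknown node: it would have to register first
            self._nodes[node_id] += 1
            return True

    def heartbeat_count(self, node_id):
        with self._lock:
            return self._nodes.get(node_id)


class StatelessResourceTracker:
    """Holds only a reference to the store; all node state lives in the store."""

    def __init__(self, store):
        self._store = store

    def register_node(self, node_id):
        self._store.register(node_id)

    def node_heartbeat(self, node_id):
        return self._store.record_heartbeat(node_id)


store = SharedStateStore()
tracker_a = StatelessResourceTracker(store)
tracker_b = StatelessResourceTracker(store)

tracker_a.register_node("node-1")
tracker_a.node_heartbeat("node-1")
del tracker_a  # simulate the loss of one tracker machine
accepted = tracker_b.node_heartbeat("node-1")  # another instance serves the node
```

Because neither tracker instance caches node state, discarding `tracker_a` loses nothing, and any remaining instance can serve the next request from the same node. In a real deployment the store would be a networked database and its failure modes would need separate treatment.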

Place, publisher, year, edition, pages
2014. 82 p.
TRITA-ICT-EX, 2014:97
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-177200
OAI: diva2:871976
Available from: 2015-12-08. Created: 2015-11-17. Last updated: 2015-12-08. Bibliographically approved.

Open Access in DiVA

No full text
