KTHFS – A HIGHLY AVAILABLE AND SCALABLE FILE SYSTEM
KTH, School of Information and Communication Technology (ICT).
2013 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

KTHFS is a highly available and scalable file system built from version 0.24 of the Hadoop Distributed File System (HDFS). It provides a platform for overcoming limitations of existing distributed file systems, namely the scalability of the metadata server in terms of memory usage and throughput, and the availability of that server.

This document describes the KTHFS architecture and how it addresses these problems with a well-coordinated, distributed, stateless metadata server (in our case, Namenode) architecture, backed by a persistence layer such as NDB cluster. The primary focus is High Availability of the Namenode.

KTHFS achieves scalability and recovery by persisting the metadata to an NDB cluster. All Namenodes are connected to this cluster and are therefore aware of the state of the file system at any point in time.

In terms of High Availability, KTHFS provides a multi-Namenode architecture. Because these Namenodes are stateless and share a consistent view of the metadata, clients can issue requests to any of them. If one of these servers goes down, a client can retry its operation on the next available Namenode.
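The failover idea described above can be illustrated with a minimal sketch. It is not code from KTHFS or HDFS: the names `SharedMetadataStore`, `Namenode`, `NamenodeUnavailable` and `call_with_failover` are hypothetical, and an in-memory dictionary stands in for the NDB cluster. It only shows why stateless servers over a shared store let a client retry on the next server.

```python
from typing import Callable, Dict, List


class NamenodeUnavailable(Exception):
    """Raised by a (hypothetical) Namenode stub that cannot serve a request."""


class SharedMetadataStore:
    """Stand-in for the shared persistence layer (NDB cluster in the thesis)."""

    def __init__(self) -> None:
        self.paths: Dict[str, str] = {}


class Namenode:
    """Stateless server: it keeps no metadata of its own, only a store handle."""

    def __init__(self, name: str, store: SharedMetadataStore) -> None:
        self.name = name
        self.store = store
        self.alive = True

    def mkdir(self, path: str) -> str:
        if not self.alive:
            raise NamenodeUnavailable(self.name)
        self.store.paths[path] = "dir"  # metadata lives in the shared store
        return self.name                # report which server handled the call

    def exists(self, path: str) -> bool:
        if not self.alive:
            raise NamenodeUnavailable(self.name)
        return path in self.store.paths


def call_with_failover(namenodes: List[Namenode],
                       op: Callable[[Namenode], object]) -> object:
    """Try the operation on each Namenode in turn until one succeeds."""
    last_error = None
    for nn in namenodes:
        try:
            return op(nn)
        except NamenodeUnavailable as err:
            last_error = err  # this Namenode is down; move on to the next
    raise RuntimeError("no Namenode available") from last_error


if __name__ == "__main__":
    store = SharedMetadataStore()
    nn1, nn2 = Namenode("nn1", store), Namenode("nn2", store)
    call_with_failover([nn1, nn2], lambda nn: nn.mkdir("/data"))
    nn1.alive = False  # simulate a Namenode failure
    # nn2 sees the directory created through nn1 because the state is shared
    print(call_with_failover([nn1, nn2], lambda nn: nn.exists("/data")))
```

The sketch ignores questions a real system has to answer, such as whether a retried operation was already applied before the failure, and how clients learn which Namenodes are alive.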

We then evaluate KTHFS in terms of its metadata capacity for medium and large clusters and the throughput and high availability of the Namenode, and present an analysis of the underlying NDB cluster.

Finally, we conclude this document with a few words on the ongoing and future work in KTHFS.

Place, publisher, year, edition, pages
2013. 73 p.
Series
Trita-ICT-EX, 2013:30
Keyword [en]
Namenode, NDB cluster, MySQL cluster, KTHFS, HDFS, metadata, High Availability, Scalability, throughput
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-117918
OAI: oai:DiVA.org:kth-117918
DiVA: diva2:603878
Educational program
Master of Science - Software Engineering of Distributed Systems
Uppsok
Technology
Examiners
Available from: 2013-03-21. Created: 2013-02-07. Last updated: 2013-03-21. Bibliographically approved.

Open Access in DiVA

fulltext (1915 kB), 292 downloads
File information
File name: FULLTEXT01.pdf
File size: 1915 kB
Checksum (SHA-512): 991edb5c400023d012d76163643009661a5322dd81de1e7453fd8fccb5e4340471126e4a3a705ed1b2c0bfd9bcfdbc3a3a9a73998df19c75cd7ec7ac3e127636
Type: fulltext
Mimetype: application/pdf

Total: 236 hits