Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Persistence and Node FailureRecovery in Strongly Consistent Key-Value Datastore
KTH, School of Information and Communication Technology (ICT).
2012 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Consistency preservation of replicated data is a critical aspect for distributed databaseswhich are strongly consistent. Further, in fail-recovery model each process also needs todeal with the management of stable storage and amnesia [1]. CATS is a key/value datastore which combines the Distributed Hash Table (DHT) like scalability and selforganization and also provides atomic consistency of the replicated items. However beingan in memory data store with consistency and partition tolerance (CP), it suffers frompermanent unavailability in the event of majority failure.

The goals of this thesis were twofold (i) to implement disk persistent storage in CATS,which would allow the records and state of the nodes to be persisted on disk and (ii) todesign nodes failure recovery-algorithm for CATS which enable the system to run with theassumption of a Fail Recovery model without violating consistency.

For disk persistent storage two existing key/value databases LevelDB [2] and BerkleyDB[3] are used. LevelDB is an implementation of log structured merged trees [4] where asBerkleyDB is an implementation of log structured B+ trees [5]. Both have been used as anunderlying local storage for nodes and throughput and latency of the system with each isdiscussed. A technique to improve the performance by allowing concurrent operations onthe nodes is also discussed. The nodes failure-recovery algorithm is designed with a goalto allow the nodes to crash and then recover without violating consistency and also toreinstate availability once the majority of nodes recover. The recovery algorithm is based onpersisting the state variables of Paxos [6] acceptor and proposer and consistent groupmemberships.

For fault-tolerance and recovery, processes also need to copy records from the replicationgroup. This becomes problematic when the number of records and the amount of data ishuge. For this problem a technique for transferring key/value records in bulk is alsodescribed, and its effect on the latency and throughput of the system is discussed.

Place, publisher, year, edition, pages
2012. , 62 p.
Series
Trita-ICT-EX, 2012:175
Keyword [en]
Datastore, Distributed Hash Table (DHT), Atomic Shared Registers, Disk Storage, Consistency, Failure Recovery, Paxos, Group Membership
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-121281OAI: oai:DiVA.org:kth-121281DiVA: diva2:617853
Educational program
Master of Science - Software Engineering of Distributed Systems
Uppsok
Technology
Examiners
Available from: 2013-04-24 Created: 2013-04-24 Last updated: 2013-04-24Bibliographically approved

Open Access in DiVA

fulltext(1449 kB)307 downloads
File information
File name FULLTEXT01.pdfFile size 1449 kBChecksum SHA-512
6b1f29e8db56bf7d6e67f2ecfca420bf8e88e49a1cd2ab33c8a09700093d1ab2a7f2da56c919f79089d6a9489462521177cedba78e7a271982058b9b927f2032
Type fulltextMimetype application/pdf

By organisation
School of Information and Communication Technology (ICT)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 307 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 125 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf