Persistence and Node FailureRecovery in Strongly Consistent Key-Value Datastore
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Consistency preservation of replicated data is a critical aspect for distributed databaseswhich are strongly consistent. Further, in fail-recovery model each process also needs todeal with the management of stable storage and amnesia . CATS is a key/value datastore which combines the Distributed Hash Table (DHT) like scalability and selforganization and also provides atomic consistency of the replicated items. However beingan in memory data store with consistency and partition tolerance (CP), it suffers frompermanent unavailability in the event of majority failure.
The goals of this thesis were twofold (i) to implement disk persistent storage in CATS,which would allow the records and state of the nodes to be persisted on disk and (ii) todesign nodes failure recovery-algorithm for CATS which enable the system to run with theassumption of a Fail Recovery model without violating consistency.
For disk persistent storage two existing key/value databases LevelDB  and BerkleyDB are used. LevelDB is an implementation of log structured merged trees  where asBerkleyDB is an implementation of log structured B+ trees . Both have been used as anunderlying local storage for nodes and throughput and latency of the system with each isdiscussed. A technique to improve the performance by allowing concurrent operations onthe nodes is also discussed. The nodes failure-recovery algorithm is designed with a goalto allow the nodes to crash and then recover without violating consistency and also toreinstate availability once the majority of nodes recover. The recovery algorithm is based onpersisting the state variables of Paxos  acceptor and proposer and consistent groupmemberships.
For fault-tolerance and recovery, processes also need to copy records from the replicationgroup. This becomes problematic when the number of records and the amount of data ishuge. For this problem a technique for transferring key/value records in bulk is alsodescribed, and its effect on the latency and throughput of the system is discussed.
Place, publisher, year, edition, pages
2012. , 62 p.
Datastore, Distributed Hash Table (DHT), Atomic Shared Registers, Disk Storage, Consistency, Failure Recovery, Paxos, Group Membership
Engineering and Technology
IdentifiersURN: urn:nbn:se:kth:diva-121281OAI: oai:DiVA.org:kth-121281DiVA: diva2:617853
Master of Science - Software Engineering of Distributed Systems
Dowling, Jim, Associate professor