kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Distributed Hierarchical File Systems strike back in the Cloud
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Logical Clocks AB.ORCID iD: 0000-0002-6578-3902
Show others and affiliations
2020 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Cloud service providers have aligned on availability zones as an important unit of failure and replication for storage systems. An availability zone (AZ) has independent power, networking, and cooling systems and consists of one or more data centers. Multiple AZs in close geographic proximity form a region that can support replicated low latency storage services that can survive the failure of one or more AZs. Recent reductions in inter-AZ latency have made synchronous replication protocols increasingly viable, instead of traditional quorum-based replication protocols. We introduce HopsFS-CL, a distributed hierarchical file system with support for high- availability (HA) across AZs, backed by AZ-aware synchronously replicated metadata and AZ-aware block replication. HopsFS-CL is a redesign of HopsFS, a version of HDFS with distributed metadata, and its design involved making replication protocols and block placement protocols AZ-aware at all layers of its stack: the metadata serving, the metadata storage, and block storage layers. In experiments on a real-world workload from Spotify, we show that HopsFS-CL, deployed in HA mode over 3 AZs, reaches 1.66 million ops/s, and has similar performance to HopsFS when deployed in a single AZ, while preserving the same semantics.

Place, publisher, year, edition, pages
2020.
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-280786OAI: oai:DiVA.org:kth-280786DiVA, id: diva2:1467134
Conference
40th IEEE International Conference on Distributed Computing Systems, November 29 - December 1, 2020, Singapore
Note

QC 20210120

Available from: 2020-09-14 Created: 2020-09-14 Last updated: 2022-06-25Bibliographically approved
In thesis
1. Distributed File System Metadata and its Applications
Open this publication in new window or tab >>Distributed File System Metadata and its Applications
2020 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Distributed hierarchical file systems typically decouple the storage and serving of the file metadata from the file contents (file system blocks) to enable the file system to scale to store more data and support higher throughput. We designed HopsFS to take the scalability of the file system one step further by also decoupling the storage and serving of the file system metadata. HopsFS is an open-source, next- generation distribution of the Apache Hadoop Distributed File System (HDFS) that replaces the main scalability bottleneck in HDFS, the single-node in-memory metadata service, with a distributed metadata service built on a NewSQL database (NDB). HopsFS stores the file system’s metadata fully normalized in NDB, then it uses locking primitives and application-defined locks to ensure strongly consistent metadata.In this thesis, we leverage the consistent distributed hierarchical file system meta- data provided by HopsFS to efficiently build new classes of applications that are tightly coupled with the file system as well as to improve the internal file system operations. First, we introduce hbr, a new block reporting protocol for HopsFS that removes a scalability bottleneck that prevented HopsFS from scaling to tens of thousands of servers. Second, we introduce HopsFS-CL, a highly available cloud-native distribution of HopsFS that deploys the file system across Availability Zones in the cloud while maintaining the same file system semantics. Third, we introduce HopsFS-S3, a highly available cloud-native distribution of HopsFS that uses object stores as a backend for the block storage layer in the cloud while again maintaining the same file system semantics. Fourth, we introduce ePipe, a databus that both creates a consistent change stream for HopsFS and eventually delivers the correctly ordered stream with low latency to downstream clients. That is, ePipe extends HopsFS with a change-data-capture (CDC) API that provides not only efficient file system notifications but also enables polyglot storage for file system metadata. Polyglot storage enables us to offload metadata queries to a more appropriate engine - we use Elasticsearch to provide a free-text search of the file system namespace to demonstrate this capability. Finally, we introduce Hopsworks, a scalable, project-based multi-tenant big data platform that provides support for collaborative development and operations for teams through extended metadata.

Abstract [sv]

Distribuerade hierarkiska filsystem kopplar vanligtvis bort lagring och hanteringen av filens metadata från filens innehåll (filsystemets block) för att göra det möjligt för filsystemet att skala bättre för att lagra mer data och stödja högre genomströmn- ing. Vi utformade HopsFS för att ta skalbarheten i filsystemet ett steg längre genom att även koppla bort lagring och hantering av filsystemets metadata. HopsFS är en öppen källkod, nästa generations distribution av Apache Hadoop Distribuerade Filsystem (HDFS) som ersätter den huvudsakliga skalbarhetsflaskhalsen i HDFS, en nod som lagrar all metadata i minnet, med en distribuerad metadatatjänst byggd på en NewSQL-databas (NDB). HopsFS lagrar filsystemets metadata fullt normaliserat i NDB, och använder sedan låsande primitiver och applikations- definierade lås för att säkerställa starkt konsistent metadata. I denna avhandling använder vi den konsistenta distribuerade hierarkiska filsys- temmetadata som tillhandahålls av HopsFS för att effektivt bygga nya klasser av applikationer som är tätt kopplade till filsystemet samt för att förbättra filsystemets interna funktioner. Först introducerar vi hbr, ett nytt blockrapporteringsprotokoll för HopsFS som tar bort en skalbarhetsflaskhals som hindrade HopsFS från att skalas till tiotusentals servrar. För det andra introducerar vi HopsFS-CL, en mycket tillgänglig molnbaserad distribution av HopsFS som distribuerar filsystemet över tillgänglighetszoner i molnet samtidigt som samma filsystemsemantik bibehålls. För det tredje introducerar vi HopsFS-S3, en mycket tillgänglig molnbaserad distribution av HopsFS som använder objektlagring som en backend för block- lagringslagret i molnet samtidigt som samma filsystemsemantik bibehålls. För det fjärde introducerar vi ePipe, en databus som båda skapar en konsistent förän- dringsström för HopsFS och så levererar korrekt beställd ström med låg latens till nedströmsklienter. Det vill säga ePipe utökar HopsFS med ett CDC-API (Change- data-capture) som inte bara ger effektiva filsystemmeddelanden utan också möjlig- gör polyglot-lagring för filsystemets metadata. Med polyglot-lagring kan vi avlasta metadatafrågor till en mer lämplig sökmotor - vi använder Elasticsearch för att tillhandahålla en fritext-sökning i filsystemets namnområde för att visa denna förmåga. Slutligen introducerar vi Hopsworks, en skalbar, projektbaserad big data-plattform om stödjer flera användare och ger stöd för samarbetsutveckling och drift för team med hjälp av utökad metadata.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2020
Series
TRITA-EECS-AVL ; 2020:65
National Category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-285872 (URN)978-91-7873-702-4 (ISBN)
Public defence
2020-12-18, https://kth-se.zoom.us/webinar/register/WN_rs7eHk0eQT2KTqefxHYcZw, Sal C, Electrum, Kistagången 16, Stockholm, 09:00 (English)
Opponent
Supervisors
Note

QC 20201111

Available from: 2020-11-11 Created: 2020-11-11 Last updated: 2022-06-25Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

https://icdcs2020.sg/index.htmlhttps://www.youtube.com/watch?v=4Wv2bQFI1k8&feature=youtu.be

Authority records

Ismail, MahmoudHaridi, SeifDowling, Jim

Search in DiVA

By author/editor
Ismail, MahmoudHaridi, SeifDowling, Jim
By organisation
Software and Computer systems, SCS
Computer Systems

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 206 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf