Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Kademlia on the Open Internet: How to Achieve Sub-Second Lookups in a Multimillion-Node DHT Overlay
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastucture, Telecommunication Systems Laboratory, TSLab.
2011 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Distributed hash tables (DHTs) have gained much attention from the research community in the last years. Formal analysis and evaluations on simulators and small-scale deployments have shown good scalability and performance.

In stark contrast, performance measurements in large-scale DHT overlays on the Internet have yielded disappointing results, with lookup latencies measured in seconds. Others have attempted to improve lookup performance with very limited success, their lowest median lookup latency at over one second and a long tail of high-latency lookups.

In this thesis, the goal is to to enable large-scale DHT-based latency-sensitive applications on the Internet. In particular, we improve lookup latency in Mainline DHT, the largest DHT overlay on the open Internet, to identify and address practical issues on an existing system. Our approach is implementing and measuring backward-compatible modifications to facilitate their incremental adoption into Mainline DHT (and possibly other Kademlia-based overlays). Thus, enabling our research to have impact on a real-world system.

Our results close the performance gap between small- and large-scale DHT overlays. With a median lookup latency below 200 ms and a 99\superscript{th} percentile of just above 500 ms, our median lookup latency is one order of magnitude lower than the best performing measurement reported in the literature. Moreover, our results do not show a long tail of high-latency lookups, unlike previous reports.

We have achieved these results by studying how connectivity artifacts on the underlying network ---probably caused by firewalls and NAT devices on the Internet--- affect the DHT overlay. Our measurements of the connectivity of more than 3 million nodes reveal that connectivity artifacts are widespread and can severely degrade lookup performance.

Scalability and locality-awareness have also been explored in this thesis, where different mechanisms have been proposed. Some of the mechanisms are planned to be integrated into Mainline DHT in future work.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology , 2011. , x, 31 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 11:10
Keyword [en]
dht, p2p, distributed systems, kademlia
National Category
Communication Systems
Identifiers
URN: urn:nbn:se:kth:diva-46469ISBN: 978-91-7501-153-0 (print)OAI: oai:DiVA.org:kth-46469DiVA: diva2:457407
Presentation
2011-12-09, C2, Isafjordsgatan 22, Kista, 10:00 (English)
Opponent
Supervisors
Note
QC 20111118Available from: 2011-11-18 Created: 2011-11-03 Last updated: 2011-11-18Bibliographically approved
List of papers
1. CTracker: a Distributed BitTorrent Tracker Based on Chimera
Open this publication in new window or tab >>CTracker: a Distributed BitTorrent Tracker Based on Chimera
2008 (English)In: Collaboration and the Knowledge Economy: Issues, Applications, Case Studies / [ed] Paul Cunningham and Miriam Cunningham, Amsterdam: IOS Press , 2008, 941-947 p.Conference paper, Published paper (Refereed)
Abstract [en]

There are three major open issues in the BitTorrent peer discovery system, which are not solved by any of the currently deployed solutions. These issues seriously threaten BitTorrent's scalability, especially when considering that mainstream content distributors could start using BitTorrent for distributing content to millions of users simultaneously in the near future.

In this paper these issues are addressed by proposing a topology-aware distributed tracking system as a replacement for both centralized and Kademlia-based trackers.

An experiment measuring most popular open BitTorrent trackers is also presented. It shows that centralized trackers are not topology aware. We conclude that an ideal topology-aware tracker would return peers whose latency to the requester peer is significantly lower than of a centralized tracker.

Place, publisher, year, edition, pages
Amsterdam: IOS Press, 2008
Keyword
dht, p2p, bittorrent
National Category
Communication Systems Computer Systems
Identifiers
urn:nbn:se:kth:diva-46463 (URN)978-1-58603-924-0 (ISBN)
Conference
eChallenges 2008
Projects
P2P-Next
Funder
EU, FP7, Seventh Framework Programme, 21617
Note
QC 20111117Available from: 2011-11-17 Created: 2011-11-03 Last updated: 2013-12-03Bibliographically approved
2. Connectivity Properties of Mainline BitTorrent DHT Nodes
Open this publication in new window or tab >>Connectivity Properties of Mainline BitTorrent DHT Nodes
2009 (English)In: 2009 IEEE NINTH INTERNATIONAL CONFERENCE ON PEER-TO-PEER COMPUTING (P2P 2009), NEW YORK: IEEE , 2009, 262-270 p.Conference paper, Published paper (Refereed)
Abstract [en]

The birth and evolution of Feer-to-Peer (P2P) protocols have, for the most part, been about peer discovery. Napster, one of the first P2P protocols, was basically FTP/HTTP plus a way of finding hosts willing to send you the file. Since then, both the transfer and peer discovery mechanisms have improved, but only recently have we seen a real push to completely decentralized peer discovery to increase scalability and resilience. Most such efforts are based on Distributed Hash Tables (DHTs), with Kademlia being a popular choice of DHT implementation. While sound in theory, and performing well in simulators and testbeds, the real-world performance often falls short of expectations. Our hypothesis is that the connectivity artifacts caused by guarded hosts (i.e., hosts behind firewalls and NATs) are the major cause for such poor performance. In this paper, the first steps towards testing this hypothesis are developed. First, we present a taxonomy of connectivity properties which will become the language used to accurately describe connectivity artifacts. Second, based on experiments "in the wild", we analyze the connectivity properties of over 3 million hosts. Finally, we match those properties to guarded host behavior and identify the potential effects on the DHT

Place, publisher, year, edition, pages
NEW YORK: IEEE, 2009
National Category
Computer and Information Science
Identifiers
urn:nbn:se:kth:diva-30017 (URN)10.1109/P2P.2009.5284530 (DOI)000274540500041 ()2-s2.0-73549117904 (Scopus ID)978-1-4244-5066-4 (ISBN)
Conference
9th International Conference on Peer-to-Peer Computing Seattle, WA, SEP 09-11, 2009
Note

QC 20110221

Available from: 2011-02-21 Created: 2011-02-21 Last updated: 2013-12-04Bibliographically approved
3. Sub-Second Lookups on a Large-Scale Kademlia-Based Overlay
Open this publication in new window or tab >>Sub-Second Lookups on a Large-Scale Kademlia-Based Overlay
2011 (English)In: 11th IEEE International Conference on Peer-to-Peer Computing 2011 (P2P’11), IEEE , 2011, 82-91 p.Conference paper, Published paper (Refereed)
Abstract [en]

Previous studies of large-scale (multimillion node) Kademlia-based DHTs have shown poor performance, measured in seconds, in contrast to the far more optimistic results from theoretical analysis, simulations and testbeds. In this paper, we unexpectedly find that in the Mainline BitTorrent DHT (MDHT), probably the largest DHT overlay on the Internet, many lookups already yield results in less than a second, albeit not consistently. With the backwards-compatible modifications we present, we show that not only can we reduce median latencies to between 100 and 200 ms, but also consistently achieve sub-second lookups. These results suggest that it is possible to deploy latency-sensitive applications on top of large-scale DHT overlays on the Internet, contrary to what some might have concluded based on previous results reported in the literature.

Place, publisher, year, edition, pages
IEEE, 2011
Series
IEEE International Conference on Peer-to-Peer Computing, ISSN 2161-3567
Keyword
dht, kademlia, performance, large-scale
National Category
Computer and Information Science
Identifiers
urn:nbn:se:kth:diva-38313 (URN)10.1109/P2P.2011.6038665 (DOI)000298838500010 ()2-s2.0-80055024024 (Scopus ID)978-1-4577-0149-8 (ISBN)
Conference
11th IEEE International Conference on Peer-to-Peer Computing 2011 (P2P’11)
Funder
EU, FP7, Seventh Framework Programme, No. 216217
Note
QC 20110825Available from: 2011-10-13 Created: 2011-08-24 Last updated: 2013-12-03Bibliographically approved

Open Access in DiVA

fulltext(1690 kB)601 downloads
File information
File name FULLTEXT01.pdfFile size 1690 kBChecksum SHA-512
4cec74f7d8a174b229196736018d110495fe53b404e7f567082912c0d0e33c02c6f909c1f6aeb8dac2dec11c242e1fe077745e6502f36855eb15353aba76e0af
Type fulltextMimetype application/pdf

Other links

http://people.kth.se/~rauljc/lic/

Search in DiVA

By author/editor
Jimenez, Raúl
By organisation
Telecommunication Systems Laboratory, TSLab
Communication Systems

Search outside of DiVA

GoogleGoogle Scholar
Total: 601 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 316 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf