Recent years have seen advances in building large Internet-scale index structures, generally known as structured overlays. Early structured overlays realized distributed hash tables (DHTs) which are ill suited for anything but exact queries. The need to support range queries necessitates systems that can handle uneven load distributions. However such systems suffer from practical problems including poor latency, disproportionate bandwidth usage at participating peers, or unrealistic assumptions on peers' homogeneity, in terms of available storage or bandwidth resources. In this article we consider a system that is not only able to support uneven load distributions but also to operate in heterogeneous environments, where each peer can autonomously decide how much of its resources to contribute to the system. We provide the theoretical foundations of realizing such a network and present a newly proposed system Oscar based on these principles. Oscar can construct efficient overlays given arbitrary load distributions by employing a novel scalable network sampling technique. The simulations of our system validate the theory and evaluate Oscar's performance under typical challenges, encountered in real-life large-scale networked systems, including participant heterogeneity, faults, and skewed and dynamic load-distributions. Thus the Oscar distributed index fills in an important gap in the family of structured overlays, bringing into life a practical Internet-scale index, which can play a crucial role in enabling data-oriented applications distributed over wide-area networks.
2010. Vol. 5, no 1