Change search
ReferencesLink to record
Permanent link

Direct link
A Gossiping Protocol for Detecting Global Threshold Crossings
KTH, School of Electrical Engineering (EES), Communication Networks. (ACCESS Linnaeus Centre)
KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS. (ACCESS Linnaeus Centre)ORCID iD: 0000-0001-5432-6442
KTH, School of Electrical Engineering (EES), Communication Networks. (ACCESS Linnaeus Centre)
2010 (English)In: IEEE Transactions on Network and Service Management (TNSM), Vol. 7, no 1, 42-57 p.Article in journal (Refereed) Published
Abstract [en]

We investigate the use of gossip protocols for the detection of network-wide threshold crossings. Our design goals are low protocol overhead, small detection delay, low probability of false positives and negatives, scalability, robustness to node failures and controllability of the trade-off between overhead and detection delay. Based on push-synopses, a gossip protocol introduced by Kempe et al., we present a protocol that indicates whether a global aggregate of static local values is above or below a given threshold. For this protocol, we prove correctness and show that it converges to a state with no overhead when the aggregate is sufficiently far from the threshold. Then, we introduce an extension we call TG-GAP, a protocol that (1) executes in a dynamic network environment where local values change and (2) implements hysteresis behavior with upper and lower thresholds. Key elements of its design are the construction of snapshots of the global aggregate for threshold detection and a mechanism for synchronizing local states, both of which are realized through the underlying gossip protocol. Simulation studies suggest that TG-GAP is efficient in that the protocol overhead is minimal when the aggregate is sufficiently far from the threshold, that its overhead and the detection delay are largely independent on the system size, and that the tradeoff between overhead and detection quality can be effectively controlled. Lastly, we perform a comparative evaluation of TG-GAP against a tree-based protocol. We conclude that, for detecting global threshold crossings in the type of scenarios investigated, the tree-based protocol incurs a significantly lower overhead and a smaller detection delay than a gossip protocol such as TG-GAP.

Place, publisher, year, edition, pages
IEEE , 2010. Vol. 7, no 1, 42-57 p.
Keyword [en]
Distributed monitoring, threshold detection, gossip protocol
National Category
Computer Systems Communication Systems Telecommunications
URN: urn:nbn:se:kth:diva-86051DOI: 10.1109/TNSM.2010.I9P0329ScopusID: 2-s2.0-77249131803OAI: diva2:500379
© 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. QC 20120215Available from: 2012-02-15 Created: 2012-02-13 Last updated: 2012-03-22Bibliographically approved
In thesis
1. Distributed Monitoring and Resource Management for Large Cloud Environments
Open this publication in new window or tab >>Distributed Monitoring and Resource Management for Large Cloud Environments
2010 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the last decade, the number, size and complexity of large-scale networked systems has been growing fast, and this trend is expected to accelerate. The best known example of a large-scale networked system is probably the Internet, while large datacenters for cloud services are the most recent ones. In such environments, a key challenge is to develop scalable and adaptive technologies for management functions. This thesis addresses the challenge by engineering several protocols  for distributed monitoring and resource management that are suitable for large-scale networked systems. First, we present G-GAP, a gossip-based protocol we developed for continuous monitoring of aggregates that are computed from device variables. We prove the robustness of this protocol to node failures and validate, through simulations, that its estimation accuracy does not change with increasing size of the monitored system under certain conditions. Second, we present TCA-GAP, a tree-based protocol, and TG-GAP, a gossip-based protocol for the purpose of monitoring threshold crossings of aggregates. For both protocols, we prove correctness properties and show, again through simulations, that both protocols are efficient, by showing that their overhead is at least two orders of magnitude smaller than that of a na\"ive approach, for cases where the monitored aggregate is sufficiently far from the threshold. Third, we present a gossip-based protocol for resource management in cloud environments. The protocol allocates CPU and memory resources to sites that are hosted by the cloud. We prove that the resource allocation computed by the protocol converges exponentially fast to an optimal allocation, for cases where sufficient memory is available. Through simulations, we show that the quality of the resource allocation approaches that of an ideal system when the total memory demand decreases significantly below the memory capacity of the entire system. In addition, we validate that the quality of the allocation does not change with increasing the number of hosted sites and machines, for the case where both metrics are scaled proportionally. Finally, we compare two approaches (tree-based and gossip-based) to engineering protocols for distributed management, for the case of real-time monitoring. Results of our simulation studies indicate that, regardless of the system size and failure rates in the monitored system, gossip protocols incur a significantly larger overhead than tree-based protocols for achieving the same monitoring quality (e.g., estimation accuracy or detection delay).

Place, publisher, year, edition, pages
Stockholm: KTH, 2010. vi, 26 p.
Trita-EE, ISSN 1653-5146 ; 2010:051
decentralized management, engineering protocols, distributed monitoring, resource management
National Category
Telecommunications Computer Science
urn:nbn:se:kth:diva-26207 (URN)978-91-7415-794-9 (ISBN)
Public defence
2010-12-10, Q2, Osquldas väg 10, plan 2, KTH, Stockholm, 14:00 (English)
QC 20101124Available from: 2010-11-24 Created: 2010-11-21 Last updated: 2012-03-22Bibliographically approved

Open Access in DiVA

fulltext(498 kB)321 downloads
File information
File name FULLTEXT01.pdfFile size 498 kBChecksum SHA-512
Type fulltextMimetype application/pdf

Other links

Publisher's full textScopusIEEEXplore

Search in DiVA

By author/editor
Wuhib, FetahiDam, MadsStadler, Rolf
By organisation
Communication NetworksTheoretical Computer Science, TCS
Computer SystemsCommunication SystemsTelecommunications

Search outside of DiVA

GoogleGoogle Scholar
Total: 321 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Altmetric score

Total: 55 hits
ReferencesLink to record
Permanent link

Direct link