Real-Time Monitoring of Global Variables in Large-Scale Dynamic Systems
KTH, School of Electrical Engineering (EES), Communication Networks.
2007 (English) Licentiate thesis, comprehensive summary (Other scientific)
Abstract [en]

Large-scale dynamic systems, such as the Internet, as well as emerging peer-to-peer networks and computational grids, require a high level of awareness of the system state in real time for proper and reliable operation. A key challenge is to develop monitoring functions that are efficient, scalable, robust and controllable. The thesis addresses this challenge by focusing on engineering protocols for distributed monitoring of global state variables. The global variables are network-wide aggregates, computed from local device variables using aggregation functions such as SUM, MAX and AVERAGE. Furthermore, it addresses the problem of detecting threshold crossings of such aggregates. The design goals for the protocols are efficiency, quality, scalability, robustness and controllability. The work presented in this thesis has resulted in two novel protocols: a gossip-based protocol for continuous monitoring of aggregates called G-GAP, and a tree-based protocol for detecting threshold crossings of aggregates called TCA-GAP. The protocols have been evaluated against the design goals through three complementary evaluation methods: theoretical analysis, simulation study and testbed implementation.

Place, publisher, year, edition, pages
Stockholm: KTH, 2007, p. 107
Series
Trita-EE, ISSN 1653-5146 ; 2007:065
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-4646
ISBN: 978-91-7178-774-3 (print)
OAI: oai:DiVA.org:kth-4646
DiVA, id: diva2:13225
Presentation
2007-12-04, Q22, KTH, Osquldas väg 6, Stockholm, 10:00
Opponent
Supervisors
Note
QC 20101122
Available from: 2008-02-27 Created: 2008-02-27 Last updated: 2022-09-13 Bibliographically approved
List of papers
1. Decentralized computation of threshold crossing alerts
2005 (English) In: IFIP/IEEE International Workshop on Distributed Systems: Operations and Management, Berlin: Springer-Verlag, 2005, Vol. LNCS 3775, p. 220-232 Conference paper, Published paper (Refereed)
Abstract [en]

Threshold crossing alerts (TCAs) indicate to a management system that a management variable, associated with the state, performance or health of the network, has crossed a certain threshold. The timely detection of TCAs is essential to proactive management. This paper focuses on detecting TCAs for network-level variables, which are computed from device-level variables using aggregation functions, such as SUM, MAX, or AVERAGE. It introduces TCA-GAP, a novel protocol for producing network-wide TCAs in a scalable and robust manner. The protocol maintains a spanning tree and uses local thresholds, which adapt to changes in network state and topology by allowing nodes to trade unused “threshold space”. Scalability is achieved through computing the thresholds locally and through distributing the aggregation process across all nodes. Fault tolerance is achieved by a mechanism that reconstructs the spanning tree after node addition, removal or failure. Simulation results on an ISP topology show that the protocol successfully concentrates traffic overhead in periods where the aggregate is close to the given threshold.
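The idea of combining tree-based aggregation with local thresholds can be illustrated with a minimal Python sketch for the SUM aggregate. This is not the TCA-GAP protocol itself: the `Node` class, the equal-share rule in `split_threshold`, and the function names are assumptions made for illustration. The sketch relies only on the fact that if every local value stays at or below its local threshold, and the local thresholds sum to at most the global threshold, then the global SUM cannot have crossed the global threshold.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node in a hypothetical aggregation tree holding one local value."""
    value: float
    children: List["Node"] = field(default_factory=list)


def subtree_sum(node: Node) -> float:
    """Partial SUM aggregate of the subtree rooted at `node`."""
    return node.value + sum(subtree_sum(child) for child in node.children)


def split_threshold(values: List[float], global_threshold: float) -> List[float]:
    """Assumed rule: each node gets its current value plus an equal share
    of the unused slack, so the local thresholds sum to the global one."""
    slack = global_threshold - sum(values)
    share = slack / len(values)
    return [v + share for v in values]


def locally_quiet(values: List[float], local_thresholds: List[float]) -> bool:
    """True if no node exceeds its local threshold, in which case the
    global SUM is guaranteed to be at or below the global threshold."""
    return all(v <= t for v, t in zip(values, local_thresholds))
```

When `locally_quiet` returns `False`, the global threshold has not necessarily been crossed; in a scheme of this kind, that is the point where nodes would need to exchange messages to redistribute unused threshold space.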

Place, publisher, year, edition, pages
Berlin: Springer-Verlag, 2005
Series
LECTURE NOTES IN COMPUTER SCIENCE, ISSN 0302-9743 ; 3775
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-26202 (URN)
10.1007/11568285_19 (DOI)
000233789600019 ()
2-s2.0-33646749501 (Scopus ID)
3-540-29388-4 (ISBN)
Conference
16th IFIP/IEEE International Workshop on Distributed Systems - Operations and Management (DSOM) Barcelona, SPAIN, OCT 24-26, 2005
Note
QC 20101122
Available from: 2010-11-21 Created: 2010-11-21 Last updated: 2022-09-13 Bibliographically approved
2. Implementation and Evaluation of a Protocol for Detecting Network-Wide Threshold Crossing Alerts
2006 (English) In: E2EMON 06: 4th IEEE/IFIP Workshop on End-to-End Monitoring Techniques and Services - REAL-TIME MONITORING OF INTERNET PATHS / [ed] AlShaer E, Pras A, Brownlee N, New York: IEEE, 2006, Vol. 4, p. 42-49 Conference paper, Published paper (Refereed)
Abstract [en]

Threshold crossing alerts (TCAs) indicate to a management system that a management variable, associated with the state, performance or health of the network, has crossed a certain threshold. In this paper, we report on implementing and evaluating TCA-GAP, a distributed protocol for detecting network-wide TCAs, which reports threshold crossings on aggregates, such as SUM, AVERAGE, or MAX of device counters. We present a concept for assessing the quality of detecting network-wide TCAs, which we apply to evaluate TCA-GAP on a lab testbed. First, we evaluate the correctness of the protocol by determining the correctly detected threshold crossings, the false positives and the false negatives. Second, for the correctly detected threshold crossings, we measure the delays between the time a crossing was reported by the protocol and the time of its actual occurrence. Finally, we demonstrate that the fundamental tradeoff between the quality of TCA detection and the management overhead can be controlled in TCA-GAP by modifying the maximum message rate on the management overlay.
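One way to picture the three quality measures named above (correct detections, false positives, false negatives) together with the detection delay is the following Python sketch. The matching rule is an assumption made for illustration, not the paper's definition: a report counts as a correct detection if it arrives no earlier than the actual crossing and at most `max_delay` time units after it, and each report can be matched only once.

```python
from typing import Dict, List


def assess_detection(actual: List[int], reported: List[int], max_delay: int) -> Dict:
    """Match actual crossing times to reported alert times (assumed rule:
    first unused report within [t, t + max_delay] matches crossing t)."""
    reported = sorted(reported)
    used = [False] * len(reported)
    delays = []
    false_negatives = 0
    for t in sorted(actual):
        for i, r in enumerate(reported):
            if not used[i] and t <= r <= t + max_delay:
                used[i] = True
                delays.append(r - t)  # detection delay for this crossing
                break
        else:
            false_negatives += 1  # no report matched this crossing
    return {
        "detected": len(delays),
        "false_positives": used.count(False),  # reports with no crossing
        "false_negatives": false_negatives,
        "delays": delays,
    }
```

For example, with actual crossings at times 10, 50 and 90, reports at 12, 30 and 93, and `max_delay=5`, the sketch counts two correct detections with delays 2 and 3, one false positive (the report at 30) and one false negative (the crossing at 50).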

Place, publisher, year, edition, pages
New York: IEEE, 2006
Keywords
threshold detection, decentralized management, distributed aggregation
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-26204 (URN)
10.1109/E2EMON.2006.1651278 (DOI)
000238288300006 ()
2-s2.0-33947642088 (Scopus ID)
1-4244-0145-3 (ISBN)
Conference
4th IEEE/IFIP Workshop on End-to-End Monitoring Techniques and Services Vancouver, CANADA, APR 03, 2006
Note
QC 20101122
Available from: 2010-11-21 Created: 2010-11-21 Last updated: 2022-09-13 Bibliographically approved
3. Decentralized service-level monitoring using network threshold crossing alerts
2006 (English) In: IEEE Communications Magazine, ISSN 0163-6804, E-ISSN 1558-1896, Vol. 44, no 10, p. 70-76 Article in journal (Refereed) Published
Abstract [en]

Service level agreements are at the core of the business relationship between service providers and their customers. Service level monitoring is necessary to validate and ensure that service levels are indeed adhered to. One key tool for this is threshold crossing alerts. TCAs can notify a service provider that a certain parameter has exceeded a certain threshold value, directing attention to those areas where preventive action needs to be taken. This article presents a protocol and an architecture that implements a new category of TCAs for parameters that need to be aggregated across a network, as opposed to parameters that can be observed from a single device; for example, "average utilization across all links in a network" as opposed to "utilization of a particular link." Although highly useful, such parameters are rarely subjected to SLAs today, due to the lack of effective monitoring technology. Our system fills this gap. It does so in a manner that is decentralized inside the network and does not rely on a more traditional centralized management architecture. We focus on business application aspects, as well as the robustness and accuracy properties of our system.

National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-16042 (URN)
10.1109/MCOM.2006.1710415 (DOI)
000241126900006 ()
2-s2.0-33750599056 (Scopus ID)
Note
QC 20100525
Available from: 2010-08-05 Created: 2010-08-05 Last updated: 2022-09-13 Bibliographically approved
4. Robust Monitoring of Network-wide Aggregates through Gossiping
2007 (English) In: IFIP/IEEE International Symposium on Integrated Network Management (IM 2007): VOLS 1 AND 2, New York: IEEE, 2007, p. 226-235 Conference paper, Published paper (Refereed)
Abstract [en]

We examine the use of gossip protocols for continuous monitoring of network-wide aggregates. Aggregates are computed from local management variables using functions such as AVERAGE, MIN, MAX, or SUM. A particular challenge is to develop a gossip-based aggregation protocol that is robust against node failures. In this paper, we present G-GAP, a gossip protocol for continuous monitoring of aggregates, which is robust against discontiguous failures (i.e., under the constraint that neighboring nodes do not fail within a short period of each other). We formally prove this property, and we evaluate the protocol through simulation using real traces. The simulation results suggest that the design goals for this protocol have been met. For instance, the tradeoff between estimation accuracy and protocol overhead can be controlled, and a high estimation accuracy (an error below about 5% in our measurements) is achieved by the protocol, even for large networks and frequent node failures. Further, we perform a comparative assessment of G-GAP against a tree-based aggregation protocol using simulation. Surprisingly, we find that the tree-based aggregation protocol consistently outperforms the gossip protocol for comparable overhead, both in terms of accuracy and robustness.
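To give a feel for how gossiping can estimate a network-wide AVERAGE, the sketch below implements the well-known push-sum scheme in Python. It is a stand-in for illustration, not G-GAP: it has no failure handling, and the fully connected topology, synchronous rounds, and the function name `push_sum_average` are all assumptions. Each node keeps a (sum, weight) pair, sends half of it to a randomly chosen node every round, and uses sum divided by weight as its local estimate.

```python
import random
from typing import List


def push_sum_average(values: List[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate synchronous push-sum on a fully connected network and
    return each node's estimate of the global average."""
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]
    weights = [1.0] * n
    for _ in range(rounds):
        new_sums = [0.0] * n
        new_weights = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)  # random peer (may be the node itself)
            half_sum, half_weight = sums[i] / 2, weights[i] / 2
            # Keep one half locally, push the other half to the peer.
            new_sums[i] += half_sum
            new_weights[i] += half_weight
            new_sums[j] += half_sum
            new_weights[j] += half_weight
        sums, weights = new_sums, new_weights
    return [s / w for s, w in zip(sums, weights)]
```

The total of all sums and the total of all weights are preserved in every round, which is why the local ratios converge to the true average. Losing a node's pair would break this invariant, which hints at why robustness against node failures is the hard part for gossip-based aggregation.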

Place, publisher, year, edition, pages
New York: IEEE, 2007
Keywords
Access protocols, Aggregates, Condition monitoring, Counting circuits, Error correction, Fault tolerant systems, Real time systems, Robustness, Surveillance, Traffic control
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-26201 (URN)
10.1109/INM.2007.374787 (DOI)
000250405400024 ()
2-s2.0-34748835423 (Scopus ID)
978-1-4244-0798-9 (ISBN)
Conference
10th IFIP/IEEE International Symposium on Integrated Network Management Munich, GERMANY, MAY 21-25, 2007
Note
Book Group Author(s): IEEE
Available from: 2010-11-21 Created: 2010-11-21 Last updated: 2022-09-13 Bibliographically approved
5. Decentralized detection of global threshold crossings using aggregation trees
2008 (English) In: Computer Networks, ISSN 1389-1286, E-ISSN 1872-7069, Vol. 52, no 9, p. 1745-1761 Article in journal (Refereed) Published
Abstract [en]

The timely detection that a monitored variable has crossed a given threshold is a fundamental requirement for many network management applications. A challenge is the detection of threshold crossings of network-wide variables, which are computed from device counters across the network, using aggregation functions such as SUM, MAX and AVERAGE. This paper contains a detailed description and a comprehensive evaluation of TCA-GAP, a protocol for detecting threshold crossings of network-wide aggregates in a distributed way. Elements of its design include tree-based incremental aggregation for estimating the value of aggregates, a local hysteresis mechanism to reduce overhead and dynamic recomputation of local thresholds to ensure correctness. The protocol is evaluated through extensive simulation using real traces in scenarios with network sizes up to 5232 nodes. From the measurements, we conclude that the protocol is efficient in the sense that the overhead is negligible when the aggregate is far from the threshold. It is scalable as the protocol overhead is independent of the system size for the network sizes and scenario configurations considered. We demonstrate that the local hysteresis parameter can be used to control the tradeoff between protocol overhead and detection delay. We further report results on how node failures impact the overhead and detection quality of the protocol.
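The effect of a hysteresis band on message overhead can be sketched with a generic two-threshold detector in Python. This is a textbook hysteresis filter, assumed here for illustration; the abstract does not specify how TCA-GAP defines its local hysteresis, and the names `hysteresis_events`, `upper` and `lower` are hypothetical. An alert is raised when the value reaches `upper` and cleared only once it falls to `lower` or below, so oscillations inside the band produce no events.

```python
from typing import List, Tuple


def hysteresis_events(samples: List[float], upper: float, lower: float) -> List[Tuple[int, str]]:
    """Return (index, event) pairs, where event is "raise" or "clear"."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    above = False
    events = []
    for i, x in enumerate(samples):
        if not above and x >= upper:
            above = True
            events.append((i, "raise"))
        elif above and x <= lower:
            above = False
            events.append((i, "clear"))
    return events
```

On the samples `[1, 9, 11, 9.5, 10.5, 7, 12]`, a band of `upper=10, lower=8` yields three events, whereas collapsing the band to `upper=lower=10` yields five. The price of the wider band is that a clear is reported later, which mirrors the tradeoff between overhead and detection delay mentioned in the abstract.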

Place, publisher, year, edition, pages
Elsevier, 2008
Keywords
decentralized network management, threshold crossing alerts, real-time, monitoring, tree-based aggregation protocols
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-17634 (URN)
10.1016/j.comnet.2008.02.015 (DOI)
000257012600006 ()
2-s2.0-43449096331 (Scopus ID)
Note
NOTICE: this is the author’s version of a work that was accepted for publication in . Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in PUBLICATION, VOL 52, ISSUE 9, 2008, DOI 10.1016/j.comnet.2008.02.015
QC 20100525 QC 20120213
Available from: 2012-02-13 Created: 2010-08-05 Last updated: 2022-09-13 Bibliographically approved

Open Access in DiVA

fulltext (2496 kB)
File information
File name: FULLTEXT01.pdf
File size: 2496 kB
Checksum MD5: 28038936018197bcf96ca5328d9f4da67401e0d436b9ac9fe42389d96c12c3ba3fd4c70f
Type: fulltext
Mimetype: application/pdf

Author/editor
Wuhib, Fetahi Zebenigus