Gossip-based Resource Management for Cloud Environments
KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
2010 (English). In: International Conference on Network and Service Management, 2010, 1-8 p. Conference paper (Refereed)
Abstract [en]

We address the problem of resource management for a large-scale cloud environment that hosts sites. Our contribution centers around outlining a distributed middleware architecture and presenting one of its key elements, a gossip protocol that meets our design goals: fairness of resource allocation with respect to hosted sites, efficient adaptation to load changes and scalability in terms of both the number of machines and sites. We formalize the resource allocation problem as that of dynamically maximizing the cloud utility under CPU and memory constraints. While we can show that an optimal solution without considering memory constraints is straightforward (but not useful), we provide an efficient heuristic solution for the complete problem instead. We evaluate the protocol through simulation and find its performance to be well-aligned with our design goals.

Place, publisher, year, edition, pages
2010. 1-8 p.
Keyword [en]
cloud computing, distributed management, resource allocation, gossip protocols
National Category
Telecommunications Computer Systems Communication Systems
URN: urn:nbn:se:kth:diva-26205
DOI: 10.1109/CNSM.2010.5691347
ScopusID: 2-s2.0-79951608881
OAI: diva2:371512
International Conference on Network and Service Management, Niagara Falls, ON, Canada, 25-29 Oct. 2010
“© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.”
QC 20120124
Available from: 2010-11-21 Created: 2010-11-21 Last updated: 2012-03-12
Bibliographically approved
In thesis
1. Distributed Monitoring and Resource Management for Large Cloud Environments
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the last decade, the number, size and complexity of large-scale networked systems have been growing fast, and this trend is expected to accelerate. The best known example of a large-scale networked system is probably the Internet, while large datacenters for cloud services are the most recent ones. In such environments, a key challenge is to develop scalable and adaptive technologies for management functions. This thesis addresses the challenge by engineering several protocols for distributed monitoring and resource management that are suitable for large-scale networked systems. First, we present G-GAP, a gossip-based protocol we developed for continuous monitoring of aggregates that are computed from device variables. We prove the robustness of this protocol to node failures and validate, through simulations, that its estimation accuracy does not change with increasing size of the monitored system under certain conditions. Second, we present TCA-GAP, a tree-based protocol, and TG-GAP, a gossip-based protocol, for the purpose of monitoring threshold crossings of aggregates. For both protocols, we prove correctness properties and show, again through simulations, that they are efficient: their overhead is at least two orders of magnitude smaller than that of a naïve approach, for cases where the monitored aggregate is sufficiently far from the threshold. Third, we present a gossip-based protocol for resource management in cloud environments. The protocol allocates CPU and memory resources to sites that are hosted by the cloud. We prove that the resource allocation computed by the protocol converges exponentially fast to an optimal allocation, for cases where sufficient memory is available. Through simulations, we show that the quality of the resource allocation approaches that of an ideal system when the total memory demand decreases significantly below the memory capacity of the entire system.
In addition, we validate that the quality of the allocation does not change as the number of hosted sites and machines increases, for the case where both metrics are scaled proportionally. Finally, we compare two approaches (tree-based and gossip-based) to engineering protocols for distributed management, for the case of real-time monitoring. Results of our simulation studies indicate that, regardless of the system size and failure rates in the monitored system, gossip protocols incur a significantly larger overhead than tree-based protocols for achieving the same monitoring quality (e.g., estimation accuracy or detection delay).

Place, publisher, year, edition, pages
Stockholm: KTH, 2010. vi, 26 p.
Trita-EE, ISSN 1653-5146 ; 2010:051
Keyword [en]
decentralized management, engineering protocols, distributed monitoring, resource management
National Category
Telecommunications Computer Science
urn:nbn:se:kth:diva-26207 (URN)
978-91-7415-794-9 (ISBN)
Public defence
2010-12-10, Q2, Osquldas väg 10, plan 2, KTH, Stockholm, 14:00 (English)
QC 20101124
Available from: 2010-11-24 Created: 2010-11-21 Last updated: 2012-03-22
Bibliographically approved

By author/editor
Wuhib, Fetahi; Stadler, Rolf