Achieving robust self-management for large-scale distributed applications
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
Swedish Institute of Computer Science.
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
2010 (English) Report (Other (popular science, discussion, etc.))
Abstract [en]

Autonomic managers are the main architectural building blocks for constructing self-management capabilities of computing systems and applications. One of the major challenges in developing self-managing applications is the robustness of the management elements that form autonomic managers. We believe that transparent handling of the effects of resource churn (joins, leaves, and failures) on management should be an essential feature of a platform for self-managing large-scale dynamic distributed applications, because it facilitates the development of robust autonomic managers and hence improves the robustness of self-managing applications. This feature can be achieved by providing a robust management element abstraction that hides churn from the programmer. In this paper, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of nodes hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn. The proposed algorithm uses peer-to-peer replica placement schemes to automate the migration of replicated state machines in order to tolerate churn, and it exploits the lookup and failure detection facilities of a structured overlay network to manage the set of active replicas. Using the proposed approach, we can achieve a long-running and highly available service, without human intervention, in the presence of resource churn. To validate and evaluate our approach, we have implemented a prototype that includes the proposed algorithm.
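The idea of deriving a replica set from overlay lookups can be illustrated with a minimal sketch. It is not the algorithm from the report: it assumes a ring-shaped identifier space, a symmetric replica placement scheme, and successor-based lookup, and the names `ID_SPACE`, `DEGREE`, `replica_keys`, `lookup`, `replica_set`, and `migrations` are hypothetical, introduced only for illustration.

```python
from bisect import bisect_left

ID_SPACE = 64  # hypothetical size of the overlay identifier space
DEGREE = 4     # hypothetical replication degree


def replica_keys(service_id, degree=DEGREE, id_space=ID_SPACE):
    """Identifiers associated with one service, spread evenly over the ring."""
    step = id_space // degree
    return [(service_id + i * step) % id_space for i in range(degree)]


def lookup(nodes, key):
    """Return the node responsible for key: its first successor on the ring."""
    ring = sorted(nodes)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]  # wrap around past the highest node id


def replica_set(nodes, service_id):
    """Map each replica identifier to the node currently responsible for it."""
    return {key: lookup(nodes, key) for key in replica_keys(service_id)}


def migrations(old_nodes, new_nodes, service_id):
    """Replicas whose host changed after churn, as key -> (old_host, new_host)."""
    before = replica_set(old_nodes, service_id)
    after = replica_set(new_nodes, service_id)
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


if __name__ == "__main__":
    nodes = {5, 20, 33, 50, 61}
    print(replica_set(nodes, 10))
    # Node 33 fails: only the replica it hosted needs a new host.
    print(migrations(nodes, nodes - {33}, 10))
```

The sketch only computes which replicas need a new host after a join, leave, or failure. How the replicas agree on the new configuration and transfer state is outside its scope.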

 

Place, publisher, year, edition, pages
2010.
Series
SICS Technical Report T2010:02, ISSN 1100-3154
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-12959
OAI: oai:DiVA.org:kth-12959
DiVA: diva2:319891
Funder
ICT - The Next Generation
Note
QC 20100520
Available from: 2010-05-20 Created: 2010-05-20 Last updated: 2012-06-13
Bibliographically approved
In thesis
1. Enabling and Achieving Self-Management for Large Scale Distributed Systems: Platform and Design Methodology for Self-Management
2010 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomic computing is a paradigm that aims at reducing administrative overhead by using autonomic managers to make applications self-managing. To better deal with large-scale dynamic environments, and to improve scalability, robustness, and performance, we advocate distributing management functions among several cooperative autonomic managers that coordinate their activities in order to achieve management objectives. Programming autonomic management, in turn, requires programming environment support and higher-level abstractions to become feasible.

In this thesis we present an introductory part and a number of papers that summarize our work in the area of autonomic computing. We focus on enabling and achieving self-management for large-scale and/or dynamic distributed applications. We start by presenting our platform, called Niche, for programming self-managing component-based distributed applications. Niche supports a network-transparent view of the system architecture, which simplifies the design of application self-* code. Niche provides a concise and expressive API for self-* code. The implementation of the framework relies on the scalability and robustness of structured overlay networks. We have also developed a distributed file storage service, called YASS, to illustrate and evaluate Niche.

After introducing Niche, we present a methodology and design space for designing the management part of a distributed self-managing application in a distributed manner. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. We illustrate the proposed design methodology by applying it to the design and development of an improved version of our distributed storage service YASS as a case study.

We continue by presenting a generic policy-based management framework that has been integrated into Niche. Policies are sets of rules that govern system behavior and reflect business goals or system management objectives. Policy-based management is introduced to simplify management and reduce its overhead by setting up policies that govern system behavior. A prototype of the framework is presented, and two generic policy languages (policy engines and corresponding APIs), namely SPL and XACML, are evaluated using our self-managing file storage application YASS as a case study.

Finally, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of resources hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn.

 

Place, publisher, year, edition, pages
Stockholm: Universitetsservice US AB, 2010. 42 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 10:01
Keyword
Autonomic Computing, Self-Management, Distributed Systems
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-12377 (URN)
978-91-7415-589-1 (ISBN)
Presentation
2010-04-09, Sal D, Isajordsgatan 39, Kista, Sweden, Forum IT-Universitetet, KTH, 14:00 (English)
Opponent
Supervisors
Note
QC 20100520
Available from: 2010-05-20 Created: 2010-04-13 Last updated: 2012-02-22
Bibliographically approved

Open Access in DiVA

No full text

Search in DiVA

By author/editor
Al-Shishtawy, Ahmad
Asif Fayyaz, Muhammad
Vlassov, Vladimir