Enabling and Achieving Self-Management for Large Scale Distributed Systems: Platform and Design Methodology for Self-Management
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
2010 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomic computing is a paradigm that aims to reduce administrative overhead by using autonomic managers to make applications self-managing. To cope better with large-scale dynamic environments and to improve scalability, robustness, and performance, we advocate distributing management functions among several cooperative autonomic managers that coordinate their activities to achieve management objectives. Programming autonomic management, in turn, requires programming-environment support and higher-level abstractions to become feasible.

In this thesis we present an introductory part and a number of papers that summarize our work in the area of autonomic computing. We focus on enabling and achieving self-management for large-scale and/or dynamic distributed applications. We start by presenting our platform, called Niche, for programming self-managing component-based distributed applications. Niche supports a network-transparent view of the system architecture, which simplifies the design of application self-* code, and provides a concise and expressive API for that code. The implementation of the framework relies on the scalability and robustness of structured overlay networks. We have also developed a distributed file storage service, called YASS, to illustrate and evaluate Niche.

After introducing Niche, we present a methodology and design space for designing the management part of a distributed self-managing application in a distributed manner. We define design steps that include partitioning management functions and orchestrating multiple autonomic managers. We illustrate the proposed design methodology by applying it to the design and development of an improved version of our distributed storage service YASS as a case study.

We continue by presenting a generic policy-based management framework that has been integrated into Niche. Policies are sets of rules that govern system behavior and reflect business goals or system management objectives. Policy-based management is introduced to simplify management and reduce its overhead by setting up policies to govern system behavior. A prototype of the framework is presented, and two generic policy languages (policy engines and corresponding APIs), namely SPL and XACML, are evaluated using our self-managing file storage application YASS as a case study.

Finally, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of resources hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn.

Place, publisher, year, edition, pages
Stockholm: Universitetsservice US AB, 2010. 42 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 10:01
Keyword [en]
Autonomic Computing, Self-Management, Distributed Systems
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-12377
ISBN: 978-91-7415-589-1 (print)
OAI: oai:DiVA.org:kth-12377
DiVA: diva2:310497
Presentation
2010-04-09, Sal D, Isajordsgatan 39, Kista, Sweden, Forum IT-Universitetet, KTH, 14:00 (English)
Note
QC 20100520
Available from: 2010-05-20. Created: 2010-04-13. Last updated: 2012-02-22. Bibliographically approved.
List of papers
1. Enabling Self-Management Of Component Based Distributed Applications
2008 (English). In: From Grids to Service and Pervasive Computing, Springer-Verlag New York, 2008, p. 163-174. Conference paper, Published paper (Refereed)
Abstract [en]

Deploying and managing distributed applications in dynamic Grid environments requires a high degree of autonomous management. Programming autonomous management, in turn, requires programming-environment support and higher-level abstractions to become feasible. We present a framework for programming self-managing component-based distributed applications. The framework enables the separation of an application's functional and non-functional (self-*) parts. It extends the Fractal component model with the component group abstraction and with one-to-any and one-to-all bindings between components and groups. The framework supports a network-transparent view of the system architecture, which simplifies the design of application self-* code, and provides a concise and expressive API for that code. The implementation of the framework relies on the scalability and robustness of the Niche structured P2P overlay network. We have also developed a distributed file storage service to illustrate and evaluate our framework.
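
The difference between the one-to-any and one-to-all bindings mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the framework's actual API: the class `ComponentGroup`, its methods `send_to_any` and `send_to_all`, and the helper `make_store` are hypothetical names introduced only for this example, and choosing the receiving member at random is an assumption.

```python
import random
from typing import Callable, List, Optional

# A "component" is modelled as a plain callable that handles a message.
Handler = Callable[[str], str]


class ComponentGroup:
    """Hypothetical stand-in for a component group (not the real Niche API)."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self.members: List[Handler] = []
        self._rng = rng or random.Random()

    def add(self, member: Handler) -> None:
        self.members.append(member)

    def send_to_any(self, message: str) -> str:
        """One-to-any: exactly one member handles the message.

        Picking the member at random is an assumption of this sketch.
        """
        if not self.members:
            raise RuntimeError("group has no members")
        return self._rng.choice(self.members)(message)

    def send_to_all(self, message: str) -> List[str]:
        """One-to-all: every current member handles the message."""
        return [member(message) for member in self.members]


def make_store(name: str) -> Handler:
    # Hypothetical storage component that just reports what it handled.
    return lambda message: f"{name} handled {message}"


group = ComponentGroup(random.Random(0))
for name in ("store-1", "store-2", "store-3"):
    group.add(make_store(name))

any_reply = group.send_to_any("put(file)")  # one reply, from one member
all_replies = group.send_to_all("ping")     # one reply per member
```

The caller addresses the group rather than individual members, so membership can change without the caller's code changing, which is the kind of network-transparent view the abstract describes.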

Place, publisher, year, edition, pages
Springer-Verlag New York, 2008
Series
CoreGRID
Keyword
self-management, autonomic computing, component-based applications, P2P, Grid
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-12956 (URN)
10.1007/978-0-387-09455-7_12 (DOI)
000259036400012 ()
978-0-387-09455-7 (ISBN)
Conference
10th CoreGRID Symposium 2008, Canary Islands, Spain, August 25-26, 2008
Projects
FP6 EU project Grid4All (Contract IST-2006-034567)
FP6 Network of Excellence CoreGRID (Contract IST-2002-004265)
Note
QC 20100520 VV 20111221
Available from: 2010-05-20. Created: 2010-05-20. Last updated: 2012-08-31. Bibliographically approved.
2. A design methodology for self-management in distributed environments
2009 (English). In: IEEE International conference on Computational Science and Engineering, 2009, p. 430-436. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomic computing is a paradigm that aims to reduce administrative overhead by providing autonomic managers to make applications self-managing. To cope better with dynamic environments and to improve performance and scalability, we advocate distributing management functions among several cooperative managers that coordinate their activities to achieve management objectives. We present a methodology for designing the management part of a distributed self-managing application in a distributed manner. We define design steps that include partitioning management functions and orchestrating multiple autonomic managers. We illustrate the proposed design methodology by applying it to the design and development of a distributed storage service as a case study. The storage service prototype has been developed using the distributed component management system Niche. Distributing autonomic managers spreads the management overhead and increases management performance through concurrency and better locality.

Keyword
autonomic computing, control loops, distributed systems, selfmanagement, component management system, design and development, design methodology, design steps, distributed environments, distributed storage, dynamic environments, management functions, management objectives, self management, self-managing, storage services, computer science, design, distribution functions, large scale systems, light measurements, managers, model checking, remote control, management
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-12957 (URN)
10.1109/CSE.2009.301 (DOI)
2-s2.0-70749096986 (Scopus ID)
9780760538235 (ISBN)
Note
QC 20100520
Available from: 2010-05-20. Created: 2010-05-20. Last updated: 2012-08-31. Bibliographically approved.
3. Policy based self-management in distributed environments
2010 (English). In: 2010 Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems Workshop (SASOW), IEEE Computer Society Digital Library, 2010, p. 256-260. Conference paper, Published paper (Refereed)
Abstract [en]

Increasing costs and escalating complexity are currently the primary issues in distributed system management. Policy-based management is introduced to simplify management and reduce its overhead by setting up policies to govern system behavior. Policies are sets of rules that govern system behavior and reflect business goals or system management objectives. This paper presents a generic policy-based management framework that has been integrated into an existing distributed component management system, called Niche, which enables and supports self-management. In this framework, programmers can set up more than one Policy-Manager-Group to avoid centralized policy decision making, which could become a performance bottleneck. Furthermore, the size of a Policy-Manager-Group, i.e. the number of Policy-Managers in the group, depends on its load, i.e. the number of requests per time unit. To achieve good load balancing, a policy request is delivered to one of the policy managers in the group, chosen at random on the fly. A prototype of the framework is presented, and two generic policy languages (policy engines and corresponding APIs), namely SPL and XACML, are evaluated using a self-managing file storage application as a case study.
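
The two load-related ideas in this abstract, sizing a Policy-Manager-Group from its request rate and delivering each request to a randomly chosen manager, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's implementation: the class `PolicyManagerGroup`, the `capacity_per_manager` parameter, and the ceiling-based sizing rule are hypothetical, and the policy itself is reduced to a single predicate rather than an SPL or XACML engine.

```python
import math
import random
from typing import Callable, List, Optional


class PolicyManagerGroup:
    """Hypothetical group of identical policy managers (illustration only)."""

    def __init__(
        self,
        evaluate: Callable[[str], bool],
        capacity_per_manager: float,
        rng: Optional[random.Random] = None,
    ) -> None:
        self._evaluate = evaluate              # the shared policy decision rule
        self._capacity = capacity_per_manager  # assumed requests/second per manager
        self._rng = rng or random.Random()
        self.handled: List[int] = [0]          # per-manager request counters

    def resize(self, requests_per_second: float) -> int:
        """Grow or shrink the group so each manager stays within capacity.

        The ceiling rule is an assumption of this sketch.
        """
        wanted = max(1, math.ceil(requests_per_second / self._capacity))
        if wanted > len(self.handled):
            self.handled.extend([0] * (wanted - len(self.handled)))
        else:
            del self.handled[wanted:]
        return wanted

    def decide(self, request: str) -> bool:
        """Deliver the request to one manager chosen uniformly at random."""
        index = self._rng.randrange(len(self.handled))
        self.handled[index] += 1
        return self._evaluate(request)


# Hypothetical policy: only read requests are permitted.
group = PolicyManagerGroup(
    evaluate=lambda request: request.startswith("read"),
    capacity_per_manager=100.0,
    rng=random.Random(1),
)
size = group.resize(requests_per_second=250.0)
decisions = [group.decide("read:/file") for _ in range(300)]
```

Random selection needs no shared state between requesters, which is one plausible reason to prefer it over round-robin in a decentralized setting; the per-manager counters in `handled` exist only to make the resulting spread observable.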

Place, publisher, year, edition, pages
IEEE Computer Society Digital Library, 2010
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-12958 (URN)
10.1109/SASOW.2010.72 (DOI)
2-s2.0-79953144963 (Scopus ID)
978-1-4244-8684-7 (ISBN)
Conference
Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems Workshop (SASOW)
Projects
FP6 project Grid4All (contract IST-2006-034567) funded by the European Commission
Note
QC 20100520 VV 20111211
Available from: 2010-05-20. Created: 2010-05-20. Last updated: 2011-12-27. Bibliographically approved.
4. Achieving robust self-management for large-scale distributed applications
2010 (English). Report (Other (popular science, discussion, etc.))
Abstract [en]

Autonomic managers are the main architectural building blocks for constructing self-management capabilities of computing systems and applications. One of the major challenges in developing self-managing applications is the robustness of the management elements that form autonomic managers. We believe that transparent handling of the effects of resource churn (joins/leaves/failures) on management should be an essential feature of a platform for self-managing large-scale dynamic distributed applications, because it facilitates the development of robust autonomic managers and hence improves the robustness of self-managing applications. This feature can be achieved by providing a robust management element abstraction that hides churn from the programmer. In this paper, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of nodes hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn. Our proposed decentralized algorithm uses peer-to-peer replica placement schemes to automate replicated state machine migration in order to tolerate churn. Our algorithm exploits the lookup and failure detection facilities of a structured overlay network to manage the set of active replicas. Using the proposed approach, we can achieve a long-running and highly available service, without human intervention, in the presence of resource churn. To validate and evaluate our approach, we have implemented a prototype that includes the proposed algorithm.
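
How overlay lookup can determine which nodes host replicas, and which replicas must migrate after churn, can be sketched as below. This Python sketch is not the report's algorithm: it assumes a small ring identifier space (`RING_SIZE`), evenly spaced replica keys, and successor-style lookup, and the function names `responsible_node`, `replica_set`, and `reconfigure` are hypothetical. The replicated state machine itself (agreement on command order and on the new configuration) is omitted entirely.

```python
import bisect
from typing import List, Tuple

RING_SIZE = 64  # size of the identifier space (assumption for this sketch)


def responsible_node(nodes: List[int], key: int) -> int:
    """Overlay lookup: the first node at or clockwise after the key."""
    ordered = sorted(nodes)
    index = bisect.bisect_left(ordered, key % RING_SIZE)
    return ordered[index % len(ordered)]  # wrap around the ring


def replica_set(nodes: List[int], service_key: int, degree: int) -> List[int]:
    """Place `degree` replicas at evenly spaced keys around the ring.

    Even spacing is an assumed placement scheme, used only for illustration.
    """
    step = RING_SIZE // degree
    return [responsible_node(nodes, service_key + i * step) for i in range(degree)]


def reconfigure(
    old_set: List[int], nodes: List[int], service_key: int, degree: int
) -> Tuple[List[int], List[Tuple[int, int, int]]]:
    """Recompute placement after churn and list the replicas that must move.

    Each migration is (replica index, old host, new host).
    """
    new_set = replica_set(nodes, service_key, degree)
    migrations = [
        (i, old, new)
        for i, (old, new) in enumerate(zip(old_set, new_set))
        if old != new
    ]
    return new_set, migrations


nodes = [5, 20, 30, 37, 50]
before = replica_set(nodes, service_key=10, degree=4)

# Node 30 fails; the overlay's failure detector is assumed to report it.
survivors = [n for n in nodes if n != 30]
after, migrations = reconfigure(before, survivors, service_key=10, degree=4)
```

In this run only the replica whose host failed changes place, while the other three stay where they were. With very few nodes, two replica keys can resolve to the same node, a case that a real placement scheme would need to handle.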

 

Series
SICS Technical Report T2010:02, ISSN 1100-3154
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-12959 (URN)
Funder
ICT - The Next Generation
Note
QC 20100520
Available from: 2010-05-20. Created: 2010-05-20. Last updated: 2012-06-13. Bibliographically approved.

Open Access in DiVA

fulltext (2485 kB), 1031 downloads
File information
File name: FULLTEXT01.pdf
File size: 2485 kB
Checksum (SHA-512): 5b0cad7a3e1c5a31f5926d52fb149a6ad6982dcad76bb26823cd8282a37d9c734d2c979163108594771efb11af8030b4cb4a3ce42e48143e503a3bd850b9e1dd
Type: fulltext
Mimetype: application/pdf

Author

Al-Shishtawy, Ahmad
