A design methodology for self-management in distributed environments
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastructure, Software and Computer Systems, SCS.
Swedish Institute of Computer Science.
Swedish Institute of Computer Science.
2009 (English). In: IEEE International Conference on Computational Science and Engineering, 2009, p. 430-436. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomic computing is a paradigm that aims to reduce administrative overhead by providing autonomic managers that make applications self-managing. To deal better with dynamic environments and to improve performance and scalability, we advocate distributing management functions among several cooperative managers that coordinate their activities to achieve management objectives. We present a methodology for designing the management part of a distributed self-managing application in a distributed manner. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. We illustrate the proposed design methodology by applying it to the design and development of a distributed storage service as a case study. The storage service prototype has been developed using the distributed component management system Niche. Distributing the autonomic managers spreads the management overhead and increases management performance through concurrency and better locality.

Place, publisher, year, edition, pages
2009. 430-436 p.
Keyword [en]
autonomic computing, control loops, distributed systems, self-management, component management system, design and development, design methodology, design steps, distributed environments, distributed storage, dynamic environments, management functions, management objectives, self management, self-managing, storage services, computer science, design, distribution functions, large scale systems, light measurements, managers, model checking, remote control, management
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-12957
DOI: 10.1109/CSE.2009.301
Scopus ID: 2-s2.0-70749096986
ISBN: 9780760538235 (print)
OAI: oai:DiVA.org:kth-12957
DiVA: diva2:319887
Note
QC 20100520
Available from: 2010-05-20. Created: 2010-05-20. Last updated: 2012-08-31. Bibliographically approved.
In thesis
1. Enabling and Achieving Self-Management for Large Scale Distributed Systems: Platform and Design Methodology for Self-Management
2010 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomic computing is a paradigm that aims to reduce administrative overhead by using autonomic managers to make applications self-managing. To deal better with large-scale dynamic environments, and to improve scalability, robustness, and performance, we advocate distributing management functions among several cooperative autonomic managers that coordinate their activities to achieve management objectives. Programming autonomic management in turn requires programming environment support and higher-level abstractions to become feasible.

In this thesis we present an introductory part and a number of papers that summarize our work in the area of autonomic computing. We focus on enabling and achieving self-management for large-scale and/or dynamic distributed applications. We start by presenting our platform, called Niche, for programming self-managing component-based distributed applications. Niche supports a network-transparent view of the system architecture, which simplifies the design of application self-* code. Niche provides a concise and expressive API for self-* code. The implementation of the framework relies on the scalability and robustness of structured overlay networks. We have also developed a distributed file storage service, called YASS, to illustrate and evaluate Niche.

After introducing Niche we proceed by presenting a methodology and design space for designing the management part of a distributed self-managing application in a distributed manner. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. We illustrate the proposed design methodology by applying it to the design and development of an improved version of our distributed storage service YASS as a case study.

We continue by presenting a generic policy-based management framework that has been integrated into Niche. Policies are sets of rules that govern system behavior and reflect business goals or system management objectives. Policy-based management is introduced to simplify management and reduce its overhead by setting up policies that govern system behavior. A prototype of the framework is presented, and two generic policy languages (policy engines and corresponding APIs), namely SPL and XACML, are evaluated using our self-managing file storage application YASS as a case study.

Finally, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of resources hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn.

 

Place, publisher, year, edition, pages
Stockholm: Universitetsservice US AB, 2010. 42 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 10:01
Keyword
Autonomic Computing, Self-Management, Distributed Systems
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-12377
ISBN: 978-91-7415-589-1
Presentation
2010-04-09, Sal D, Isajordsgatan 39, Kista, Sweden, Forum IT-Universitetet, KTH, 14:00 (English)
Note
QC 20100520
Available from: 2010-05-20. Created: 2010-04-13. Last updated: 2012-02-22. Bibliographically approved.
2. Self-Management for Large-Scale Distributed Systems
2012 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management.

In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers.

In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store that supports multiple consistency levels and is built on a peer-to-peer network. The store enables a trade-off between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck.
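To make the majority idea concrete, here is a minimal sketch of a versioned read/write register over a fixed replica set. It is an illustration only, not the store described in the thesis: the names `Replica` and `MajorityStore` are hypothetical, everything runs in one process with a single client, and churn, concurrency, and multiple consistency levels are left out. The point it shows is that any two majorities share at least one replica, so a read from a majority always sees the latest completed write, with no master involved.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One in-memory replica mapping key -> (version, value). Hypothetical helper."""
    store: dict = field(default_factory=dict)
    alive: bool = True


class MajorityStore:
    """Toy majority-quorum register (single client, no concurrency)."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.quorum = n // 2 + 1  # any two majorities intersect

    def _majority(self):
        # Pick the first `quorum` reachable replicas, or fail if there is no majority.
        live = [r for r in self.replicas if r.alive]
        if len(live) < self.quorum:
            raise RuntimeError("no majority of replicas reachable")
        return live[: self.quorum]

    def read(self, key):
        # Ask a majority and return the (version, value) pair with the highest version.
        answers = [r.store.get(key, (0, None)) for r in self._majority()]
        return max(answers, key=lambda a: a[0])

    def write(self, key, value):
        # Learn the latest version from a majority, then store version + 1 on a majority.
        version, _ = self.read(key)
        for r in self._majority():
            r.store[key] = (version + 1, value)
        return version + 1
```

With five replicas, the store in this sketch keeps serving reads and writes while any two replicas are down, and refuses to answer once a third one fails.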

In the third part, we investigate self-management for Cloud-based storage systems with a focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that makes it possible to trade off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
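As a rough illustration of what combining feedforward and feedback control can mean for elasticity, the sketch below sizes a cluster from the measured workload (feedforward) and then nudges that size according to the latency error (feedback). This is not ElastMan's design: the function names, the linear capacity model, the proportional gain, and all default parameter values are assumptions chosen for the example.

```python
import math


def feedforward_servers(request_rate, capacity_per_server):
    """Predict the server count directly from the measured workload."""
    return math.ceil(request_rate / capacity_per_server)


def feedback_adjustment(measured_latency_ms, target_latency_ms, gain):
    """Proportional correction: positive when latency exceeds the target."""
    error = (measured_latency_ms - target_latency_ms) / target_latency_ms
    return round(gain * error)


def desired_servers(request_rate, measured_latency_ms, *,
                    capacity_per_server=1000.0, target_latency_ms=10.0,
                    gain=2.0, min_servers=1, max_servers=100):
    """Combine both terms and clamp the result to the allowed cluster size."""
    n = (feedforward_servers(request_rate, capacity_per_server)
         + feedback_adjustment(measured_latency_ms, target_latency_ms, gain))
    return max(min_servers, min(max_servers, n))
```

In this sketch the feedforward term reacts immediately to workload changes, while the feedback term compensates for errors in the assumed per-server capacity.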

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2012. xix, 266 p.
Series
TRITA-ICT-ECS AVH, ISSN 1653-6363 ; 12:04
Keyword
Self-Management, Autonomic Computing, Control Theory, Distributed Systems, Grid Computing, Cloud Computing, Elastic Services, Key-Value Stores
National Category
Computer Systems
Research subject
SRA - ICT
Identifiers
URN: urn:nbn:se:kth:diva-101661
ISBN: 978-91-7501-437-1
Public defence
2012-09-26, Sal E, Forum IT-Universitetet, KTH, Isajordsgatan 39, Kista, 14:00 (English)
Funder
ICT - The Next Generation
Note
QC 20120831
Available from: 2012-08-31. Created: 2012-08-30. Last updated: 2014-01-23. Bibliographically approved.

Authors
Al-Shishtawy, Ahmad; Vlassov, Vladimir
