The Distribution of Time to Recovery of Enterprise IT Services
2014 (English). In: IEEE Transactions on Reliability, ISSN 0018-9529, Vol. 63, no 4, 858-867 p. Article in journal (Refereed), Published
The context of this article is the availability of enterprise IT services, a key concern for many enterprises. While there is a plethora of literature concerned with service availability, there is no previous systematic empirical study on IT service time to recovery following outages. The existing literature typically assumes a distribution, or builds on analogies to related areas such as software engineering. Therefore, our objective is to find the statistical distribution of IT service time to recovery. Method-wise, this investigation is based on logs of more than 1 800 incidents in a large Nordic bank, corresponding to more than 11 000 hours of recorded downtime. Five possible distributions of time to recovery from the literature were investigated using the Akaike Information Criterion to find the distribution offering the best fit. The results show that the log-normal distribution outperformed the others for all tested service channels (collections of IT services). It is concluded that the log-normal distribution offers the best fit of IT service time to recovery. Using this distribution in simulation and decision-support tools offers the prospect of better predictions of downtime and downtime costs to the practitioner community.
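The model-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data. The abstract does not name the five candidate distributions, so the sketch assumes just two for illustration (exponential and log-normal), fits each by closed-form maximum likelihood, and compares them with the Akaike Information Criterion. The recovery times are synthetic values drawn from a log-normal distribution with arbitrary parameters, and all function names are hypothetical.

```python
import math
import random


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower means a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def fit_exponential(times):
    """Closed-form MLE for the exponential distribution.

    Returns (rate, maximized log-likelihood).
    """
    n = len(times)
    total = sum(times)
    rate = n / total
    log_likelihood = n * math.log(rate) - rate * total
    return rate, log_likelihood


def fit_lognormal(times):
    """Closed-form MLE for the log-normal distribution.

    Returns (mu, sigma, maximized log-likelihood), where mu and sigma are the
    mean and standard deviation of the log recovery times.
    """
    logs = [math.log(t) for t in times]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    # At the MLE the squared-deviation term sums to n / 2.
    log_likelihood = -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n
    return mu, math.sqrt(sigma2), log_likelihood


def compare(times):
    """Return the AIC of each assumed candidate distribution for the given times."""
    _, ll_exp = fit_exponential(times)
    _, _, ll_logn = fit_lognormal(times)
    return {
        "exponential": aic(ll_exp, 1),  # one parameter: rate
        "lognormal": aic(ll_logn, 2),   # two parameters: mu, sigma
    }


# Synthetic recovery times in hours (an assumption; not the bank's incident logs).
random.seed(0)
synthetic_hours = [random.lognormvariate(0.5, 1.5) for _ in range(1000)]

scores = compare(synthetic_hours)
best = min(scores, key=scores.get)
print(scores)
print("Lowest AIC:", best)
```

Because the synthetic data are themselves log-normal, the log-normal candidate is expected to obtain the lower AIC here; with real incident logs the outcome depends on the data, and further candidate distributions would be added to `compare` in the same way.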
Keywords: Enterprise IT services, incident logs, log-normal distribution
National Category: Other Computer and Information Science
Identifiers: URN: urn:nbn:se:kth:diva-158277; DOI: 10.1109/TR.2014.2336051; ISI: 000345904000004; ScopusID: 2-s2.0-84914106837; OAI: oai:DiVA.org:kth-158277; DiVA: diva2:777967
QC 20150109. 2015-01-09; 2015-01-07; 2015-01-09. Bibliographically approved