Optimization of two-granularity software rejuvenation policy based on the Markov regenerative process


© 1963-2012 IEEE. Software rejuvenation is a proactive software control technique that is used to improve a computing system's performance when it suffers from software aging. In this paper, a two-granularity inspection-based software rejuvenation policy, which works as a closed-loop control technique, is proposed. This policy mitigates the negative impact of two-level software aging. The two levels considered are the user-level applications and the operating system. A Markov regenerative process model is constructed based on the system condition. We obtain the degradation rates of the application software and the operating system from fault injection experiments. The diagnostic accuracy of the adopted monitor and analysis system, which is applied to inspect the application software and operating system, is taken into account when we derive the optimal rejuvenation strategies. Finally, the availability and the overall loss probability, with their corresponding optimal inspection time intervals, are obtained numerically based on the parameter values estimated from the experiments. Experimental results show that two-granularity software rejuvenation is much more effective than traditional single-level software rejuvenation. In our experimental study, when two-granularity software rejuvenation was used, the unavailability and the overall loss probability of the system were reduced by 17.9% and 2.65%, respectively, in comparison with single-level rejuvenation.
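The idea of obtaining an optimal interval numerically can be illustrated with a much simpler model than the one in the paper. The sketch below is not the authors' Markov regenerative process model and has no inspections or two levels: it is a textbook single-level, time-triggered rejuvenation model evaluated with the renewal-reward argument. It assumes a Weibull time to failure with increasing hazard (to mimic aging), a fixed repair time after a failure, and a shorter fixed rejuvenation time. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability that the (assumed Weibull) system survives past age t."""
    return math.exp(-((t / scale) ** shape))


def availability(T, shape, scale, repair_time, rejuvenation_time, steps=2000):
    """Steady-state availability when the system is rejuvenated at age T.

    One cycle ends either with a failure before T (followed by a repair)
    or with a rejuvenation at T. By the renewal-reward theorem,
    availability = E[uptime per cycle] / E[cycle length].
    """
    # E[uptime] = integral of the survival function over [0, T] (trapezoid rule).
    h = T / steps
    uptime = 0.5 * (1.0 + weibull_survival(T, shape, scale))
    for i in range(1, steps):
        uptime += weibull_survival(i * h, shape, scale)
    uptime *= h

    # E[downtime] mixes repair (failure before T) and rejuvenation (survival to T).
    p_fail = 1.0 - weibull_survival(T, shape, scale)
    downtime = p_fail * repair_time + (1.0 - p_fail) * rejuvenation_time
    return uptime / (uptime + downtime)


def best_interval(candidates, **params):
    """Grid search: return (T, availability) for the best candidate interval."""
    best_a, best_t = max((availability(T, **params), T) for T in candidates)
    return best_t, best_a


if __name__ == "__main__":
    # Hypothetical values in hours; none of these come from the paper.
    params = dict(shape=2.0, scale=100.0, repair_time=10.0, rejuvenation_time=1.0)
    T_opt, A_opt = best_interval(range(1, 501), **params)

    # Baseline without rejuvenation: MTTF / (MTTF + repair time).
    mttf = params["scale"] * math.gamma(1.0 + 1.0 / params["shape"])
    A_none = mttf / (mttf + params["repair_time"])

    print(f"best interval: {T_opt} h, availability: {A_opt:.4f}")
    print(f"availability without rejuvenation: {A_none:.4f}")
```

The trade-off is the same one the abstract alludes to: rejuvenating too often wastes time on planned downtime, while rejuvenating too rarely lets aging-related failures dominate, so an interior optimum exists whenever the hazard increases with age and rejuvenation is cheaper than repair.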






Published Version (Please cite this version)


Publication Info

Ning, G, J Zhao, Y Lou, J Alonso, R Matias, KS Trivedi, BB Yin, KY Cai, et al. (2016). Optimization of two-granularity software rejuvenation policy based on the Markov regenerative process. IEEE Transactions on Reliability, 65(4). pp. 1630–1646. 10.1109/TR.2016.2570539 Retrieved from https://hdl.handle.net/10161/15574.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Kishor S. Trivedi

Hudson Distinguished Professor of Electrical and Computer Engineering

Kishor Trivedi holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University. He is known as a leading international expert in the reliability and performability evaluation of dependable systems, and has made seminal contributions to stochastic modeling formalisms and their efficient solution. He is currently carrying out experimental research on software reliability during operation, where he is investigating software fault tolerance through environmental diversity. This work, including software bug classification, empirical study of real failure data, and the associated theory of affordable software fault tolerance, has already gained significant attention.

He has made key contributions to his field in many ways. He has encapsulated the algorithms he developed into usable and widely circulated software packages, and has applied research results to practical problems by working directly with industry. This work has not only solved difficult real-life problems but has also produced new research based on those problems. He has published over 600 articles and has supervised 48 Ph.D. dissertations and more than 50 postdoctoral associates. He is a Life Fellow of the Institute of Electrical and Electronics Engineers and a Golden Core Member of the IEEE Computer Society. He has served on many editorial boards and conference committees and is the recipient of the IEEE Computer Society Technical Achievement Award for his research on Software Aging and Rejuvenation. He is also the recipient of the IEEE Reliability Society's Lifetime Achievement Award. He is on ISI’s highly cited list with an h-index of 108, and has received grants from such governmental agencies as NASA, NATO, NSF, DARPA, AFOSR, ARO, NIH, NSWC, ONR, and RADC.

Trivedi has also helped several high-profile companies carry out reliability/availability prediction of their products under design or in existence, including 3Com, Avaya, Boeing, Cisco, DEC, EMC, GE, HP, Huawei, IBM, Lucent, NEC, TCS, Union Switch and Signals, and Wipro. Most notable among these is his help in reliability modeling of the current return network subsystem of the Boeing 787 for FAA certification. The algorithm he developed for this problem has been jointly patented by Boeing and Trivedi. He led the reliability/availability modeling of SIP on IBM WebSphere; this model was responsible for the sale of the system by IBM to AT&T.

Furthermore, Trivedi has written several influential books, including textbooks. He is the author of a well-known text entitled Probability and Statistics with Reliability, Queuing and Computer Science Applications, originally published by Prentice-Hall; a thoroughly revised second edition (including its Indian edition) has been published by John Wiley. This book was translated into Chinese in Nov. 2015 and appeared as a paperback in July 2016. He has also published two other books: Performance and Reliability Analysis of Computer Systems, published by Springer, and Queueing Networks and Markov Chains, published by John Wiley. His latest book, Reliability and Availability Engineering: Modeling, Analysis and Applications, was published by Cambridge University Press in 2017.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.