Journal of Applied Probability

Checkpointing for the RESTART problem in Markov networks

Lester Lipsky, Derek Doran, and Swapna Gokhale

Abstract

We apply the known formulae of the RESTART problem to Markov models of software systems (and of many other systems), and derive new equations. We show how checkpoints can be included and determine the resulting performance under RESTART. The result is a complete procedure for finding the mean, variance, and tail behavior of the job completion time as a function of the failure rate. We also provide a detailed example.

Article information

Source
J. Appl. Probab., Volume 48A (2011), 195-207.

Dates
First available in Project Euclid: 18 October 2011

Permanent link to this document
https://projecteuclid.org/euclid.jap/1318940465

Digital Object Identifier
doi:10.1239/jap/1318940465

Mathematical Reviews number (MathSciNet)
MR2865626

Zentralblatt MATH identifier
1242.60080

Subjects
Primary: 60J28: Applications of continuous-time Markov processes on discrete state spaces
Secondary: 60K10: Applications (reliability, demand theory, etc.)

Keywords
Checkpoint; exponential failure; Markov model; RESTART; performance; power-tailed distribution; subexponential distribution; asymptotic behavior

Citation

Lipsky, Lester; Doran, Derek; Gokhale, Swapna. Checkpointing for the RESTART problem in Markov networks. J. Appl. Probab. 48A (2011), 195-207. doi:10.1239/jap/1318940465. https://projecteuclid.org/euclid.jap/1318940465

