Open Access
Analysis of large unreliable stochastic networks
Wen Sun, Mathieu Feuillet, Philippe Robert
Ann. Appl. Probab. 26(5): 2959-3000 (October 2016). DOI: 10.1214/15-AAP1167

Abstract

In this paper, a stochastic model of a large distributed system where users’ files are duplicated on unreliable data servers is investigated. Due to a server breakdown, a copy of a file can be lost; it can be retrieved if another copy of the same file is stored on another server. In the case where no other copy of a given file is present in the network, the file is definitively lost. In order to maintain multiple copies of a given file, it is assumed that each server can devote a fraction of its processing capacity to duplicating files on other servers, which enhances the durability of the system.

A simplified stochastic model of this network is analyzed. It is assumed that a copy of a given file is lost at some fixed rate and that the initial state is optimal: each file has the maximum number $d$ of copies located on the servers of the network. The duplication capacity is allocated to the files with the lowest number of copies. Due to random losses, the state of the network is transient and all files will eventually be lost. As a consequence, a transient $d$-dimensional Markov process $(X(t))$ with a unique absorbing state describes the evolution of this network. By taking a scaling parameter $N$ related to the number of nodes of the network, a scaling analysis of this process is developed. The asymptotic behavior of $(X(t))$ is analyzed on time scales of the type $t\mapsto N^{p}t$ for $0\leq p\leq d-1$. The paper derives asymptotic results on the decay of the network: under a stability assumption, the main results state that the critical time scale for the decay of the system is given by $t\mapsto N^{d-1}t$. In particular, the duration of time after which a fixed fraction of files are lost is of the order of $N^{d-1}$. When the stability condition is not satisfied, that is, when the network is initially overloaded, it is shown that the state of the network converges to an interesting local equilibrium, which is investigated. As a consequence, the analysis sheds some light on the role of the key parameters $\lambda$, the duplication rate, and $d$, the maximal number of copies, in the design of these systems. The techniques used involve careful stochastic calculus for Poisson processes, technical estimates and the proof of a stochastic averaging principle.
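The dynamics described above can be illustrated with a small event-driven simulation. This is only a sketch under assumptions that the abstract does not spell out: the state is stored as counts `x[i]` of files holding `i` copies (with `x[0]` the lost files), each copy is lost at a hypothetical rate `mu`, and the whole duplication capacity `lam_total` (which would be of order $\lambda N$) is given to one file among those with the fewest copies. The function names and the exact transition rates are illustrative choices, not the paper's definitions.

```python
import random


def transition_rates(x, lam_total, mu):
    """Return (rate, src, dst) jumps for counts x[i] of files with i copies.

    Assumed rates (not taken from the abstract): every copy is lost at
    rate mu, and the full capacity lam_total duplicates a file from the
    lowest non-empty level i with 1 <= i < d.
    """
    d = len(x) - 1
    rates = []
    # A file with i copies loses one of them at rate mu * i.
    for i in range(1, d + 1):
        if x[i] > 0:
            rates.append((mu * i * x[i], i, i - 1))
    # Duplication serves only the files with the lowest number of copies.
    for i in range(1, d):
        if x[i] > 0:
            rates.append((lam_total, i, i + 1))
            break
    return rates


def simulate(num_files, d, lam_total, mu, t_max, seed=0):
    """Gillespie simulation started from the optimal state (all files have d copies)."""
    rng = random.Random(seed)
    x = [0] * (d + 1)
    x[d] = num_files
    t = 0.0
    while True:
        rates = transition_rates(x, lam_total, mu)
        total = sum(rate for rate, _, _ in rates)
        if total == 0.0:
            break  # absorbing state: every file is lost
        t_next = t + rng.expovariate(total)
        if t_next > t_max:
            t = t_max
            break
        t = t_next
        # Pick the jump with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for rate, src, dst in rates:
            acc += rate
            if u < acc:
                break
        x[src] -= 1
        x[dst] += 1
    return t, x
```

Calling `simulate(num_files, d, lam_total, mu, t_max)` for increasing `lam_total` and `d` gives a rough empirical feel for how long files survive, though a faithful comparison with the $N^{d-1}$ time scale would require the paper's exact scaling of the number of files and of the duplication capacity.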

Citation

Wen Sun, Mathieu Feuillet, Philippe Robert. "Analysis of large unreliable stochastic networks." Ann. Appl. Probab. 26 (5) 2959-3000, October 2016. https://doi.org/10.1214/15-AAP1167

Information

Received: 1 May 2015; Revised: 1 December 2015; Published: October 2016
First available in Project Euclid: 19 October 2016

zbMATH: 1351.60124
MathSciNet: MR3563199
Digital Object Identifier: 10.1214/15-AAP1167

Subjects:
Primary: 60F05, 60K25
Secondary: 68M14, 90B05

Keywords: reliability, Skorohod problem, stochastic averaging, stochastic networks with failures, time scales, transient Markov chains with absorbing state

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol.26 • No. 5 • October 2016