The Annals of Applied Probability

More on recurrence and waiting times

Abraham J. Wyner



Let $\mathbf{X} = \{X_n: n = 1, 2,\dots\}$ be a discrete-valued stationary ergodic process distributed according to a probability measure $P$. Let $\mathbf{Z}_1^n = \{Z_1, Z_2,\dots, Z_n\}$ be an independent realization of an $n$-block drawn from the same distribution as $\mathbf{X}$. We consider the waiting time $W_n$, defined as the first time the $n$-block $\mathbf{Z}_1^n$ appears in $\mathbf{X}$. Many recent results establish asymptotic properties of this random variable. In this paper, we prove that for all $n$ the random variable $W_n P(\mathbf{Z}_1^n)$ is approximately distributed as an exponential random variable with mean 1. We use a Poisson heuristic to provide a very simple intuition for this result, which is then formalized using the Chen-Stein method. We then rederive, with remarkable brevity, most of the known asymptotic results concerning $W_n$ and prove others as well. We further establish the surprising fact that for many sources $W_n P(\mathbf{Z}_1^n)$ is approximately exp(1) even if the probability law for $\mathbf{Z}$ is not the same as that of $\mathbf{X}$. We also consider the $d$-dimensional analog of the waiting time and prove a similar result in that setting. Nearly identical results are then derived for the recurrence time $R_n$, defined as the first time the initial $n$-block $\mathbf{X}_1^n$ reappears in $\mathbf{X}$.

We conclude by developing applications of these results to provide concise solutions to problems that stem from the analysis of the Lempel-Ziv data compression algorithm. We also consider possible applications to DNA sequence analysis.
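The central claim of the abstract can be checked numerically in the simplest setting. The following is a minimal sketch, not the paper's method: it assumes an i.i.d. binary source (a special case of a stationary ergodic process), and the function names `waiting_time` and `scaled_waiting_times`, the block length, and the number of trials are all illustrative choices. It draws an independent $n$-block $\mathbf{Z}_1^n$, scans a fresh realization of $\mathbf{X}$ for its first occurrence, and records $W_n P(\mathbf{Z}_1^n)$; if the approximation holds, the sample mean should be close to 1 and the fraction of samples exceeding 1 close to $e^{-1}$.

```python
import math
import random


def waiting_time(block, p_one, rng):
    """First index i >= 1 at which `block` starts in a fresh i.i.d. binary stream.

    Assumption: the source emits 1 with probability `p_one`, independently.
    """
    n = len(block)
    window = []  # holds the most recent (at most n) symbols of the stream
    i = 0
    while True:
        i += 1
        window.append(1 if rng.random() < p_one else 0)
        if len(window) > n:
            window.pop(0)
        if window == block:
            return i - n + 1  # starting position of the first match


def scaled_waiting_times(n=8, p_one=0.5, trials=2000, seed=0):
    """Sample W_n * P(Z_1^n) with Z drawn independently from the same source."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        block = [1 if rng.random() < p_one else 0 for _ in range(n)]
        # Probability of this particular block under the i.i.d. law.
        prob = math.prod(p_one if b else 1.0 - p_one for b in block)
        samples.append(waiting_time(block, p_one, rng) * prob)
    return samples


if __name__ == "__main__":
    samples = scaled_waiting_times()
    mean = sum(samples) / len(samples)
    tail = sum(s > 1.0 for s in samples) / len(samples)
    print(f"sample mean of W_n * P(Z): {mean:.3f}  (exp(1) mean is 1)")
    print(f"fraction exceeding 1:      {tail:.3f}  (exp(-1) is {math.exp(-1):.3f})")
```

The block length is kept small so the scan finishes quickly; larger $n$ makes each trial exponentially longer, since the typical waiting time is of order $1/P(\mathbf{Z}_1^n)$.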

Article information

Ann. Appl. Probab., Volume 9, Number 3 (1999), 780-796.

First available in Project Euclid: 21 August 2002

Primary: 60G07: General theory of processes

Recurrence times; entropy; pattern matching; Chen-Stein method; data compression; Lempel-Ziv algorithm


Wyner, Abraham J. More on recurrence and waiting times. Ann. Appl. Probab. 9 (1999), no. 3, 780--796. doi:10.1214/aoap/1029962813.
