## The Annals of Applied Probability

### More on recurrence and waiting times

Abraham J. Wyner

#### Abstract

Let $\mathbf{X} = \{X_n : n = 1, 2, \dots\}$ be a discrete-valued stationary ergodic process distributed according to probability law $P$. Let $\mathbf{Z}_1^n = \{Z_1, Z_2, \dots, Z_n\}$ be an independent realization of an $n$-block drawn with the same probability as $\mathbf{X}$. We consider the waiting time $W_n$, defined as the first time the $n$-block $\mathbf{Z}_1^n$ appears in $\mathbf{X}$. There are many recent results concerning this waiting time that demonstrate asymptotic properties of this random variable. In this paper, we prove that for all $n$ the random variable $W_n P(\mathbf{Z}_1^n)$ is approximately distributed as an exponential random variable with mean 1. We use a Poisson heuristic to provide a very simple intuition for this result, which is then formalized using the Chen-Stein method. We then rederive, with remarkable brevity, most of the known asymptotic results concerning $W_n$ and prove others as well. We further establish the surprising fact that for many sources $W_n P(\mathbf{Z}_1^n)$ is exp(1) even if the probability law for $\mathbf{Z}$ is not the same as that of $\mathbf{X}$. We also consider the $d$-dimensional analog of the waiting time and prove a similar result in that setting. Nearly identical results are then derived for the recurrence time $R_n$, defined as the first time the initial $n$-block $\mathbf{X}_1^n$ reappears in $\mathbf{X}$.

We conclude by developing applications of these results to provide concise solutions to problems that stem from the analysis of the Lempel-Ziv data compression algorithm. We also consider possible applications to DNA sequence analysis.
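The central claim of the abstract, that $W_n P(\mathbf{Z}_1^n)$ is approximately exponential with mean 1, can be checked numerically in the simplest case. The following is a minimal sketch, not the paper's method: it assumes an i.i.d. binary source (a special case of a stationary ergodic process), and the function names, the parameter `p_one`, the block length, and the number of trials are all illustrative choices that do not appear in the paper.

```python
import math
import random
from collections import deque


def block_probability(block, p_one):
    """P(Z_1^n) for an i.i.d. binary source with P(symbol = 1) = p_one (assumed model)."""
    ones = sum(block)
    return p_one ** ones * (1.0 - p_one) ** (len(block) - ones)


def waiting_time(block, p_one, rng, max_steps=10_000_000):
    """First start index i >= 1 such that X_i, ..., X_{i+n-1} equals the block.

    X is generated symbol by symbol, independently of the block.
    """
    n = len(block)
    target = tuple(block)
    window = deque(maxlen=n)  # holds the most recent n symbols of X
    for t in range(1, max_steps + 1):
        window.append(1 if rng.random() < p_one else 0)
        if len(window) == n and tuple(window) == target:
            return t - n + 1  # convert end position to start position
    raise RuntimeError("block not found within max_steps")


def simulate_scaled_waiting_times(n, p_one, trials, seed=0):
    """Draw Z_1^n and X independently and return samples of W_n * P(Z_1^n)."""
    rng = random.Random(seed)
    scaled = []
    for _ in range(trials):
        z = [1 if rng.random() < p_one else 0 for _ in range(n)]
        w = waiting_time(z, p_one, rng)
        scaled.append(w * block_probability(z, p_one))
    return scaled


if __name__ == "__main__":
    samples = simulate_scaled_waiting_times(n=8, p_one=0.5, trials=2000, seed=1)
    mean = sum(samples) / len(samples)
    tail = sum(s > 1.0 for s in samples) / len(samples)
    # An exp(1) variable has mean 1 and P(value > 1) = e^{-1}.
    print(f"sample mean of W_n * P(Z): {mean:.3f} (exp(1) mean is 1)")
    print(f"fraction above 1: {tail:.3f} (exp(1) tail is {math.exp(-1):.3f})")
```

For small $n$ the agreement is only rough, since blocks that overlap themselves (such as a run of identical symbols) have noticeably different waiting-time distributions; the approximation would be expected to improve as $n$ grows.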

#### Article information

Source
Ann. Appl. Probab., Volume 9, Number 3 (1999), 780-796.

Dates
First available in Project Euclid: 21 August 2002

https://projecteuclid.org/euclid.aoap/1029962813

Digital Object Identifier
doi:10.1214/aoap/1029962813

Mathematical Reviews number (MathSciNet)
MR1722282

Zentralblatt MATH identifier
0955.60031

Subjects
Primary: 60G07: General theory of processes

#### Citation

Wyner, Abraham J. More on recurrence and waiting times. Ann. Appl. Probab. 9 (1999), no. 3, 780--796. doi:10.1214/aoap/1029962813. https://projecteuclid.org/euclid.aoap/1029962813

#### References

• 1 BARBOUR, A., HOLST, S. and JANSON, L. 1992. Poisson Approximation. Oxford Univ. Press.
• 2 FARACH, M., NOORDEWEIR, M., SAVARI, S., SHEPP, L., WYNER, A. J. and ZIV, J. 1995. The entropy of DNA: algorithms and measurements based on memory and rapid convergence. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms 48-57. ACM, New York.
• 3 JACQUET, P. and SZPANKOWSKI, W. 1994. Autocorrelation on words and its applications: analysis of suffix trees by string-ruler approach. J. Combin. Theory 66 237-269.
• 4 KONTOYIANNIS, I. 1998. Asymptotic recurrence and waiting times for stationary processes. J. Theoret. Probab. 11 795-811.
• 5 ORNSTEIN, D. S. and WEISS, B. 1993. Entropy and data compression schemes. IEEE Trans. Inform. Theory 39 78-83.
• 6 ORNSTEIN, D. S. and WEISS, B. 1994. Entropy and recurrence rates for stationary random fields. Unpublished manuscript.
• 7 SHIELDS, P. C. 1992. Entropy and prefixes. Ann. Probab. 20 403-409.
• 8 SHIELDS, P. C. 1993. Waiting times: positive and negative results on the Wyner-Ziv problem. J. Theoret. Probab. 6 499-519.
• 9 SHIELDS, P. C. 1996. The Ergodic Theory of Discrete Sample Paths. Amer. Math. Soc., Providence, RI.
• 10 WYNER, A. D. and ZIV, J. 1989. Some asymptotic properties of the entropy of a stationary ergodic source with applications to data compression. IEEE Trans. Inform. Theory 35 1250-1258.
• 11 WYNER, A. D., ZIV, J. and WYNER, A. J. 1999. On the role of pattern matching in information theory. IEEE Trans. Inform. Theory 44 2045-2056.
• 12 WYNER, A. J. 1993. String matching theorems and applications to data compression and statistics. Ph.D. dissertation, Stanford Univ.
• 13 WYNER, A. J. 1996. Entropy estimation and patterns. In Proceedings of the 1996 ISIT Workshop, Haifa, Israel.
• 14 WYNER, A. J. 1997. The redundancy and distribution of phrase lengths of the FDLZ algorithm. IEEE Trans. Inform. Theory 43 1452-1464.

Philadelphia, Pennsylvania 19104-6302
E-mail: ajw@wharton.upenn.edu