Estimating beta-mixing coefficients via histograms

Daniel J. McDonald; Cosma Rohilla Shalizi; Mark Schervish

doi:10.1214/15-EJS1094

Electron. J. Statist. 9(2): 2855-2883 (2015). DOI: 10.1214/15-EJS1094

Abstract

The literature on statistical learning for time series often assumes asymptotic independence or “mixing” of the data-generating process. These mixing assumptions are never tested, nor are there methods for estimating mixing coefficients from data. Additionally, for many common classes of processes (Markov processes, ARMA processes, etc.) general functional forms for various mixing rates are known, but not specific coefficients. We present the first estimator for beta-mixing coefficients based on a single stationary sample path and show that it is risk consistent. Since mixing rates depend on infinite-dimensional dependence, we use a Markov approximation based on only a finite memory length $d$. We present convergence rates for the Markov approximation and show that as $d\rightarrow\infty$, the Markov approximation converges to the true mixing coefficient. Our estimator is constructed using $d$-dimensional histogram density estimates. Allowing asymptotics in the bandwidth as well as the dimension, we prove $L^{1}$ concentration for the histogram as an intermediate step. Simulations wherein the mixing rates are calculable and a real-data example demonstrate our methodology.
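As a rough illustration of the idea in the abstract, the sketch below computes a plug-in estimate of a lag-$a$ beta-mixing coefficient from a single sample path using histograms over blocks of length $d$. It is not the authors' implementation. The function name `beta_hat`, the equal-width binning, the use of the joint table's own marginals, and the default parameters are all assumptions made for this example. It relies on the standard characterization of the coefficient as half the $L^{1}$ distance between the joint density of a past block and a future block and the product of their marginals; for histograms on a common grid, that integral reduces to a sum over cells.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def beta_hat(x, a, d=1, bins=5):
    """Hypothetical plug-in histogram estimate of a lag-a beta-mixing
    coefficient, using blocks of length d (the finite memory length)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = n - 2 * d - a + 2  # number of (past block, future block) pairs
    if m <= 0:
        raise ValueError("series too short for this choice of a and d")

    # Assumption: equal-width bins over the observed range of the series.
    edges = np.linspace(x.min(), x.max(), bins + 1)
    codes = np.digitize(x, edges[1:-1])  # integer bin labels in 0..bins-1

    # Encode each length-d block of bin labels as one integer cell index.
    windows = sliding_window_view(codes, d)        # shape (n - d + 1, d)
    cell = windows @ (bins ** np.arange(d))        # values in 0..bins**d - 1

    past = cell[:m]                                # blocks ending at time t
    future = cell[d - 1 + a : d - 1 + a + m]       # blocks starting at t + a

    # Joint histogram of (past, future) cells, normalised to probabilities.
    k = bins ** d
    joint = np.zeros((k, k))
    np.add.at(joint, (past, future), 1.0)
    joint /= m

    # Assumption: marginals are taken from the joint table itself.
    p_past = joint.sum(axis=1)
    p_future = joint.sum(axis=0)

    # Half the L1 distance between the joint and the product of marginals.
    return 0.5 * np.abs(joint - np.outer(p_past, p_future)).sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    iid = rng.normal(size=5000)
    print("iid sample, lag 1:", beta_hat(iid, a=1))
```

On an independent sample the estimate should be close to zero, up to the positive bias of the histogram; on a strongly dependent series it should be visibly larger. The number of cells grows as `bins ** d`, so in this sketch `d` and `bins` have to stay small relative to the sample size.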

Daniel J. McDonald, Cosma Rohilla Shalizi, and Mark Schervish "Estimating beta-mixing coefficients via histograms," Electronic Journal of Statistics 9(2), 2855-2883, (2015). https://doi.org/10.1214/15-EJS1094

Received: 1 December 2014; Published: 2015

JOURNAL ARTICLE
29 PAGES
