The Annals of Applied Probability

The asymptotic distribution and Berry–Esseen bound of a new test for independence in high dimension with an application to stochastic optimization

Wei-Dong Liu, Zhengyan Lin, and Qi-Man Shao

Full-text: Open access

Abstract

Let $X_1, \dots, X_n$ be a random sample from a $p$-dimensional population distribution. Assume that $c_1 n^{\alpha} \le p \le c_2 n^{\alpha}$ for some positive constants $c_1$, $c_2$ and $\alpha$. In this paper we introduce a new statistic for testing independence of the $p$-variates of the population and prove that the limiting distribution is the extreme distribution of type I with a rate of convergence $O((\log n)^{5/2}/\sqrt{n})$. This is much faster than $O(1/\log n)$, a typical convergence rate for this type of extreme distribution. A simulation study and application to stochastic optimization are discussed.
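As a rough numerical illustration of the two rates quoted in the abstract, the following sketch evaluates $(\log n)^{5/2}/\sqrt{n}$ and $1/\log n$ at a few sample sizes. The function names `new_rate` and `typical_rate` are hypothetical, and the constants hidden in the $O(\cdot)$ notation are set to 1, which is an assumption made purely for illustration and is not taken from the paper.

```python
import math


def new_rate(n: float) -> float:
    """(log n)^(5/2) / sqrt(n): the rate quoted in the abstract, with the O(.) constant assumed to be 1."""
    return math.log(n) ** 2.5 / math.sqrt(n)


def typical_rate(n: float) -> float:
    """1 / log n: the typical rate quoted in the abstract, with the O(.) constant assumed to be 1."""
    return 1.0 / math.log(n)


if __name__ == "__main__":
    # Print both rates and their ratio for increasing sample sizes.
    for k in (3, 6, 9, 12, 15):
        n = 10.0 ** k
        ratio = new_rate(n) / typical_rate(n)
        print(f"n = 1e{k:<2d}  new = {new_rate(n):.3e}  typical = {typical_rate(n):.3e}  ratio = {ratio:.3e}")
```

With the constants set to 1, the ratio of the two rates falls below 1 only for fairly large $n$, since it equals $(\log n)^{7/2}/\sqrt{n}$; it then tends to 0. The comparison in the abstract should therefore be read as an asymptotic statement.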

Article information

Source
Ann. Appl. Probab., Volume 18, Number 6 (2008), 2337-2366.

Dates
First available in Project Euclid: 26 November 2008

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1227708921

Digital Object Identifier
doi:10.1214/08-AAP527

Mathematical Reviews number (MathSciNet)
MR2474539

Zentralblatt MATH identifier
1154.60021

Subjects
Primary: 60F05: Central limit and other weak theorems
Secondary: 62F05: Asymptotic properties of tests

Keywords
Independence test; extreme distribution; Berry–Esseen bound; correlation matrices; stochastic optimization

Citation

Liu, Wei-Dong; Lin, Zhengyan; Shao, Qi-Man. The asymptotic distribution and Berry–Esseen bound of a new test for independence in high dimension with an application to stochastic optimization. Ann. Appl. Probab. 18 (2008), no. 6, 2337--2366. doi:10.1214/08-AAP527. https://projecteuclid.org/euclid.aoap/1227708921


