## Annals of Statistics

### Rare-event analysis for extremal eigenvalues of white Wishart matrices

#### Abstract

In this paper, we consider the extreme behavior of the extremal eigenvalues of white Wishart matrices, which play an important role in multivariate analysis. In particular, we focus on the case where the feature dimension $p$ is much larger than or comparable to the number of observations $n$, a common situation in modern data analysis. We provide asymptotic approximations and bounds for the tail probabilities of the extremal eigenvalues. Moreover, we construct efficient Monte Carlo simulation algorithms to compute these tail probabilities. Simulation results show that our method has the best performance among known approximation approaches and provides an efficient and accurate way to evaluate the tail probabilities in practice.
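
To make the quantity of interest concrete, the following is a minimal sketch of a crude (naive) Monte Carlo estimate of the tail probability $P(\lambda_{\max} > x)$ for a white Wishart matrix. It is not the efficient algorithm constructed in the paper. The setup is an assumption made for illustration: the matrix is formed as $X^\top X$ with $X$ an $n \times p$ array of i.i.d. standard normal entries, and the function names, sample size, and thresholds (multiples of $(\sqrt{n} + \sqrt{p})^2$) are hypothetical choices.

```python
import numpy as np


def largest_eigenvalue_samples(n, p, num_samples, rng):
    """Sample the largest eigenvalue of X^T X, where X is n x p with
    i.i.d. N(0, 1) entries (assumed white Wishart setup)."""
    samples = np.empty(num_samples)
    for i in range(num_samples):
        x = rng.standard_normal((n, p))
        # The largest eigenvalue of X^T X is the squared largest singular
        # value of X; np.linalg.svd returns singular values in descending order.
        singular_values = np.linalg.svd(x, compute_uv=False)
        samples[i] = singular_values[0] ** 2
    return samples


def crude_tail_estimate(samples, threshold):
    """Naive Monte Carlo estimate of P(lambda_max > threshold) and its
    standard error, from the fraction of samples exceeding the threshold."""
    hits = (samples > threshold).astype(float)
    estimate = hits.mean()
    std_error = hits.std(ddof=1) / np.sqrt(hits.size)
    return estimate, std_error


rng = np.random.default_rng(0)
n, p = 10, 50  # illustrative sizes with p larger than n
samples = largest_eigenvalue_samples(n, p, 2000, rng)

# Illustrative thresholds: multiples of (sqrt(n) + sqrt(p))^2.
edge = (np.sqrt(n) + np.sqrt(p)) ** 2
results = []
for factor in (0.8, 1.0, 1.2):
    est, se = crude_tail_estimate(samples, factor * edge)
    results.append((factor, est, se))
    print(f"threshold = {factor:.1f} * edge: estimate = {est:.4f}, std. error = {se:.4f}")
```

As the threshold moves further into the tail, fewer and fewer samples exceed it, so the relative error of this naive estimator grows quickly. This is the difficulty that motivates rare-event simulation methods such as importance sampling.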

#### Article information

Source
Ann. Statist., Volume 45, Number 4 (2017), 1609–1637.

Dates
Revised: July 2016
First available in Project Euclid: 28 June 2017

https://projecteuclid.org/euclid.aos/1498636868

Digital Object Identifier
doi:10.1214/16-AOS1502

Mathematical Reviews number (MathSciNet)
MR3670190

Zentralblatt MATH identifier
1377.65013

#### Citation

Jiang, Tiefeng; Leder, Kevin; Xu, Gongjun. Rare-event analysis for extremal eigenvalues of white Wishart matrices. Ann. Statist. 45 (2017), no. 4, 1609–1637. doi:10.1214/16-AOS1502. https://projecteuclid.org/euclid.aos/1498636868

#### Supplemental materials

• Supplement to “Rare-event analysis for extremal eigenvalues of white Wishart matrices.” The online Supplementary Material contains proofs of the technical lemmas (Lemmas 1–9) and of Theorem 3.