The Annals of Applied Statistics

Approximate null distribution of the largest root in multivariate analysis

Iain M. Johnstone

Full-text: Open access

Abstract

The greatest root distribution occurs everywhere in classical multivariate analysis, but even under the null hypothesis the exact distribution has required extensive tables or special purpose software. We describe a simple approximation, based on the Tracy–Widom distribution, that in many cases can be used instead of tables or software, at least for initial screening. The quality of approximation is studied, and its use illustrated in a variety of settings.

Article information

Source
Ann. Appl. Stat. Volume 3, Number 4 (2009), 1616-1633.

Dates
First available in Project Euclid: 1 March 2010

Permanent link to this document
http://projecteuclid.org/euclid.aoas/1267453956

Digital Object Identifier
doi:10.1214/08-AOAS220

Zentralblatt MATH identifier
1184.62083

Mathematical Reviews number (MathSciNet)
MR2752150

Citation

Johnstone, Iain M. Approximate null distribution of the largest root in multivariate analysis. Ann. Appl. Stat. 3 (2009), no. 4, 1616-1633. doi:10.1214/08-AOAS220. http://projecteuclid.org/euclid.aoas/1267453956.


