Source: Ann. Statist. Volume 36, Number 3 (2008), 1261-1298.
We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Unlike the existing parametric procedures, these tests remain valid without any moment assumptions, and thus are perfectly robust against heavy-tailed distributions (validity robustness). Nevertheless, they reach semiparametric efficiency bounds at correctly specified elliptical densities and maintain high power under all elliptical densities (efficiency robustness). In particular, their normal-score version outperforms traditional Gaussian likelihood ratio tests and their pseudo-Gaussian robustifications under a very broad range of non-Gaussian densities, including, for instance, all multivariate Student and power-exponential distributions.
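To make the notions of multivariate signs and ranks concrete, the following is a minimal sketch, not the test statistic of the paper. It assumes the observations have already been centered and standardized (in practice this requires estimates of location and shape, which are not shown here), that no observation sits exactly at the center, and that there are no ties among the distances. The function name `signs_and_ranks` and the sample values are hypothetical.

```python
import math

def signs_and_ranks(points):
    """Return (signs, ranks) for points assumed already centered and standardized."""
    # Distance of each observation from the center (assumed nonzero).
    norms = [math.sqrt(sum(c * c for c in p)) for p in points]
    # Multivariate sign: the unit vector pointing in the direction of the observation.
    signs = [tuple(c / d for c in p) for p, d in zip(points, norms)]
    # Rank of each distance among all distances (1 = closest to the center; no ties assumed).
    order = sorted(range(len(points)), key=lambda i: norms[i])
    ranks = [0] * len(points)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return signs, ranks

# Hypothetical bivariate sample, and the same sample with its farthest point pushed far out.
sample = [(3.0, 4.0), (0.0, -1.0), (-6.0, 8.0)]
outlier_sample = [(3.0, 4.0), (0.0, -1.0), (-6000.0, 8000.0)]

signs, ranks = signs_and_ranks(sample)
out_signs, out_ranks = signs_and_ranks(outlier_sample)
```

Moving the most distant observation a thousand times farther out along the same direction leaves both its sign and its rank unchanged. This gives some intuition for why procedures built only on signs and ranks do not depend on moment assumptions.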