The Annals of Statistics

Optimal rank-based tests for homogeneity of scatter

Marc Hallin and Davy Paindaveine
Source: Ann. Statist. Volume 36, Number 3 (2008), 1261-1298.

Abstract

We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Contrary to the existing parametric procedures, these tests remain valid without any moment assumptions and are thus perfectly robust against heavy-tailed distributions (validity robustness). Nevertheless, they reach semiparametric efficiency bounds at correctly specified elliptical densities and maintain high power under all others (efficiency robustness). In particular, their normal-score version outperforms traditional Gaussian likelihood ratio tests and their pseudo-Gaussian robustifications under a very broad range of non-Gaussian densities, including, for instance, all multivariate Student and power-exponential distributions.
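
The validity-robustness claim rests on the fact that multivariate signs and ranks ignore how heavy the radial tails are. The following is a minimal sketch of that idea only, not the authors' test statistic. It assumes observations that are already centered and standardized (known location at the origin, identity scatter), no zero vectors, and no ties among the distances. The name `signs_and_ranks` is hypothetical and introduced here purely for illustration.

```python
import math


def signs_and_ranks(points):
    """Return (signs, ranks) for already-standardized observations.

    Assumptions (illustrative only): location is the origin, scatter is the
    identity, no point is the zero vector, and no two distances are tied.
    """
    # Distance of each observation from the origin.
    dists = [math.sqrt(sum(c * c for c in p)) for p in points]
    # Multivariate sign: the unit vector pointing towards the observation.
    signs = [tuple(c / d for c in p) for p, d in zip(points, dists)]
    # Rank of each distance among all distances (1 = closest to the origin).
    order = sorted(range(len(points)), key=lambda i: dists[i])
    ranks = [0] * len(points)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return signs, ranks


sample = [(3.0, 4.0), (1.0, 0.0), (0.0, -2.0), (-6.0, 8.0)]
signs, ranks = signs_and_ranks(sample)

# Make the tails much heavier by cubing every radius (d -> d**3) while
# keeping each direction fixed: multiply each point by d**2.
stretched = [tuple(c * math.hypot(*p) ** 2 for c in p) for p in sample]
signs_heavy, ranks_heavy = signs_and_ranks(stretched)

print(ranks, ranks_heavy)
```

Because the radial stretch is monotone, both the signs and the ranks are unchanged, which suggests why a statistic built from them alone needs no moment assumptions.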

Primary Subjects: 62M15, 62G35
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1211819564
Digital Object Identifier: doi:10.1214/07-AOS508
Mathematical Reviews number (MathSciNet): MR2418657
Zentralblatt MATH identifier: 05294973

2013 © Institute of Mathematical Statistics
