Electronic Journal of Statistics

A nonparametric multivariate multisample test based on data depth

Shojaeddin Chenouri and Christopher G. Small

Abstract

In this paper, we construct a family of nonparametric multivariate multisample tests based on depth rankings. These tests are of Kruskal-Wallis type in the sense that the samples are variously ordered. However, unlike the Kruskal-Wallis test, these tests are based upon a depth ranking using a statistical depth function such as the halfspace depth or the Mahalanobis depth. The types of tests we propose are adapted to the depth function that is most appropriate for the application. Under the null hypothesis that all samples come from the same distribution, we show that the test statistic asymptotically has a chi-square distribution. Some comparisons of power are made with the Hotelling T² test and the test of Choi and Marden (1997). Our test is particularly recommended when the data are of unknown distribution type and there is some evidence that the density contours are not elliptical. However, when the data are normally distributed, we often obtain high relative power.
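
The general idea of a Kruskal-Wallis-type test on depth ranks can be illustrated with a minimal sketch. It is not the authors' exact construction, which the abstract does not spell out. The sketch assumes the Mahalanobis depth 1 / (1 + d²), where d² is the squared Mahalanobis distance from the reference mean, and by default uses the pooled data as the reference sample. The function names `mahalanobis_depth` and `depth_kruskal_wallis` and the `reference` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` relative to the `reference` sample."""
    center = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))
    diff = points - center
    # Squared Mahalanobis distance d^2 = diff^T cov^{-1} diff, computed row by row.
    d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return 1.0 / (1.0 + d2)


def depth_kruskal_wallis(samples, reference=None):
    """Kruskal-Wallis-type statistic computed on depth ranks of the pooled data.

    `samples` is a list of (n_i, d) arrays. `reference` is the sample that
    defines the depth function; the pooled data are used when it is None
    (an assumption of this sketch, not taken from the paper).
    Returns (H, df), where H would be compared with a chi-square
    distribution on df = k - 1 degrees of freedom.
    """
    pooled = np.vstack(samples)
    n_total = pooled.shape[0]
    if reference is None:
        reference = pooled
    depths = mahalanobis_depth(pooled, reference)
    # Ordinal ranks 1..N; ties have probability zero for continuous data.
    ranks = np.empty(n_total)
    ranks[np.argsort(depths)] = np.arange(1, n_total + 1)
    h_stat, start = 0.0, 0
    for sample in samples:
        n_i = sample.shape[0]
        mean_rank = ranks[start:start + n_i].mean()
        # Squared deviation of the group mean rank from the overall mean rank.
        h_stat += n_i * (mean_rank - (n_total + 1) / 2.0) ** 2
        start += n_i
    h_stat *= 12.0 / (n_total * (n_total + 1))
    return h_stat, len(samples) - 1
```

Calling `depth_kruskal_wallis([x1, x2, x3])` on three arrays with the same number of columns returns the statistic and its degrees of freedom. Another depth function, such as the halfspace depth, could be substituted for `mahalanobis_depth` without changing the ranking step.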

Article information

Source
Electron. J. Statist., Volume 6 (2012), 760-782.

Dates
First available in Project Euclid: 9 May 2012

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1336568105

Digital Object Identifier
doi:10.1214/12-EJS692

Mathematical Reviews number (MathSciNet)
MR2988428

Zentralblatt MATH identifier
1336.62138

Keywords
Data depth; multivariate nonparametric tests; Kruskal-Wallis test; depth-depth plot

Citation

Chenouri, Shojaeddin; Small, Christopher G. A nonparametric multivariate multisample test based on data depth. Electron. J. Statist. 6 (2012), 760–782. doi:10.1214/12-EJS692. https://projecteuclid.org/euclid.ejs/1336568105

References

  • Barnett, V. (1976). The ordering of multivariate data. J. Roy. Statist. Soc. Ser. A, 138:318–344.
  • Bennett, B. M. (1962). On multivariate sign tests. J. Roy. Statist. Soc., 24:159–161.
  • Bickel, P. J. (1965). On some asymptotically non-parametric competitors of Hotelling’s T². Ann. Math. Statist., 36:160–173.
  • Blumen, I. (1958). A new bivariate sign test. J. Amer. Statist. Assoc., 53:448–456.
  • Brown, B. M. (1983). Statistical use of spatial median. J. Roy. Statist. Soc., 45:23–30.
  • Brown, B. M. and Hettmansperger, T. P. (1987). Affine invariant rank methods and the bivariate location model. J. Roy. Statist. Soc., 49:301–310.
  • Brown, B. M. and Hettmansperger, T. P. (1989). An affine invariant version of the sign test. J. Roy. Statist. Soc., 51:117–125.
  • Chakraborty, B., Chaudhuri, P., and Oja, H. (1998). Operating transformation retransformation on spatial median and angle test. Statist. Sinica, 8:767–784.
  • Chatterjee, S. K. (1966). A bivariate sign test for location. Ann. Math. Statist., pages 1771–1780.
  • Chaudhuri, P. and Sengupta, D. (1993). Sign tests in multidimension: inference based on the geometry of the data cloud. J. Amer. Statist. Assoc., 88:1363–1370.
  • Chenouri, S. (2004). Multivariate robust nonparametric inference based on data depth. PhD thesis, University of Waterloo, Waterloo, ON, Canada.
  • Chenouri, S., Small, C. G., and Farrar, T. J. (2011). Data depth-based nonparametric scale tests. Canad. J. Statist., 39:356–369.
  • Choi, K. and Marden, J. (1997). An approach to multivariate rank tests in multivariate analysis of variance. J. Amer. Statist. Assoc., 92:1581–1590.
  • Dietz, E. J. (1982). Bivariate nonparametric tests for the one-sample location problem. J. Amer. Statist. Assoc., 77:163–169.
  • Donoho, D. (1982). Breakdown properties of multivariate location estimators. PhD qualifying paper, Harvard University, Boston.
  • Donoho, D. L. and Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Statist., 20:1803–1827.
  • Gower, J. C. (1974). Algorithm AS 78: The mediancentre. App. Statist., 23:466–470.
  • Hettmansperger, T. P., Möttönen, J., and Oja, H. (1997). Affine invariant multivariate one sample signed rank test. J. Amer. Statist. Assoc., 92:1591–1600.
  • Hettmansperger, T. P., Möttönen, J., and Oja, H. (1998). Affine invariant multivariate rank tests for several samples. Statist. Sinica, 8:765–800.
  • Hettmansperger, T. P., Nyblom, J., and Oja, H. (1994). Affine invariant multivariate one sample sign tests. J. Roy. Statist. Soc. Ser. B, 56:221–234.
  • Hettmansperger, T. P. and Oja, H. (1994). Affine invariant multivariate multisample sign tests. J. Roy. Statist. Soc. Ser. B, 56:235–249.
  • Hodges, J. L. (1955). A bivariate sign test. Ann. Math. Statist., 26:523–527.
  • Hollander, M. and Wolfe, D. (1999). Nonparametric Statistical Methods. John Wiley, New York.
  • Hössjer, O. and Croux, C. (1995). Generalizing univariate signed rank statistics for testing and estimating a multivariate location parameter. J. Nonparametric Statist., 4:293–308.
  • Hotelling, H. (1951). A generalized t test and measure of multivariate dispersion. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pages 23–41.
  • Johnson, N. L. and Kotz, S. (1972). Distributions in statistics: continuous multivariate distributions. John Wiley and Sons, New York.
  • Koshevoy, G. (2001). Projections of lift zonoids, the Oja depth and the Tukey depth. Unpublished manuscript.
  • Kruskal, W. H. (1952). A nonparametric test for the several sample problem. Ann. Math. Statist., 23:525–540.
  • Kruskal, W. H. and Wallis, W. A. (1952). Use of ranks in one criterion variance analysis. J. Amer. Statist. Assoc., 47:583–621.
  • Lawley, D. N. (1938). A generalization of Fisher’s z-test. Biometrika, 30:180–187.
  • Lehmann, E. and D’Abrera, H. (2006). Nonparametrics: statistical methods based on ranks. Springer, New York.
  • Liu, R. and Singh, K. (2006). Rank tests for multivariate scale difference based on data depth. Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications, DIMACS Series, AMS, pages 17–36.
  • Liu, R. Y., Parelius, J. M., and Singh, K. (1999). Multivariate analysis by data depth: Descriptive statistics, graphics and inference (with discussion). Ann. Statist., 27:783–858.
  • Liu, R. Y. and Singh, K. (1993). A quality index based on data depth and multivariate rank tests. J. Amer. Statist. Assoc., 88:252–260.
  • Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Acad. India, 12:49–55.
  • Möttönen, J., Hüsler, J., and Oja, H. (2003). Multivariate nonparametric tests in a randomized complete block design. J. Multiv. Analysis, 85:106–129.
  • Möttönen, J. and Oja, H. (1995). Multivariate spatial sign and rank methods. J. Nonparametric Statist., 5:201–213.
  • Oja, H. (1983). Descriptive statistics for multivariate distributions. Statist. Prob. Letters, 1:327–333.
  • Oja, H. (1999). Affine invariant multivariate sign and rank tests and corresponding estimates: a review. Scand. J. Statist., 26:319–343.
  • Oja, H. and Nyblom, J. (1989). Bivariate sign tests. J. Amer. Statist. Assoc., 84:249–259.
  • Peters, D. and Randles, R. H. (1990). A multivariate signed-ranked test for the one-sample location problem. J. Amer. Statist. Assoc., 85:552–557.
  • Peters, D. and Randles, R. H. (1991). A bivariate signed rank test for the two-sample location problem. J. Roy. Statist. Soc. Ser. B, 53:493–504.
  • Puri, M. L. and Sen, P. K. (1971). Nonparametric methods in multivariate analysis. John Wiley and Sons, New York.
  • Randles, R. H. (1989). A distribution-free multivariate sign test based on interdirections. J. Amer. Statist. Assoc., 84:1045–1050.
  • Randles, R. H. (2000). A simpler, affine-invariant, multivariate, distribution-free sign test. J. Amer. Statist. Assoc., 95:1263–1268.
  • Randles, R. H. and Peters, D. (1990). Multivariate rank tests for the two sample location problem. Comm. Statist. Theory Methods, 19:4225–4238.
  • Rao, C. R. (1988). Methodology based on the L1 norm in statistical inference. Sankhyā Ser. A, 50:289–313.
  • Rousseeuw, P. J. (1983). Multivariate estimation with high breakdown point. Proc. of the 4th Pannonian Symp.
  • Rousseeuw, P. J. and Leroy, A. (1987). Robust Regression and Outlier Detection. Wiley, New York.
  • Rousseeuw, P. J. and Ruts, I. (1996). Bivariate location depth. Applied Statistics, 45:519–526.
  • Rousseeuw, P. J. and Ruts, I. (1998). Constructing the bivariate Tukey median. Statist. Sinica, 8:828–839.
  • Rousseeuw, P. J. and Ruts, I. (1999). The depth function of a population distribution. Metrika, 49:213–244.
  • Rousseeuw, P. J. and Struyf, A. (1998). Computing location depth and regression depth in higher dimensions. Statist. Comput., 8:193–203.
  • Ruts, I. and Rousseeuw, P. J. (1996). Computing depth contours of bivariate point clouds. Comput. Statist. Data Analysis, 23:153–168.
  • Small, C. G. (1987). Measures of centrality for multivariate and directional distributions. Canad. J. Statist., 15:31–39.
  • Small, C. G. (1990). A survey of multidimensional medians. Intern. Statist. Inst. Rev., 58:263–277.
  • Struyf, A. and Rousseeuw, P. J. (1999). Halfspace depth and regression depth characterize the empirical distribution. J. Multiv. Statist. Analysis, 69:135–153.
  • Tukey, J. W. (1975). Mathematics and picturing data. Proc. Intern. Congr. Math., 2:523–531.
  • Um, Y. and Randles, R. H. (1998). Nonparametric tests for the multivariate multisample location problem. Statist. Sinica, 8:801–812.
  • Zuo, Y. and He, X. (2006). On the limiting distributions of multivariate depth-based rank sum statistics and related tests. Ann. Statist., 34:2879–2896.
  • Zuo, Y. and Serfling, R. (2000). General notions of statistical depth function. Ann. Statist., 28:461–482.