Brazilian Journal of Probability and Statistics

A rank-based Cramér–von-Mises-type test for two samples

Jamye Curry, Xin Dang, and Hailin Sang

Abstract

We study a rank-based, univariate, distribution-free two-sample test. The test statistic is the difference between the average of the between-group rank distances and the average of the within-group rank distances. This statistic is closely related to the two-sample Cramér–von Mises criterion: the two are different empirical versions of the same quantity for testing the equality of two population distributions. Although they may differ for finite samples, they share the same expected value, variance, and asymptotic properties. The advantage of the new rank-based test over the classical one is the ease with which it generalizes to the multivariate case. Rather than using the empirical process approach, we provide a different, simpler proof that offers a different perspective and insight. In particular, we apply the Hájek projection and the orthogonal decomposition technique to derive the asymptotics of the proposed rank-based statistic. A numerical study compares the power of the rank formulation test with that of other commonly used nonparametric tests, and recommendations on those tests are provided. Lastly, we propose a multivariate extension of the test based on the spatial rank.
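The verbal description of the statistic in the abstract can be illustrated with a minimal sketch. The full text is not available here, so the details below are assumptions rather than the authors' definition: ranks are taken in the pooled sample, the averages include the zero self-distances, the result is divided by the pooled sample size, and the data are assumed to have no ties. The function names (`pooled_ranks`, `mean_abs_diff`, `rank_distance_statistic`, `permutation_pvalue`) are hypothetical. The p-value is calibrated by random permutation, which is a stand-in chosen for simplicity and not the asymptotic approach the abstract refers to.

```python
import random


def pooled_ranks(x, y):
    """Rank the pooled sample 1..N and split the ranks back by group.

    Assumes no ties (continuous data); ties are not handled in this sketch.
    """
    pooled = list(x) + list(y)
    order = sorted(range(len(pooled)), key=pooled.__getitem__)
    ranks = [0] * len(pooled)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks[:len(x)], ranks[len(x):]


def mean_abs_diff(a, b):
    """Average absolute difference over all pairs (u, v), u in a and v in b."""
    return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))


def rank_distance_statistic(x, y):
    """Between-group minus within-group average rank distance.

    The scaling by the pooled sample size is an assumed normalization.
    """
    rx, ry = pooled_ranks(x, y)
    between = mean_abs_diff(rx, ry)
    within = 0.5 * (mean_abs_diff(rx, rx) + mean_abs_diff(ry, ry))
    return (between - within) / (len(rx) + len(ry))


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Monte Carlo permutation p-value; large statistics count against H0."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    m = len(x)
    observed = rank_distance_statistic(x, y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point noise in the comparison.
        if rank_distance_statistic(pooled[:m], pooled[m:]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    gen = random.Random(42)
    sample_x = [gen.gauss(0.0, 1.0) for _ in range(30)]
    sample_y = [gen.gauss(1.0, 1.0) for _ in range(40)]
    print("statistic:", rank_distance_statistic(sample_x, sample_y))
    print("p-value:  ", permutation_pvalue(sample_x, sample_y))
```

Because the statistic depends on the data only through the pooled ranks, it is unchanged by any strictly increasing transformation of the observations, which is the sense in which such a test is distribution-free under the null hypothesis.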

Article information

Source
Braz. J. Probab. Stat., Volume 33, Number 3 (2019), 425-454.

Dates
Received: August 2017
Accepted: February 2018
First available in Project Euclid: 10 June 2019

Permanent link to this document
https://projecteuclid.org/euclid.bjps/1560153846

Digital Object Identifier
doi:10.1214/18-BJPS396

Mathematical Reviews number (MathSciNet)
MR3960270

Zentralblatt MATH identifier
07094811

Keywords
Cramér–von Mises criterion; Hájek projection; nonparametric test; rank; two-sample test

Citation

Curry, Jamye; Dang, Xin; Sang, Hailin. A rank-based Cramér–von-Mises-type test for two samples. Braz. J. Probab. Stat. 33 (2019), no. 3, 425–454. doi:10.1214/18-BJPS396. https://projecteuclid.org/euclid.bjps/1560153846


