The Annals of Applied Statistics

DISCO analysis: A nonparametric extension of analysis of variance

Maria L. Rizzo and Gábor J. Székely

Full-text: Open access


In classical analysis of variance, dispersion is measured by considering squared distances of sample elements from the sample mean. We consider a measure of dispersion for univariate or multivariate response based on all pairwise distances between sample elements, and derive an analogous distance components (DISCO) decomposition for powers of distance in (0, 2]. The ANOVA F statistic is obtained when the index (exponent) is 2. For each index in (0, 2), this decomposition determines a nonparametric test for the multi-sample hypothesis of equal distributions that is statistically consistent against general alternatives.
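The decomposition described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`mean_pairwise`, `disco`, `permutation_pvalue`) are hypothetical, distances are Euclidean, the between-sample component is obtained as total minus within, and the test is calibrated by a simple label-permutation scheme chosen here for illustration.

```python
import math
import random


def mean_pairwise(a, b, alpha):
    """Mean of |x - y|^alpha over all pairs x in a, y in b (Euclidean distance)."""
    total = sum(math.dist(x, y) ** alpha for x in a for y in b)
    return total / (len(a) * len(b))


def disco(groups, alpha=1.0):
    """Return (total, within, between, F) for a list of samples.

    Each sample is a list of equal-length tuples, one tuple per observation.
    The index alpha is assumed to lie in (0, 2].
    """
    pooled = [x for g in groups for x in g]
    n, k = len(pooled), len(groups)
    total = n / 2 * mean_pairwise(pooled, pooled, alpha)
    within = sum(len(g) / 2 * mean_pairwise(g, g, alpha) for g in groups)
    between = total - within  # assumes the decomposition total = between + within
    f_ratio = (between / (k - 1)) / (within / (n - k))
    return total, within, between, f_ratio


def permutation_pvalue(groups, alpha=1.0, replicates=199, seed=0):
    """Approximate p-value by permuting group labels (illustrative scheme)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    observed = disco(groups, alpha)[3]
    exceed = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        start, permuted = 0, []
        for size in sizes:
            permuted.append(pooled[start:start + size])
            start += size
        if disco(permuted, alpha)[3] >= observed:
            exceed += 1
    return (1 + exceed) / (1 + replicates)


# Two small univariate samples, written as 1-tuples.
samples = [[(1.0,), (2.0,), (3.0,)], [(2.0,), (4.0,), (6.0,)]]
total, within, between, f_ratio = disco(samples, alpha=2.0)
p_value = permutation_pvalue(samples, alpha=1.0)
```

With index 2 the three components for `samples` agree, up to floating-point rounding, with the classical total, within, and between sums of squares (16, 10, and 6), so `f_ratio` is the usual ANOVA F of 2.4. With an index in (0, 2) the same ratio serves as the statistic of the permutation test.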

Article information

Ann. Appl. Stat., Volume 4, Number 2 (2010), 1034–1055.

First available in Project Euclid: 3 August 2010


Keywords: Distance components, DISCO, multisample problem, test equal distributions, multivariate, nonparametric MANOVA extension


Rizzo, Maria L.; Székely, Gábor J. DISCO analysis: A nonparametric extension of analysis of variance. Ann. Appl. Stat. 4 (2010), no. 2, 1034–1055. doi:10.1214/09-AOAS245.


