The Annals of Statistics

Measuring and testing dependence by correlation of distances

Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov


Abstract

Distance correlation is a new measure of dependence between random vectors. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but unlike the classical definition of correlation, distance correlation is zero only if the random vectors are independent. The empirical distance dependence measures are based on certain Euclidean distances between sample elements rather than sample moments, yet have a compact representation analogous to the classical covariance and correlation. Asymptotic properties and applications in testing independence are discussed. Implementation of the test and Monte Carlo results are also presented.
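As an illustration of the "compact representation" the abstract mentions, the following is a minimal sketch of the sample distance correlation, assuming the usual construction: pairwise Euclidean distance matrices for each sample are double-centered (row means and column means subtracted, grand mean added back), and the squared sample distance covariance is the average of the entrywise product of the two centered matrices. The function names `_centered_distances` and `distance_correlation` are hypothetical, chosen for this example, and the code is not the authors' implementation.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered matrix of pairwise Euclidean distances for one sample."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        # Treat a 1-D input as n observations of a scalar variable.
        x = x[:, None]
    # d[k, l] = |x_k - x_l| (Euclidean norm), shape (n, n).
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # Subtract column and row means, add back the grand mean.
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation of two samples with the same number of rows."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    if a.shape != b.shape:
        raise ValueError("x and y must contain the same number of observations")
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar_x = (a * a).mean()   # squared sample distance variance of x
    dvar_y = (b * b).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        # Convention: the statistic is 0 when either sample is constant.
        return 0.0
    # max(...) only guards against tiny negative values from rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 101)
    y = x ** 2  # dependent on x, yet linearly uncorrelated with it
    print("Pearson correlation: ", np.corrcoef(x, y)[0, 1])
    print("distance correlation:", distance_correlation(x, y))
```

On the symmetric grid used here, the Pearson correlation of `x` and `x ** 2` is essentially zero while the distance correlation is clearly positive, which is the behaviour the abstract contrasts with classical correlation. The sketch stores full n-by-n distance matrices, so it is only suitable for moderate sample sizes.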

Article information

Source
Ann. Statist. Volume 35, Number 6 (2007), 2769-2794.

Dates
First available in Project Euclid: 22 January 2008

Permanent link to this document
http://projecteuclid.org/euclid.aos/1201012979

Digital Object Identifier
doi:10.1214/009053607000000505

Mathematical Reviews number (MathSciNet)
MR2382665

Zentralblatt MATH identifier
1129.62059

Subjects
Primary: 62G10: Hypothesis testing
Secondary: 62H20: Measures of association (correlation, canonical correlation, etc.)

Keywords
Distance correlation; distance covariance; multivariate independence

Citation

Székely, Gábor J.; Rizzo, Maria L.; Bakirov, Nail K. Measuring and testing dependence by correlation of distances. Ann. Statist. 35 (2007), no. 6, 2769–2794. doi:10.1214/009053607000000505. http://projecteuclid.org/euclid.aos/1201012979.


