Open Access
Distance-based and RKHS-based dependence metrics in high dimension
Changbo Zhu, Xianyang Zhang, Shun Yao, Xiaofeng Shao
Ann. Statist. 48(6): 3366-3394 (December 2020). DOI: 10.1214/19-AOS1934

Abstract

In this paper, we study distance covariance, Hilbert–Schmidt covariance (aka Hilbert–Schmidt independence criterion [In Advances in Neural Information Processing Systems (2008) 585–592]) and related independence tests in the high-dimensional setting. We show that the sample distance/Hilbert–Schmidt covariance between two random vectors can be approximated by the sum of squared componentwise sample cross-covariances up to an asymptotically constant factor, which indicates that the standard distance/Hilbert–Schmidt covariance based test can only capture linear dependence in high dimension. Under the assumption that the components within each high-dimensional vector are weakly dependent, the distance correlation based $t$ test developed by Székely and Rizzo (J. Multivariate Anal. 117 (2013) 193–213) for independence is shown to have trivial limiting power when the two random vectors are nonlinearly dependent but componentwise uncorrelated. This new and surprising phenomenon, which appears to be identified and carefully studied here for the first time, is further confirmed in our simulation study. As a remedy, we propose tests based on an aggregation of marginal sample distance/Hilbert–Schmidt covariances and show their superior power over their joint counterparts in simulations. We further extend the distance correlation based $t$ test to tests based on Hilbert–Schmidt covariance and marginal distance/Hilbert–Schmidt covariance. A novel unified approach is developed to analyze the studentized sample distance/Hilbert–Schmidt covariance as well as the studentized sample marginal distance covariance under both the null and alternative hypotheses. Our theoretical and simulation results shed light on the limitation of distance/Hilbert–Schmidt covariance when used jointly in the high-dimensional setting and suggest the aggregation of marginal distance/Hilbert–Schmidt covariance as a useful alternative.
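
To make the contrast between the joint and the marginally aggregated statistic concrete, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: it uses the U-centered (unbiased) sample squared distance covariance of Székely and Rizzo, and it reads "aggregation of marginal sample distance covariances" as the plain sum over all component pairs, with no further normalization. The function names (`u_center`, `dcov2`, `joint_dcov2`, `marginal_dcov2`) are hypothetical and are not taken from the paper.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))


def pairwise(X, dist):
    """n x n matrix of distances between the rows of X."""
    return [[dist(a, b) for b in X] for a in X]


def u_center(D):
    """U-centering of a symmetric distance matrix (needs n >= 4).

    Off-diagonal entries have scaled row, column and grand sums removed;
    the diagonal is set to zero.
    """
    n = len(D)
    row = [sum(r) for r in D]  # D is symmetric, so these are also column sums
    total = sum(row)
    return [[0.0 if i == j else
             D[i][j] - row[i] / (n - 2) - row[j] / (n - 2)
             + total / ((n - 1) * (n - 2))
             for j in range(n)] for i in range(n)]


def dcov2(A, B):
    """Unbiased sample squared distance covariance from two distance matrices."""
    n = len(A)
    At, Bt = u_center(A), u_center(B)
    s = sum(At[i][j] * Bt[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def joint_dcov2(X, Y):
    """Joint statistic: Euclidean distances between the full vectors."""
    return dcov2(pairwise(X, euclid), pairwise(Y, euclid))


def marginal_dcov2(X, Y):
    """Assumed aggregation: sum of dcov2 over all component pairs (i, j)."""
    p, q = len(X[0]), len(Y[0])
    As = [pairwise([[x[i]] for x in X], euclid) for i in range(p)]
    Bs = [pairwise([[y[j]] for y in Y], euclid) for j in range(q)]
    return sum(dcov2(A, B) for A in As for B in Bs)


if __name__ == "__main__":
    random.seed(1)
    n, p = 30, 5
    X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Nonlinear dependence with uncorrelated components: y_j = x_j^2.
    Y = [[v * v for v in x] for x in X]
    print("joint   :", joint_dcov2(X, Y))
    print("marginal:", marginal_dcov2(X, Y))
```

Because U-centering is linear, the sum over component pairs in this sketch equals `dcov2` applied to the two $\ell_1$ distance matrices, so the aggregate can be computed at the cost of a single joint statistic. The sketch only computes point estimates; the studentization and the resulting $t$-type tests described in the abstract are not reproduced here.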

Citation

Changbo Zhu, Xianyang Zhang, Shun Yao, Xiaofeng Shao. "Distance-based and RKHS-based dependence metrics in high dimension." Ann. Statist. 48 (6) 3366-3394, December 2020. https://doi.org/10.1214/19-AOS1934

Information

Received: 1 April 2019; Revised: 1 November 2019; Published: December 2020
First available in Project Euclid: 11 December 2020

Digital Object Identifier: 10.1214/19-AOS1934

Subjects:
Primary: 60K35, 62G10
Secondary: 62G20

Keywords: $\mathcal{U}$-statistics, distance covariance, high dimensionality, Hilbert–Schmidt independence criterion, independence test

Rights: Copyright © 2020 Institute of Mathematical Statistics
