Abstract
Emerging high-dimensional data sets often contain many nontrivial relationships, and, at modern sample sizes, screening these using an independence test can sometimes yield too many relationships to be a useful exploratory approach. We propose a framework to address this limitation centered around a property of measures of dependence called equitability. Given some measure of relationship strength, an equitable measure of dependence is one that assigns similar scores to equally strong relationships of different types. We formalize equitability within a semiparametric inferential framework in terms of interval estimates of relationship strength, and we then use the correspondence of these interval estimates to hypothesis tests to show that equitability is equivalent under moderate assumptions to requiring that a measure of dependence yield well-powered tests not only for distinguishing nontrivial relationships from trivial ones but also for distinguishing stronger relationships from weaker ones. We then show that equitability, to the extent it is achieved, implies that a statistic will be well powered to detect all relationships of a certain minimal strength, across different relationship types in a family. Thus, equitability is a strengthening of power against independence that enables exploration of data sets with a small number of strong, interesting relationships and a large number of weaker, less interesting ones.
Citation
Yakir A. Reshef, David N. Reshef, Pardis C. Sabeti, Michael Mitzenmacher. "Equitability, Interval Estimation, and Statistical Power." Statist. Sci. 35 (2) 202-217, May 2020. https://doi.org/10.1214/19-STS719
Information