Electronic Journal of Statistics

On inference validity of weighted U-statistics under data heterogeneity

Fang Han and Tianchen Qian

Full-text: Open access

Abstract

Motivated by challenges in studying a new correlation measure that is gaining popularity for evaluating the performance of online ranking algorithms, this manuscript explores the validity of uncertainty assessment for weighted U-statistics. Without relying on the commonly adopted assumptions, we verify the inference validity of Efron’s bootstrap and of a new resampling procedure. Specifically, in its full generality, our theory allows both kernels and weights to be asymmetric and data points to be non-identically distributed, all of which are new issues that have not previously been addressed. To achieve this strict generalization, for example, we have to carefully control the order of the “degenerate” term in U-statistics, which is no longer degenerate under the empirical measure for non-i.i.d. data. Our result applies to the motivating task, giving the region in which valid statistical inference can be made.
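To make the objects in the abstract concrete, the following is a minimal sketch, not the authors' procedure, of a degree-2 weighted U-statistic together with a naive Efron-style bootstrap. The names `weighted_u_stat`, `sign_kernel`, and `efron_bootstrap` are hypothetical, and the choice to keep weights attached to positions while resampling data points with replacement is an assumption made for illustration; the paper's exact resampling scheme and its conditions are not reproduced here.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def weighted_u_stat(
    data: Sequence[Point],
    kernel: Callable[[Point, Point], float],
    weight: Callable[[int, int], float],
) -> float:
    """Sum of weight(i, j) * kernel(x_i, x_j) over ordered pairs i != j.

    Neither the kernel nor the weight is assumed to be symmetric.
    """
    n = len(data)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += weight(i, j) * kernel(data[i], data[j])
    return total


def sign_kernel(a: Point, b: Point) -> float:
    """Kendall-type kernel: +1 if concordant, -1 if discordant, 0 on ties."""
    s = (a[0] - b[0]) * (a[1] - b[1])
    return (s > 0) - (s < 0)


def efron_bootstrap(
    data: Sequence[Point],
    kernel: Callable[[Point, Point], float],
    weight: Callable[[int, int], float],
    n_boot: int = 200,
    seed: int = 0,
) -> List[float]:
    """Naive Efron bootstrap: resample n points with replacement, recompute.

    Assumption for illustration: weights stay indexed by position in the
    resampled sequence. This is not claimed to be the paper's procedure.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(weighted_u_stat(resample, kernel, weight))
    return replicates


# Usage: with uniform weights 1 / (n (n - 1)) and the sign kernel, the
# statistic reduces to Kendall's tau.
data = [(float(i), float(i)) for i in range(5)]  # perfectly concordant pairs
n = len(data)


def uniform(i: int, j: int) -> float:
    return 1.0 / (n * (n - 1))


tau = weighted_u_stat(data, sign_kernel, uniform)
replicates = efron_bootstrap(data, sign_kernel, uniform)
```

With uniform weights the statistic is the classical Kendall's tau; replacing `uniform` with a position-dependent, possibly asymmetric weight gives the kind of weighted statistic the abstract refers to. The spread of `replicates` is what a bootstrap-based uncertainty assessment would use, subject to the validity conditions the paper studies.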

Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 2637-2708.

Dates
Received: August 2017
First available in Project Euclid: 31 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1535681029

Digital Object Identifier
doi:10.1214/18-EJS1462

Subjects
Primary: 62E20: Asymptotic distribution theory

Keywords
Weighted U-statistics; nondegeneracy; bootstrap inference; data heterogeneity; rank correlation; average-precision correlation

Rights
Creative Commons Attribution 4.0 International License.

Citation

Han, Fang; Qian, Tianchen. On inference validity of weighted U-statistics under data heterogeneity. Electron. J. Statist. 12 (2018), no. 2, 2637-2708. doi:10.1214/18-EJS1462. https://projecteuclid.org/euclid.ejs/1535681029


