Open Access
A nearest neighbor estimate of the residual variance
Luc Devroye, László Györfi, Gábor Lugosi, Harro Walk
Electron. J. Statist. 12(1): 1752-1778 (2018). DOI: 10.1214/18-EJS1438

Abstract

We study the problem of estimating the smallest achievable mean-squared error in regression function estimation. The problem is equivalent to estimating the second moment of the regression function of $Y$ on $X\in\mathbb{R}^{d}$. We introduce a nearest-neighbor-based estimate and obtain a normal limit law for the estimate when $X$ has an absolutely continuous distribution, without any condition on the density. We also compute the asymptotic variance explicitly and derive a non-asymptotic bound on the variance that does not depend on the dimension $d$. The asymptotic variance does not depend on the smoothness of the density of $X$ or of the regression function. A non-asymptotic exponential concentration inequality is also proved. We illustrate the use of the new estimate by testing whether a component of the vector $X$ carries information for predicting $Y$.
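The quantity being estimated is the residual variance $L^{*}=\mathbb{E}[(Y-\mathbb{E}[Y|X])^{2}]=\mathbb{E}[Y^{2}]-\mathbb{E}[m(X)^{2}]$ with $m(x)=\mathbb{E}[Y|X=x]$, so the problem reduces to estimating the second moment of the regression function. The following Python sketch illustrates a 1-nearest-neighbor estimate of this kind, taking $\frac{1}{n}\sum_{i=1}^{n}Y_{i}Y_{N(i)}$ as the second-moment estimate, where $N(i)$ is the index of the nearest neighbor of $X_{i}$ among the other sample points; the function name and implementation details are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np
from scipy.spatial import cKDTree

def residual_variance_1nn(X, Y):
    """Sketch of a 1-NN estimate of the residual variance E[(Y - E[Y|X])^2].

    Writes L* = E[Y^2] - E[m(X)^2] and estimates the second moment of the
    regression function m by (1/n) * sum_i Y_i * Y_{N(i)}, where N(i) is the
    index of the nearest neighbor of X_i among the other sample points.
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)
    # Query k=2 neighbors: the first hit is the point itself (distance 0),
    # so the second column holds the nearest neighbor among the rest.
    _, idx = cKDTree(X).query(X, k=2)
    second_moment = np.mean(Y * Y[idx[:, 1]])
    return np.mean(Y ** 2) - second_moment

# Hypothetical illustration of the dimension-reduction application: compare
# the estimate from the full covariate vector with the estimate obtained
# after deleting one component.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
Y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n)

full = residual_variance_1nn(X, Y)
without_noise = residual_variance_1nn(np.delete(X, 2, axis=1), Y)   # irrelevant coordinate
without_signal = residual_variance_1nn(np.delete(X, 0, axis=1), Y)  # informative coordinate
# Dropping the irrelevant third coordinate leaves the estimate essentially
# unchanged, while dropping an informative one inflates it.
```

A formal test along these lines would compare the two estimates against a threshold calibrated by the normal limit law established in the paper; the simulated data above are purely for illustration.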

Citation

Luc Devroye, László Györfi, Gábor Lugosi, Harro Walk. "A nearest neighbor estimate of the residual variance." Electron. J. Statist. 12(1): 1752-1778, 2018. https://doi.org/10.1214/18-EJS1438

Information

Received: 1 June 2017; Published: 2018
First available in Project Euclid: 6 June 2018

zbMATH: 06886383
MathSciNet: MR3811758
Digital Object Identifier: 10.1214/18-EJS1438

Subjects:
Primary: 62G08
Secondary: 62G20

Keywords: asymptotic normality, concentration inequalities, dimension reduction, nearest-neighbor-based estimate, regression functional
