Electronic Journal of Statistics

Dimension reduction for regression estimation with nearest neighbor method

Benoît Cadre and Qian Dong
Source: Electron. J. Statist. Volume 4 (2010), 436-460.

Abstract

In regression with a high-dimensional predictor vector, dimension reduction methods aim to replace the predictor with a lower-dimensional version without loss of information on the regression. In this context, the so-called central mean subspace is the key to dimension reduction. The last two decades have seen the emergence of many methods for estimating the central mean subspace. In this paper, we go one step further and study the performance of a k-nearest neighbor type estimate of the regression function, based on an estimator of the central mean subspace. In our setting, the predictor lies in ℝ^p with p fixed, i.e. p does not depend on the sample size. The estimate is first proved to be consistent. The improvement due to the dimension reduction step is then observed in terms of its rate of convergence. All the results are distribution-free. As an application, we give an explicit rate of convergence using the SIR (sliced inverse regression) method. The method is illustrated by a simulation study.
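
The two-step idea described in the abstract (estimate a reduction direction, then run a k-nearest neighbor regression on the projected predictor) can be illustrated with a minimal sketch. This is not the paper's estimator or its simulation design: it assumes a structural dimension of one, uses a textbook form of SIR (slice the response, average the centered predictors per slice, take the leading eigenvector of Σ⁻¹M by power iteration), and all function names, the cubic link, and the simulated data are hypothetical choices made for illustration. It uses only the Python standard library.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def sir_direction(xs, ys, n_slices=10, n_iter=100):
    """Leading SIR direction, assuming a one-dimensional reduction subspace."""
    n, p = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(p)]
    zs = [[x[j] - mean[j] for j in range(p)] for x in xs]  # centered predictors
    cov = [[sum(z[i] * z[j] for z in zs) / n for j in range(p)] for i in range(p)]
    order = sorted(range(n), key=lambda i: ys[i])  # slice on the sorted response
    m = [[0.0] * p for _ in range(p)]  # weighted covariance of the slice means
    for h in range(n_slices):
        idx = order[h * n // n_slices:(h + 1) * n // n_slices]
        if not idx:
            continue
        sm = [sum(zs[i][j] for i in idx) / len(idx) for j in range(p)]
        w = len(idx) / n
        for i in range(p):
            for j in range(p):
                m[i][j] += w * sm[i] * sm[j]
    v = [1.0] * p
    for _ in range(n_iter):  # power iteration on cov^{-1} m
        mv = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        v = solve(cov, mv)
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    return v


def knn_predict(x, xs, ys, direction, k):
    """Average the responses of the k sample points nearest to x after projection."""
    def proj(u):
        return sum(d * c for d, c in zip(direction, u))
    t = proj(x)
    nearest = sorted(range(len(xs)), key=lambda i: abs(proj(xs[i]) - t))[:k]
    return sum(ys[i] for i in nearest) / k


# Hypothetical single-index data: y = (beta . x)^3 + noise, with p = 4 fixed.
rng = random.Random(0)
beta = [2 / 5 ** 0.5, 1 / 5 ** 0.5, 0.0, 0.0]
xs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(400)]
ys = [sum(b * c for b, c in zip(beta, x)) ** 3 + 0.1 * rng.gauss(0, 1) for x in xs]

b_hat = sir_direction(xs, ys)
y_hat = knn_predict([0.5, 0.5, 0.0, 0.0], xs, ys, b_hat, k=20)
```

In this sketch the neighbor search runs on a one-dimensional projection instead of the full p-dimensional predictor, which is the source of the rate improvement the abstract refers to. The direction is identified only up to sign, which does not affect the nearest neighbor distances.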

Primary Subjects: 62H12, 62G08
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ejs/1272632666
Digital Object Identifier: doi:10.1214/09-EJS559
Mathematical Reviews number (MathSciNet): MR2645492

2012 © Institute of Mathematical Statistics
