Source: Electron. J. Statist. Volume 4 (2010), 436-460.
In regression with a high-dimensional predictor vector, dimension reduction methods aim to replace the predictor with a lower-dimensional version without loss of information on the regression. In this context, the so-called central mean subspace is the key to dimension reduction. The last two decades have seen the emergence of many methods for estimating the central mean subspace. In this paper, we go one step further and study the performance of a k-nearest neighbor type estimate of the regression function, based on an estimator of the central mean subspace. In our setting, the predictor lies in ℝ^p with fixed p, i.e. p does not depend on the sample size. The estimate is first proved to be consistent. The improvement due to the dimension reduction step is then observed in terms of its rate of convergence. All the results are distribution-free. As an application, we give an explicit rate of convergence using the sliced inverse regression (SIR) method. The method is illustrated by a simulation study.
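To make the two-step idea concrete, the following is a minimal Python sketch, assuming NumPy only: estimate a few directions with SIR, project the predictor onto them, then run a plain k-nearest neighbor average in the reduced space. The function names (`sir_directions`, `knn_regress`, `reduced_knn`), the equal-size slicing scheme and the default number of slices are illustrative assumptions, not the authors' implementation; the abstract does not specify these details, and the estimator analyzed in the paper may differ.

```python
import numpy as np


def sir_directions(X, y, d, n_slices=10):
    """Estimate d directions by a basic version of sliced inverse regression.

    Assumes more observations than predictors and a nonsingular
    sample covariance matrix.
    """
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Inverse square root of the covariance via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt  # standardized predictors
    # Slice the sample along the order of y into roughly equal groups.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # eigh returns eigenvalues in ascending order: keep the last d vectors.
    _, vecs = np.linalg.eigh(M)
    eta = vecs[:, ::-1][:, :d]
    # Map the directions back to the original scale of X.
    return inv_sqrt @ eta  # shape (p, d)


def knn_regress(X_train, y_train, X_new, k):
    """Average the responses of the k nearest training points."""
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def reduced_knn(X_train, y_train, X_new, d, k, n_slices=10):
    """k-NN regression after projecting onto the estimated directions."""
    B = sir_directions(X_train, y_train, d, n_slices)
    return knn_regress(X_train @ B, y_train, X_new @ B, k)
```

In this sketch the neighbors are searched in dimension d rather than p, which is the mechanism behind the improved rate of convergence mentioned above. Note that SIR is known to miss directions when the link function is symmetric about the origin, so the choice of method for the first step matters in practice.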