The Annals of Statistics

Kernel dimension reduction in regression

Kenji Fukumizu, Francis R. Bach, and Michael I. Jordan

Source: Ann. Statist. Volume 37, Number 4 (2009), 1871-1905.

Abstract

We present a new methodology for sufficient dimension reduction (SDR). Our methodology derives directly from the formulation of SDR in terms of the conditional independence of the covariate X from the response Y, given the projection of X on the central subspace [cf. J. Amer. Statist. Assoc. 86 (1991) 316–342 and Regression Graphics (1998) Wiley]. We show that this conditional independence assertion can be characterized in terms of conditional covariance operators on reproducing kernel Hilbert spaces and we show how this characterization leads to an M-estimator for the central subspace. The resulting estimator is shown to be consistent under weak conditions; in particular, we do not have to impose linearity or ellipticity conditions of the kinds that are generally invoked for SDR methods. We also present empirical results showing that the new methodology is competitive in practice.
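As a rough illustration of how a characterization through conditional covariance operators can become an M-estimator, the sketch below evaluates a regularized trace objective for a candidate projection matrix B. The abstract does not spell out the objective, so its form here, Tr[G_Y (G_{B^T X} + n eps I)^{-1}] with centered Gaussian Gram matrices, is an assumption, as are the function names `centered_gram` and `kdr_objective`, the kernel bandwidths, and the regularization constant `eps`.

```python
import numpy as np


def centered_gram(Z, sigma):
    """Return the centered Gaussian Gram matrix H K H for the rows of Z."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]  # treat a vector as n samples of a scalar
    sq = np.sum(Z ** 2, axis=1)
    # Pairwise squared distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H


def kdr_objective(B, X, Y, sigma_x=1.0, sigma_y=1.0, eps=1e-3):
    """Assumed objective: Tr[G_Y (G_{B^T X} + n * eps * I)^{-1}].

    B is a (p, d) matrix whose columns span the candidate subspace;
    smaller values indicate that B^T X retains more information about Y.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    G_z = centered_gram(X @ B, sigma_x)  # Gram matrix of the projected covariate
    G_y = centered_gram(Y, sigma_y)      # Gram matrix of the response
    # trace(A^{-1} G_y) equals trace(G_y A^{-1}); solve avoids an explicit inverse.
    return float(np.trace(np.linalg.solve(G_z + n * eps * np.eye(n), G_y)))
```

In use, one would minimize `kdr_objective` over matrices B with orthonormal columns, for example by gradient descent with re-orthonormalization; that optimization loop is omitted here. On toy data where Y depends on X only through one coordinate, the objective should be smaller for the projection onto that coordinate than for a projection onto an unrelated one.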

Primary Subjects: 62H99
Secondary Subjects: 62J02
Keywords: Dimension reduction; regression; positive definite kernel; reproducing kernel; consistency

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1245332835
Digital Object Identifier: doi:10.1214/08-AOS637
Zentralblatt MATH identifier: 05582013
Mathematical Reviews number (MathSciNet): MR2533474

References

[1] Aronszajn, N. (1950). Theory of reproducing kernels. Trans. Amer. Math. Soc. 68 337–404.
Mathematical Reviews (MathSciNet): MR51437
Digital Object Identifier: doi:10.2307/1990404
[2] Bach, F. R. and Jordan, M. I. (2002). Kernel independent component analysis. J. Mach. Learn. Res. 3 1–48.
Mathematical Reviews (MathSciNet): MR1966051
Digital Object Identifier: doi:10.1162/153244303768966085
[3] Baker, C. R. (1973). Joint measures and cross-covariance operators. Trans. Amer. Math. Soc. 186 273–289.
Mathematical Reviews (MathSciNet): MR336795
Digital Object Identifier: doi:10.2307/1996566
[4] Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 580–598.
Mathematical Reviews (MathSciNet): MR803258
Digital Object Identifier: doi:10.2307/2288473
[5] Chiaromonte, F. and Cook, R. D. (2002). Sufficient dimension reduction and graphics in regression. Ann. Inst. Statist. Math. 54 768–795.
Mathematical Reviews (MathSciNet): MR1954046
Digital Object Identifier: doi:10.1023/A:1022411301790
[6] Cook, R. D. (1998). Regression Graphics. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1645673
[7] Cook, R. D. and Lee, H. (1999). Dimension reduction in regression with a binary response. J. Amer. Statist. Assoc. 94 1187–1200.
Mathematical Reviews (MathSciNet): MR1731482
Digital Object Identifier: doi:10.2307/2669934
[8] Cook, R. D. and Li, B. (2002). Dimension reduction for conditional mean in regression. Ann. Statist. 30 455–474.
Mathematical Reviews (MathSciNet): MR1902895
Digital Object Identifier: doi:10.1214/aos/1021379861
Project Euclid: euclid.aos/1021379861
[9] Cook, R. D. and Weisberg, S. (1991). Discussion of Li. J. Amer. Statist. Assoc. 86 328–332.
[10] Cook, R. D. and Yin, X. (2001). Dimension reduction and visualization in discriminant analysis (with discussion). Aust. N. Z. J. Stat. 43 147–199.
Mathematical Reviews (MathSciNet): MR1839361
[11] Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A Practical Approach. Chapman and Hall, London.
[12] Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. J. Amer. Statist. Assoc. 76 817–823.
Mathematical Reviews (MathSciNet): MR650892
Digital Object Identifier: doi:10.2307/2287576
[13] Fukumizu, K., Bach, F. R. and Gretton, A. (2007). Statistical consistency of kernel canonical correlation analysis. J. Mach. Learn. Res. 8 361–383.
Mathematical Reviews (MathSciNet): MR2320675
[14] Fukumizu, K., Bach, F. R. and Jordan, M. I. (2004). Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J. Mach. Learn. Res. 5 73–99.
Mathematical Reviews (MathSciNet): MR2247974
[15] Fukumizu, K., Gretton, A., Sun, X. and Schölkopf, B. (2008). Kernel measures of conditional dependence. In Advances in Neural Information Processing Systems 20 (J. Platt, D. Koller, Y. Singer and S. Roweis, eds.) 489–496. MIT Press, Cambridge, MA.
[16] Gretton, A., Bousquet, O., Smola, A. J. and Schölkopf, B. (2005). Measuring statistical dependence with Hilbert–Schmidt norms. In 16th International Conference on Algorithmic Learning Theory (S. Jain, H. U. Simon and E. Tomita, eds.) 63–77. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR2255909
Digital Object Identifier: doi:10.1007/11564089_7
[17] Groetsch, C. W. (1984). The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind. Pitman, Boston, MA.
Mathematical Reviews (MathSciNet): MR742928
[18] Hristache, M., Juditsky, A., Polzehl, J. and Spokoiny, V. (2001). Structure adaptive approach for dimension reduction. Ann. Statist. 29 1537–1566.
Mathematical Reviews (MathSciNet): MR1891738
Project Euclid: euclid.aos/1015345954
[19] Kobayashi, S. and Nomizu, K. (1963). Foundations of Differential Geometry, Vol. 1. Wiley, New York.
[20] Lax, P. D. (2002). Functional Analysis. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1892228
[21] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR1102015
[22] Li, B., Zha, H. and Chiaromonte, F. (2005). Contour regression: A general approach to dimension reduction. Ann. Statist. 33 1580–1616.
Mathematical Reviews (MathSciNet): MR2166556
Digital Object Identifier: doi:10.1214/009053605000000192
Project Euclid: euclid.aos/1123250223
[23] Li, K.-C. (1991). Sliced inverse regression for dimension reduction (with discussion). J. Amer. Statist. Assoc. 86 316–342.
Mathematical Reviews (MathSciNet): MR1137117
Digital Object Identifier: doi:10.2307/2290563
[24] Li, K.-C. (1992). On principal Hessian directions for data visualization and dimension reduction: Another application of Stein’s lemma. J. Amer. Statist. Assoc. 87 1025–1039.
Mathematical Reviews (MathSciNet): MR1209564
Digital Object Identifier: doi:10.2307/2290640
[25] Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Mathematical Reviews (MathSciNet): MR762984
[26] Reed, M. and Simon, B. (1980). Functional Analysis. Academic Press, New York.
Mathematical Reviews (MathSciNet): MR751959
[27] Samarov, A. M. (1993). Exploring regression structure using nonparametric functional estimation. J. Amer. Statist. Assoc. 88 836–847.
Mathematical Reviews (MathSciNet): MR1242934
Digital Object Identifier: doi:10.2307/2290772
[28] Sriperumbudur, B., Gretton, A., Fukumizu, K., Lanckriet, G. and Schölkopf, B. (2008). Injective Hilbert space embeddings of probability measures. In Proceedings of the 21st Annual Conference on Learning Theory (COLT 2008) (R. A. Servedio and T. Zhang, eds.) 111–122. Omnipress, Madison, WI.
[29] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
Mathematical Reviews (MathSciNet): MR1379242
[30] Vakhania, N. N., Tarieladze, V. I. and Chobanyan, S. A. (1987). Probability Distributions on Banach Spaces. Reidel, Dordrecht.
Mathematical Reviews (MathSciNet): MR1435288
[31] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press, Cambridge.
Mathematical Reviews (MathSciNet): MR1652247
[32] Wahba, G. (1990). Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics 59. SIAM, Philadelphia, PA.
Mathematical Reviews (MathSciNet): MR1045442
[33] Xia, Y., Tong, H., Li, W. and Zhu, L.-X. (2002). An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 363–410.
Mathematical Reviews (MathSciNet): MR1924297
Digital Object Identifier: doi:10.1111/1467-9868.03411
[34] Yin, X. and Bura, E. (2006). Moment-based dimension reduction for multivariate response regression. J. Statist. Plann. Inference 136 3675–3688.
Mathematical Reviews (MathSciNet): MR2256281
Digital Object Identifier: doi:10.1016/j.jspi.2005.01.011
[35] Yin, X. and Cook, R. D. (2005). Direction estimation in single-index regressions. Biometrika 92 371–384.
Mathematical Reviews (MathSciNet): MR2201365
Digital Object Identifier: doi:10.1093/biomet/92.2.371
[36] Zhu, Y. and Zeng, P. (2006). Fourier methods for estimating the central subspace and the central mean subspace in regression. J. Amer. Statist. Assoc. 101 1638–1651.
Mathematical Reviews (MathSciNet): MR2279485
Digital Object Identifier: doi:10.1198/016214506000000140

2009 © Institute of Mathematical Statistics