The Annals of Statistics

Kernel dimension reduction in regression

Kenji Fukumizu, Francis R. Bach, and Michael I. Jordan
Source: Ann. Statist. Volume 37, Number 4 (2009), 1871-1905.

Abstract

We present a new methodology for sufficient dimension reduction (SDR). Our methodology derives directly from the formulation of SDR in terms of the conditional independence of the covariate X from the response Y, given the projection of X on the central subspace [cf. J. Amer. Statist. Assoc. 86 (1991) 316–342 and Regression Graphics (1998) Wiley]. We show that this conditional independence assertion can be characterized in terms of conditional covariance operators on reproducing kernel Hilbert spaces and we show how this characterization leads to an M-estimator for the central subspace. The resulting estimator is shown to be consistent under weak conditions; in particular, we do not have to impose linearity or ellipticity conditions of the kinds that are generally invoked for SDR methods. We also present empirical results showing that the new methodology is competitive in practice.
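
The abstract does not spell out the estimator, so the following is only a minimal sketch of what an empirical objective of this kind might look like. It assumes the sample criterion Tr[G_Y (G_X^B + n*eps*I)^{-1}], where G_Y and G_X^B are centered Gram matrices of the response and of the projected covariate B^T X; that form, the Gaussian kernel, the bandwidths `sigma_x` and `sigma_y`, the regularizer `eps`, and all function names are assumptions made for illustration, not details taken from this page. The optimization over projection matrices B with orthonormal columns is not shown.

```python
import numpy as np


def gaussian_gram(Z, sigma):
    """Gaussian (RBF) Gram matrix for the rows of Z (kernel choice is an assumption)."""
    sq = np.sum(Z ** 2, axis=1)
    # Squared pairwise distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def centered(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kdr_objective(B, X, Y, sigma_x=1.0, sigma_y=1.0, eps=1e-3):
    """Assumed empirical contrast Tr[G_Y (G_X^B + n*eps*I)^{-1}].

    B : (p, d) projection matrix, X : (n, p) covariates, Y : (n,) or (n, q) response.
    Smaller values suggest that B^T X carries more of the information about Y.
    """
    n = X.shape[0]
    Gx = centered(gaussian_gram(X @ B, sigma_x))
    Gy = centered(gaussian_gram(Y.reshape(n, -1), sigma_y))
    # trace(A^{-1} G_Y) equals trace(G_Y A^{-1}); solve avoids an explicit inverse.
    return float(np.trace(np.linalg.solve(Gx + n * eps * np.eye(n), Gy)))
```

As a usage illustration, with X drawn from a four-dimensional standard normal and Y depending only on the first coordinate of X, one would expect `kdr_objective` to be smaller for the projection onto the first coordinate than for a projection onto an unrelated coordinate. An estimator in this spirit would minimize the criterion over B subject to B^T B = I.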

Primary Subjects: 62H99
Secondary Subjects: 62J02
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1245332835
Digital Object Identifier: doi:10.1214/08-AOS637
Zentralblatt MATH identifier: 05582013
Mathematical Reviews number (MathSciNet): MR2533474

References

[1] Aronszajn, N. (1950). Theory of reproducing kernels. Trans. Amer. Math. Soc. 68 337–404.
Mathematical Reviews (MathSciNet): MR51437
Zentralblatt MATH: 0037.20701
Digital Object Identifier: doi:10.2307/1990404
[2] Bach, F. R. and Jordan, M. I. (2002). Kernel independent component analysis. J. Mach. Learn. Res. 3 1–48.
Mathematical Reviews (MathSciNet): MR1966051
Zentralblatt MATH: 1088.68689
Digital Object Identifier: doi:10.1162/153244303768966085
[3] Baker, C. R. (1973). Joint measures and cross-covariance operators. Trans. Amer. Math. Soc. 186 273–289.
Mathematical Reviews (MathSciNet): MR336795
Zentralblatt MATH: 0304.28008
Digital Object Identifier: doi:10.2307/1996566
[4] Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 580–598.
Mathematical Reviews (MathSciNet): MR803258
Zentralblatt MATH: 0594.62044
Digital Object Identifier: doi:10.2307/2288473
[5] Chiaromonte, F. and Cook, R. D. (2002). Sufficient dimension reduction and graphics in regression. Ann. Inst. Statist. Math. 54 768–795.
Mathematical Reviews (MathSciNet): MR1954046
Zentralblatt MATH: 1047.62066
Digital Object Identifier: doi:10.1023/A:1022411301790
[6] Cook, R. D. (1998). Regression Graphics. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1645673
[7] Cook, R. D. and Lee, H. (1999). Dimension reduction in regression with a binary response. J. Amer. Statist. Assoc. 94 1187–1200.
Mathematical Reviews (MathSciNet): MR1731482
Zentralblatt MATH: 1072.62619
Digital Object Identifier: doi:10.2307/2669934
[8] Cook, R. D. and Li, B. (2002). Dimension reduction for conditional mean in regression. Ann. Statist. 30 455–474.
Mathematical Reviews (MathSciNet): MR1902895
Zentralblatt MATH: 1012.62035
Digital Object Identifier: doi:10.1214/aos/1021379861
Project Euclid: euclid.aos/1021379861
[9] Cook, R. D. and Weisberg, S. (1991). Discussion of Li. J. Amer. Statist. Assoc. 86 328–332.
[10] Cook, R. D. and Yin, X. (2001). Dimension reduction and visualization in discriminant analysis (with discussion). Aust. N. Z. J. Stat. 43 147–199.
Mathematical Reviews (MathSciNet): MR1839361
[11] Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A Practical Approach. Chapman and Hall, London.
[12] Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. J. Amer. Statist. Assoc. 76 817–823.
Mathematical Reviews (MathSciNet): MR650892
Digital Object Identifier: doi:10.2307/2287576
[13] Fukumizu, K., Bach, F. R. and Gretton, A. (2007). Statistical consistency of kernel canonical correlation analysis. J. Mach. Learn. Res. 8 361–383.
Mathematical Reviews (MathSciNet): MR2320675
[14] Fukumizu, K., Bach, F. R. and Jordan, M. I. (2004). Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J. Mach. Learn. Res. 5 73–99.
Mathematical Reviews (MathSciNet): MR2247974
[15] Fukumizu, K., Gretton, A., Sun, X. and Schölkopf, B. (2008). Kernel measures of conditional dependence. In Advances in Neural Information Processing Systems 20 (J. Platt, D. Koller, Y. Singer and S. Roweis, eds.) 489–496. MIT Press, Cambridge, MA.
[16] Gretton, A., Bousquet, O., Smola, A. J. and Schölkopf, B. (2005). Measuring statistical dependence with Hilbert–Schmidt norms. In 16th International Conference on Algorithmic Learning Theory (S. Jain, H. U. Simon and E. Tomita, eds.) 63–77. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR2255909
Zentralblatt MATH: 1168.62354
Digital Object Identifier: doi:10.1007/11564089_7
[17] Groetsch, C. W. (1984). The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind. Pitman, Boston, MA.
Mathematical Reviews (MathSciNet): MR742928
Zentralblatt MATH: 0545.65034
[18] Hristache, M., Juditsky, A., Polzehl, J. and Spokoiny, V. (2001). Structure adaptive approach for dimension reduction. Ann. Statist. 29 1537–1566.
Mathematical Reviews (MathSciNet): MR1891738
Zentralblatt MATH: 1043.62052
Project Euclid: euclid.aos/1015345954
[19] Kobayashi, S. and Nomizu, K. (1963). Foundations of Differential Geometry, Vol. 1. Wiley, New York.
[20] Lax, P. D. (2002). Functional Analysis. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1892228
Zentralblatt MATH: 1009.47001
[21] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR1102015
Zentralblatt MATH: 0748.60004
[22] Li, B., Zha, H. and Chiaromonte, F. (2005). Contour regression: A general approach to dimension reduction. Ann. Statist. 33 1580–1616.
Mathematical Reviews (MathSciNet): MR2166556
Zentralblatt MATH: 1078.62033
Digital Object Identifier: doi:10.1214/009053605000000192
Project Euclid: euclid.aos/1123250223
[23] Li, K.-C. (1991). Sliced inverse regression for dimension reduction (with discussion). J. Amer. Statist. Assoc. 86 316–342.
Mathematical Reviews (MathSciNet): MR1137117
Zentralblatt MATH: 0742.62044
Digital Object Identifier: doi:10.2307/2290563
[24] Li, K.-C. (1992). On principal Hessian directions for data visualization and dimension reduction: Another application of Stein’s lemma. J. Amer. Statist. Assoc. 87 1025–1039.
Mathematical Reviews (MathSciNet): MR1209564
Zentralblatt MATH: 0765.62003
Digital Object Identifier: doi:10.2307/2290640
[25] Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Mathematical Reviews (MathSciNet): MR762984
Zentralblatt MATH: 0544.60045
[26] Reed, M. and Simon, B. (1980). Functional Analysis. Academic Press, New York.
Mathematical Reviews (MathSciNet): MR751959
Zentralblatt MATH: 0459.46001
[27] Samarov, A. M. (1993). Exploring regression structure using nonparametric functional estimation. J. Amer. Statist. Assoc. 88 836–847.
Mathematical Reviews (MathSciNet): MR1242934
Zentralblatt MATH: 0790.62035
Digital Object Identifier: doi:10.2307/2290772
[28] Sriperumbudur, B., Gretton, A., Fukumizu, K., Lanckriet, G. and Schölkopf, B. (2008). Injective Hilbert space embeddings of probability measures. In Proceedings of the 21st Annual Conference on Learning Theory (COLT 2008) (R. A. Servedio and T. Zhang, eds.) 111–122. Omnipress, Madison, WI.
[29] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
Mathematical Reviews (MathSciNet): MR1379242
[30] Vakhania, N. N., Tarieladze, V. I. and Chobanyan, S. A. (1987). Probability Distributions on Banach Spaces. Reidel, Dordrecht.
Mathematical Reviews (MathSciNet): MR1435288
Zentralblatt MATH: 0698.60003
[31] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press, Cambridge.
Mathematical Reviews (MathSciNet): MR1652247
Zentralblatt MATH: 0910.62001
[32] Wahba, G. (1990). Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics 59. SIAM, Philadelphia, PA.
Mathematical Reviews (MathSciNet): MR1045442
Zentralblatt MATH: 0813.62001
[33] Xia, Y., Tong, H., Li, W. and Zhu, L.-X. (2002). An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 363–410.
Mathematical Reviews (MathSciNet): MR1924297
Zentralblatt MATH: 1091.62028
Digital Object Identifier: doi:10.1111/1467-9868.03411
[34] Yin, X. and Bura, E. (2006). Moment-based dimension reduction for multivariate response regression. J. Statist. Plann. Inference 136 3675–3688.
Mathematical Reviews (MathSciNet): MR2256281
Zentralblatt MATH: 1093.62058
Digital Object Identifier: doi:10.1016/j.jspi.2005.01.011
[35] Yin, X. and Cook, R. D. (2005). Direction estimation in single-index regressions. Biometrika 92 371–384.
Mathematical Reviews (MathSciNet): MR2201365
Zentralblatt MATH: 1094.62054
Digital Object Identifier: doi:10.1093/biomet/92.2.371
[36] Zhu, Y. and Zeng, P. (2006). Fourier methods for estimating the central subspace and the central mean subspace in regression. J. Amer. Statist. Assoc. 101 1638–1651.
Mathematical Reviews (MathSciNet): MR2279485
Zentralblatt MATH: 1171.62325
Digital Object Identifier: doi:10.1198/016214506000000140

2012 © Institute of Mathematical Statistics