## The Annals of Statistics

### A general theory for nonlinear sufficient dimension reduction: Formulation and estimation

#### Abstract

In this paper we introduce a general theory for nonlinear sufficient dimension reduction, and explore its ramifications and scope. This theory subsumes recent work employing reproducing kernel Hilbert spaces, and reveals many parallels between linear and nonlinear sufficient dimension reduction. Using these parallels we analyze the properties of existing methods and develop new ones. We begin by characterizing dimension reduction at the general level of $\sigma$-fields and proceed to that of classes of functions, leading to the notions of sufficient, complete and central dimension reduction classes. We show that, when it exists, the complete and sufficient class coincides with the central class, and can be unbiasedly and exhaustively estimated by a generalized sliced inverse regression estimator (GSIR). When completeness does not hold, this estimator captures only part of the central class. However, in these cases we show that a generalized sliced average variance estimator (GSAVE) can capture a larger portion of the class. Both estimators require no numerical optimization because they can be computed by spectral decomposition of linear operators. Finally, we compare our estimators with existing methods by simulation and on actual data sets.

#### Article information

Source
Ann. Statist., Volume 41, Number 1 (2013), 221-249.

Dates
First available in Project Euclid: 26 March 2013

Permanent link to this document
https://projecteuclid.org/euclid.aos/1364302741

Digital Object Identifier
doi:10.1214/12-AOS1071

Mathematical Reviews number (MathSciNet)
MR3059416

Zentralblatt MATH identifier
1347.62018

#### Citation

Lee, Kuang-Yao; Li, Bing; Chiaromonte, Francesca. A general theory for nonlinear sufficient dimension reduction: Formulation and estimation. Ann. Statist. 41 (2013), no. 1, 221–249. doi:10.1214/12-AOS1071. https://projecteuclid.org/euclid.aos/1364302741

#### References

• Akaho, S. (2001). A kernel method for canonical correlation analysis. In Proceedings of the International Meeting of the Psychometric Society (IMPS2001). Springer, Tokyo.
• Bach, F. R. and Jordan, M. I. (2002). Kernel independent component analysis. J. Mach. Learn. Res. 3 1–48.
• Bahadur, R. R. (1954). Sufficiency and statistical decision functions. Ann. Math. Statist. 25 423–462.
• Baker, C. R. (1973). Joint measures and cross-covariance operators. Trans. Amer. Math. Soc. 186 273–289.
• Cook, R. D. (1994). Using dimension-reduction subspaces to identify important inputs in models of physical systems. In 1994 Proceedings of the Section on Physical and Engineering Sciences 18–25. Amer. Statist. Assoc., Alexandria, VA.
• Cook, R. D. (1998a). Regression Graphics: Ideas for Studying Regressions Through Graphics. Wiley, New York.
• Cook, R. D. (1998b). Principal Hessian directions revisited. J. Amer. Statist. Assoc. 93 84–94.
• Cook, R. D. (2007). Fisher lecture: Dimension reduction in regression. Statist. Sci. 22 1–40.
• Cook, R. D. and Critchley, F. (2000). Identifying regression outliers and mixtures graphically. J. Amer. Statist. Assoc. 95 781–794.
• Cook, R. D. and Forzani, L. (2009). Likelihood-based sufficient dimension reduction. J. Amer. Statist. Assoc. 104 197–208.
• Cook, R. D. and Li, B. (2002). Dimension reduction for conditional mean in regression. Ann. Statist. 30 455–474.
• Cook, R. D., Li, B. and Chiaromonte, F. (2010). Envelope models for parsimonious and efficient multivariate linear regression (with discussion). Statist. Sinica 20 927–1010.
• Cook, R. D. and Ni, L. (2005). Sufficient dimension reduction via inverse regression: A minimum discrepancy approach. J. Amer. Statist. Assoc. 100 410–428.
• Cook, R. D. and Weisberg, S. (1991). Comment on “Sliced inverse regression for dimension reduction,” by K.-C. Li. J. Amer. Statist. Assoc. 86 328–332.
• Duan, N. and Li, K.-C. (1991). A bias bound for least squares linear regression. Statist. Sinica 1 127–136.
• Ferré, L. and Yao, A. F. (2003). Functional sliced inverse regression analysis. Statistics 37 475–488.
• Fukumizu, K., Bach, F. R. and Jordan, M. I. (2004). Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J. Mach. Learn. Res. 5 73–99.
• Fukumizu, K., Bach, F. R. and Gretton, A. (2007). Statistical consistency of kernel canonical correlation analysis. J. Mach. Learn. Res. 8 361–383.
• Fukumizu, K., Bach, F. R. and Jordan, M. I. (2009). Kernel dimension reduction in regression. Ann. Statist. 37 1871–1905.
• Härdle, W., Hall, P. and Ichimura, H. (1993). Optimal smoothing in single-index models. Ann. Statist. 21 157–178.
• Horn, R. A. and Johnson, C. R. (1985). Matrix Analysis. Cambridge Univ. Press, Cambridge.
• Hsing, T. and Ren, H. (2009). An RKHS formulation of the inverse regression dimension-reduction problem. Ann. Statist. 37 726–755.
• Ichimura, H. and Lee, L. F. (1991). Semiparametric least squares estimation of multiple index models: Single equation estimation. In Nonparametric and Semiparametric Methods in Econometrics and Statistics (Durham, NC, 1988) (W. A. Barnett, J. L. Powell and G. Tauchen, eds.) 3–49. Cambridge Univ. Press, Cambridge.
• Lee, K. Y., Li, B. and Chiaromonte, F. (2013). Supplement to “A general theory for nonlinear sufficient dimension reduction: Formulation and estimation.” DOI:10.1214/12-AOS1071SUPP.
• Lehmann, E. L. (1981). An interpretation of completeness and Basu’s theorem. J. Amer. Statist. Assoc. 76 335–340.
• Li, K.-C. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86 316–342.
• Li, K.-C. (1992). On principal Hessian directions for data visualization and dimension reduction: Another application of Stein’s lemma. J. Amer. Statist. Assoc. 87 1025–1039.
• Li, B., Artemiou, A. and Li, L. (2011). Principal support vector machines for linear and nonlinear sufficient dimension reduction. Ann. Statist. 39 3182–3210.
• Li, B., Chun, H. and Zhao, H. (2012). Sparse estimation of conditional graphical models with application to gene networks. J. Amer. Statist. Assoc. 107 152–167.
• Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Ann. Statist. 17 1009–1052.
• Li, B. and Wang, S. (2007). On directional regression for dimension reduction. J. Amer. Statist. Assoc. 102 997–1008.
• Li, B., Zha, H. and Chiaromonte, F. (2005). Contour regression: A general approach to dimension reduction. Ann. Statist. 33 1580–1616.
• Wu, H.-M. (2008). Kernel sliced inverse regression with applications to classification. J. Comput. Graph. Statist. 17 590–610.
• Wu, Q., Liang, F. and Mukherjee, S. (2008). Regularized sliced inverse regression for kernel models. Technical report, Duke Univ., Durham, NC.
• Ye, Z. and Weiss, R. E. (2003). Using the bootstrap to select one of a new class of dimension reduction methods. J. Amer. Statist. Assoc. 98 968–979.
• Yeh, Y. R., Huang, S. Y. and Lee, Y. Y. (2009). Nonlinear dimension reduction with kernel sliced inverse regression. IEEE Transactions on Knowledge and Data Engineering 21 1590–1603.
• Yin, X., Li, B. and Cook, R. D. (2008). Successive direction extraction for estimating the central subspace in a multiple-index regression. J. Multivariate Anal. 99 1733–1757.
• Zhu, H. and Li, L. (2011). Biological pathway selection through nonlinear dimension reduction. Biostatistics 12 429–444.

#### Supplemental materials

• Supplementary material: Supplement to “A general theory for nonlinear sufficient dimension reduction: Formulation and estimation”. This is a supplementary appendix that contains some technical proofs of the results in the paper.