The Annals of Statistics

An RKHS formulation of the inverse regression dimension-reduction problem

Tailen Hsing and Haobo Ren

Full-text: Open access


Suppose that Y is a scalar and X is a second-order stochastic process, where Y and X are conditionally independent given the random variables $\xi_1, \ldots, \xi_p$ which belong to the closed span $L_X^2$ of X. This paper investigates a unified framework for the inverse regression dimension-reduction problem. It is found that the identification of $L_X^2$ with the reproducing kernel Hilbert space of X provides a platform for a seamless extension from the finite- to infinite-dimensional settings. It also facilitates convenient computational algorithms that can be applied to a variety of models.

Article information

Ann. Statist., Volume 37, Number 2 (2009), 726-755.

First available in Project Euclid: 10 March 2009

Primary: 62H99: None of the above, but in this section
Secondary: 62M99: None of the above, but in this section

Keywords: Functional data analysis; sliced inverse regression


Hsing, Tailen; Ren, Haobo. An RKHS formulation of the inverse regression dimension-reduction problem. Ann. Statist. 37 (2009), no. 2, 726–755. doi:10.1214/07-AOS589.

