Electronic Journal of Statistics

Model-free envelope dimension selection

Xin Zhang and Qing Mai

Full-text: Open access

Abstract

An envelope is a targeted dimension reduction subspace for simultaneously achieving dimension reduction and improving parameter estimation efficiency. While many envelope methods have been proposed in recent years, all of them hinge on knowledge of a key hyperparameter, the structural dimension of the envelope. How to estimate the envelope dimension consistently is of substantial interest from both theoretical and practical perspectives. Moreover, very recent advances in the literature have generalized the envelope to a model-free method, which makes selecting the envelope dimension even more challenging. Likelihood-based approaches such as information criteria and likelihood-ratio tests either cannot be directly applied or have no theoretical justification. To address this critical issue of dimension selection, we propose two unified approaches, called FG and 1D selections, for determining the envelope dimension that can be applied to any envelope models and methods. The two model-free selection approaches are based on two different envelope optimization procedures, the full Grassmannian (FG) optimization and the 1D algorithm [11], and are shown to be capable of correctly identifying the structural dimension with probability tending to 1 under mild moment conditions as the sample size increases. The FG selection unifies and generalizes the BIC and modified BIC approaches that exist in the literature, and hence provides theoretical justification for them under weak moment conditions and in the model-free context, whereas the 1D selection is computationally more stable and efficient in finite samples. Extensive simulations and a real data analysis demonstrate the superb performance of our proposals.

Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 2193-2216.

Dates
Received: September 2017
First available in Project Euclid: 17 July 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1531814505

Digital Object Identifier
doi:10.1214/18-EJS1449

Keywords
Dimension reduction; envelope models and methods; information criterion; model selection

Rights
Creative Commons Attribution 4.0 International License.

Citation

Zhang, Xin; Mai, Qing. Model-free envelope dimension selection. Electron. J. Statist. 12 (2018), no. 2, 2193–2216. doi:10.1214/18-EJS1449. https://projecteuclid.org/euclid.ejs/1531814505



References

  • [1] Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723.
  • [2] Bura, E. and Cook, R. D. (2003). Rank estimation in reduced-rank regression. Journal of Multivariate Analysis, 87(1):159–176.
  • [3] Conway, J. (1990). A Course in Functional Analysis. Second edition. Springer, New York.
  • [4] Cook, R. D., Forzani, L., and Zhang, X. (2015). Envelopes and reduced-rank regression. Biometrika, 102(2):439–456.
  • [5] Cook, R. D., Helland, I. S., and Su, Z. (2013). Envelopes and partial least squares regression. J. R. Stat. Soc. Ser. B. Stat. Methodol., 75(5):851–877.
  • [6] Cook, R. D. and Li, B. (2004). Determining the dimension of iterative Hessian transformation. The Annals of Statistics, 32(6):2501–2531.
  • [7] Cook, R. D., Li, B., and Chiaromonte, F. (2010). Envelope models for parsimonious and efficient multivariate linear regression. Statistica Sinica, 20(3):927–960.
  • [8] Cook, R. D. and Su, Z. (2013). Scaled envelopes: scale-invariant and efficient estimation in multivariate linear regression. Biometrika, 100(4):939–954.
  • [9] Cook, R. D. and Zhang, X. (2015a). Foundations for envelope models and methods. Journal of the American Statistical Association, 110(510):599–611.
  • [10] Cook, R. D. and Zhang, X. (2015b). Simultaneous envelopes for multivariate linear regression. Technometrics, 57(1):11–25.
  • [11] Cook, R. D. and Zhang, X. (2016). Algorithms for envelope estimation. Journal of Computational and Graphical Statistics, 25(1):284–300.
  • [12] Cook, R. D. and Zhang, X. (2018). Fast envelope algorithms. Statistica Sinica, 28(3):1179–1197.
  • [13] Eck, D. J. and Cook, R. D. (2017). Weighted envelope estimation to handle variability in model selection. Biometrika, 104(3):743–749.
  • [14] Eck, D. J., Geyer, C. J., and Cook, R. D. (2017). An application of envelope methodology and aster models. arXiv preprint arXiv:1701.07910.
  • [15] Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007). Aster models for life history analysis. Biometrika, 94(2):415–426.
  • [16] Hawkins, D. M. and Maboudou-Tchao, E. M. (2013). Smoothed linear modeling for smooth spectral data. International Journal of Spectroscopy.
  • [17] Li, L. and Zhang, X. (2017). Parsimonious tensor response regression. Journal of the American Statistical Association, 112(519):1131–1146.
  • [18] Ma, Y. and Zhang, X. (2015). A validated information criterion to determine the structural dimension in dimension reduction models. Biometrika, 102(2):409–420.
  • [19] Schott, J. R. (1994). Determining the dimensionality in sliced inverse regression. Journal of the American Statistical Association, 89(425):141–148.
  • [20] Schott, J. R. (2013). On the likelihood ratio test for envelope models in multivariate linear regression. Biometrika, 100(2):531–537.
  • [21] Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464.
  • [22] Su, Z. and Cook, R. D. (2011). Partial envelopes for efficient estimation in multivariate linear regression. Biometrika, 98(1):133–146.
  • [23] Yu, Y., Wang, T., and Samworth, R. (2015). A useful variant of the Davis-Kahan theorem for statisticians. Biometrika, 102(2).
  • [24] Zeng, P. (2008). Determining the dimension of the central subspace and central mean subspace. Biometrika, 95(2):469–479.
  • [25] Zhang, X. and Li, L. (2017). Tensor envelope partial least-squares regression. Technometrics, 59(4):426–436.
  • [26] Zhu, L., Miao, B., and Peng, H. (2006). On sliced inverse regression with high-dimensional covariates. Journal of the American Statistical Association, 101(474):630–643.
  • [27] Zhu, L.-P., Zhu, L.-X., and Feng, Z.-H. (2010). Dimension reduction in regressions through cumulative slicing estimation. Journal of the American Statistical Association, 105(492):1455–1466.
  • [28] Zhu, X., Wang, T., and Zhu, L. (2016). Dimensionality determination: a thresholding double ridge ratio criterion. arXiv preprint arXiv:1608.04457.
  • [29] Zou, C. and Chen, X. (2012). On the consistency of coordinate-independent sparse estimation with BIC. Journal of Multivariate Analysis, 112:248–255.