Open Access
2018 Model-free envelope dimension selection
Xin Zhang, Qing Mai
Electron. J. Statist. 12(2): 2193-2216 (2018). DOI: 10.1214/18-EJS1449
Abstract

An envelope is a targeted dimension reduction subspace for simultaneously achieving dimension reduction and improving parameter estimation efficiency. While many envelope methods have been proposed in recent years, all of them hinge on knowledge of a key hyperparameter, the structural dimension of the envelope. How to estimate the envelope dimension consistently is of substantial interest from both theoretical and practical perspectives. Moreover, very recent advances in the literature have generalized the envelope to a model-free method, which makes selecting the envelope dimension even more challenging. Likelihood-based approaches such as information criteria and likelihood-ratio tests either cannot be directly applied or have no theoretical justification. To address this critical issue of dimension selection, we propose two unified approaches, called FG and 1D selections, for determining the envelope dimension that can be applied to any envelope model or method. The two model-free selection approaches are based on two different envelope optimization procedures, the full Grassmannian (FG) optimization and the 1D algorithm [11], and are shown to identify the structural dimension correctly with probability tending to 1 as the sample size increases, under mild moment conditions. The FG selection unifies and generalizes the BIC and modified BIC approaches that exist in the literature, and hence provides theoretical justification for them under weak moment conditions and in the model-free context, whereas the 1D selection is computationally more stable and efficient in finite samples. Extensive simulations and a real data analysis demonstrate the superb performance of our proposals.
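
The abstract does not state the selection criterion itself. As a rough illustration of the kind of BIC-type rule it describes, the sketch below assumes the generic envelope objective J(G) = log|G'MG| + log|G'(M+U)^{-1}G| used in the envelope literature and an assumed penalty of the form C*k*log(n)/n. The function names, the constant C = 1, and the toy matrices M and U are all hypothetical. The brute-force search over eigenvector subsets of M is a simple stand-in for full Grassmannian optimization, not the authors' algorithm, and population matrices are used in place of sample estimates so that the example is deterministic.

```python
import itertools
import numpy as np

def envelope_objective(gamma, M, MU_inv):
    """Assumed objective J(G) = log|G'MG| + log|G'(M+U)^{-1}G|."""
    _, logdet_m = np.linalg.slogdet(gamma.T @ M @ gamma)
    _, logdet_mu_inv = np.linalg.slogdet(gamma.T @ MU_inv @ gamma)
    return logdet_m + logdet_mu_inv

def best_objective(M, U, k):
    """Stand-in for Grassmannian optimization: try every k-subset of
    eigenvectors of M and keep the smallest objective value."""
    if k == 0:
        return 0.0  # empty basis: the objective is taken to be zero
    p = M.shape[0]
    _, vecs = np.linalg.eigh(M)
    MU_inv = np.linalg.inv(M + U)
    return min(
        envelope_objective(vecs[:, list(idx)], M, MU_inv)
        for idx in itertools.combinations(range(p), k)
    )

def select_dimension(M, U, n, C=1.0):
    """Pick k in 0..p minimizing objective(k) + C * k * log(n) / n
    (an assumed BIC-type penalty)."""
    p = M.shape[0]
    crit = [best_objective(M, U, k) + C * k * np.log(n) / n
            for k in range(p + 1)]
    return int(np.argmin(crit)), crit

# Toy example: M has distinct eigenvalues and U lives in the span of two
# eigenvectors of M, so the envelope has dimension 2 by construction.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
M = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q.T
U0 = np.zeros((5, 5))
U0[:2, :2] = [[4.0, 2.0], [2.0, 9.0]]
U = Q @ U0 @ Q.T

u_hat, crit = select_dimension(M, U, n=100)
print("selected dimension:", u_hat)
```

In this toy setting the unpenalized objective stops decreasing once k reaches the true dimension, so the penalty term is what breaks the tie among larger candidates. With sample estimates of M and U the objective would keep decreasing slightly for larger k, which is why a penalty that shrinks more slowly than the estimation error is needed.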

References

[1] Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723. Zbl 0314.62039. doi:10.1109/TAC.1974.1100705

[2] Bura, E. and Cook, R. D. (2003). Rank estimation in reduced-rank regression. Journal of Multivariate Analysis, 87(1):159–176. MR2007266. Zbl 1030.62045. doi:10.1016/S0047-259X(03)00029-0

[3] Conway, J. (1990). A Course in Functional Analysis. Second edition. Springer, New York. Zbl 0706.46003

[4] Cook, R. D., Forzani, L., and Zhang, X. (2015). Envelopes and reduced-rank regression. Biometrika, 102(2):439–456. Zbl 06450877. doi:10.1093/biomet/asv001

[5] Cook, R. D., Helland, I. S., and Su, Z. (2013). Envelopes and partial least squares regression. J. R. Stat. Soc. Ser. B. Stat. Methodol., 75(5):851–877.

[6] Cook, R. D. and Li, B. (2004). Determining the dimension of iterative Hessian transformation. The Annals of Statistics, 32(6):2501–2531. Zbl 1069.62033. doi:10.1214/009053604000000661. euclid.aos/1107794877

[7] Cook, R. D., Li, B., and Chiaromonte, F. (2010). Envelope models for parsimonious and efficient multivariate linear regression. Statist. Sinica, 20(3):927–960. Zbl 1259.62059

[8] Cook, R. D. and Su, Z. (2013). Scaled envelopes: scale-invariant and efficient estimation in multivariate linear regression. Biometrika, 100(4):939–954. Zbl 06247598. doi:10.1093/biomet/ast026

[9] Cook, R. D. and Zhang, X. (2015a). Foundations for envelope models and methods. Journal of the American Statistical Association, 110(510):599–611. Zbl 1390.62131. doi:10.1080/01621459.2014.983235

[10] Cook, R. D. and Zhang, X. (2015b). Simultaneous envelopes for multivariate linear regression. Technometrics, 57(1):11–25.

[11] Cook, R. D. and Zhang, X. (2016). Algorithms for envelope estimation. Journal of Computational and Graphical Statistics, 25(1):284–300.

[12] Cook, R. D. and Zhang, X. (2018). Fast envelope algorithms. Statistica Sinica, 28(3):1179–1197. Zbl 1394.62067

[13] Eck, D. J. and Cook, R. D. (2017). Weighted envelope estimation to handle variability in model selection. Biometrika, 104(3):743–749.

[14] Eck, D. J., Geyer, C. J., and Cook, R. D. (2017). An application of envelope methodology and aster models. arXiv preprint arXiv:1701.07910.

[15] Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007). Aster models for life history analysis. Biometrika, 94(2):415–426. Zbl 1132.62090. doi:10.1093/biomet/asm030

[16] Hawkins, D. M. and Maboudou-Tchao, E. M. (2013). Smoothed linear modeling for smooth spectral data. International Journal of Spectroscopy.

[17] Li, L. and Zhang, X. (2017). Parsimonious tensor response regression. Journal of the American Statistical Association, 112(519):1131–1146.

[18] Ma, Y. and Zhang, X. (2015). A validated information criterion to determine the structural dimension in dimension reduction models. Biometrika, 102(2):409–420. Zbl 06450875. doi:10.1093/biomet/asv004

[19] Schott, J. R. (1994). Determining the dimensionality in sliced inverse regression. Journal of the American Statistical Association, 89(425):141–148. MR1266291. Zbl 0791.62069. doi:10.1080/01621459.1994.10476455

[20] Schott, J. R. (2013). On the likelihood ratio test for envelope models in multivariate linear regression. Biometrika, 100(2):531–537. Zbl 1284.62335. doi:10.1093/biomet/ast002

[21] Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464. Zbl 0379.62005. doi:10.1214/aos/1176344136. euclid.aos/1176344136

[22] Su, Z. and Cook, R. D. (2011). Partial envelopes for efficient estimation in multivariate linear regression. Biometrika, 98(1):133–146. Zbl 1214.62062. doi:10.1093/biomet/asq063

[23] Yu, Y., Wang, T., and Samworth, R. (2015). A useful variant of the Davis-Kahan theorem for statisticians. Biometrika, 102(2). Zbl 06450868. doi:10.1093/biomet/asv008

[24] Zeng, P. (2008). Determining the dimension of the central subspace and central mean subspace. Biometrika, 95(2):469–479. Zbl 05563407. doi:10.1093/biomet/asn002

[25] Zhang, X. and Li, L. (2017). Tensor envelope partial least-squares regression. Technometrics, 59(4):426–436. MR3740960. doi:10.1080/00401706.2016.1272495

[26] Zhu, L., Miao, B., and Peng, H. (2006). On sliced inverse regression with high-dimensional covariates. Journal of the American Statistical Association, 101(474):630–643. Zbl 1119.62331. doi:10.1198/016214505000001285

[27] Zhu, L.-P., Zhu, L.-X., and Feng, Z.-H. (2010). Dimension reduction in regressions through cumulative slicing estimation. Journal of the American Statistical Association, 105(492):1455–1466. Zbl 1388.62121. doi:10.1198/jasa.2010.tm09666

[28] Zhu, X., Wang, T., and Zhu, L. (2016). Dimensionality determination: a thresholding double ridge ratio criterion. arXiv preprint arXiv:1608.04457.

[29] Zou, C. and Chen, X. (2012). On the consistency of coordinate-independent sparse estimation with BIC. Journal of Multivariate Analysis, 112:248–255. Zbl 1273.62135. doi:10.1016/j.jmva.2012.04.014

Xin Zhang and Qing Mai, "Model-free envelope dimension selection," Electronic Journal of Statistics 12(2), 2193–2216 (2018). https://doi.org/10.1214/18-EJS1449
Received: 1 September 2017; Published: 2018