## The Annals of Statistics

### Asymptotically minimax regret procedures in regression model selection and the magnitude of the dimension penalty

#### Abstract

This paper addresses the topic of model selection in regression. We emphasize the case of two models, testing which model provides a better prediction based on $n$ observations. Within a family of selection rules, based on maximizing a penalized log-likelihood under a normal model, we search for asymptotically minimax rules over a class $\mathscr{G}$ of possible joint distributions of the explanatory and response variables. For the class $\mathscr{G}$ of multivariate normal joint distributions it is shown that asymptotically minimax selection rules are close to the AIC selection rule when the models’ dimension difference is large. It is further proved that under fairly mild assumptions on $\mathscr{G}$ any asymptotically minimax sequence of procedures satisfies the condition that the difference in their dimension penalties is bounded as the number of observations approaches infinity. The results are then extended to the case of more than two competing models.
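To make the phrase "maximizing a penalized log-likelihood under a normal model" concrete, the following is a minimal sketch of a two-model selection rule. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the function names `gaussian_profile_loglik` and `select_model`, the use of ordinary least squares with the error variance profiled out, and the parameterization of the penalty as a constant per extra dimension are all hypothetical choices made here. In this parameterization a per-dimension penalty of 1 corresponds to AIC and a penalty of $(\log n)/2$ corresponds to Schwarz's criterion.

```python
import numpy as np


def gaussian_profile_loglik(X, y):
    """Maximized Gaussian log-likelihood of a linear model y ~ X.

    The coefficients are fitted by least squares and the error variance
    is replaced by its maximum likelihood estimate RSS / n.
    """
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return -0.5 * n * (np.log(2.0 * np.pi * rss / n) + 1.0)


def select_model(X_small, X_large, y, penalty_per_dim):
    """Choose between two nested design matrices.

    The larger model is selected when its log-likelihood gain exceeds
    the difference in dimension penalties (penalty_per_dim times the
    number of extra columns). This penalty form is an assumption of
    the sketch.
    """
    extra_dims = X_large.shape[1] - X_small.shape[1]
    gain = gaussian_profile_loglik(X_large, y) - gaussian_profile_loglik(X_small, y)
    return "large" if gain > penalty_per_dim * extra_dims else "small"


if __name__ == "__main__":
    # Simulated data: only the first covariate carries signal.
    rng = np.random.default_rng(0)
    n = 200
    covariates = rng.normal(size=(n, 4))
    y = 1.0 + 2.0 * covariates[:, 0] + rng.normal(size=n)

    X_small = np.column_stack([np.ones(n), covariates[:, :1]])
    X_large = np.column_stack([np.ones(n), covariates])

    print("AIC-type penalty:", select_model(X_small, X_large, y, 1.0))
    print("Schwarz-type penalty:", select_model(X_small, X_large, y, 0.5 * np.log(n)))
```

The only difference between the two calls is the size of the dimension penalty, which is the quantity the abstract's results concern: a penalty difference that stays bounded in $n$ (as with AIC) versus one that grows with $n$ (as with the $\log n$ penalty).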

#### Article information

Source
Ann. Statist., Volume 28, Number 6 (2000), 1620-1637.

Dates
First available in Project Euclid: 12 March 2002

https://projecteuclid.org/euclid.aos/1015957473

Digital Object Identifier
doi:10.1214/aos/1015957473

Mathematical Reviews number (MathSciNet)
MR1835034

Zentralblatt MATH identifier
1105.62356

Subjects
Primary: 62J05: Linear regression; 62C20: Minimax procedures

#### Citation

Goldenshluger, Alexander; Greenshtein, Eitan. Asymptotically minimax regret procedures in regression model selection and the magnitude of the dimension penalty. Ann. Statist. 28 (2000), no. 6, 1620--1637. doi:10.1214/aos/1015957473. https://projecteuclid.org/euclid.aos/1015957473

#### References

• Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Automat. Control 19 716-723.
• Breiman, L. and Freedman, D. (1983). How many variables should be entered in a regression equation? J. Amer. Statist. Assoc. 78 131-136.
• Foster, D. P. and George, E. I. (1994). The risk inflation criterion for multiple regression. Ann. Statist. 22 1947-1975.
• Geisser, S. (1975). The predictive sample reuse method with applications. J. Amer. Statist. Assoc. 70 320-328.
• Goldenshluger, A. and Greenshtein, E. (1998). Asymptotic minimax procedures in regression model selection and the magnitude of the dimension penalty. Technical report, Technion-Israel Institute of Technology.
• Greenshtein, E. (2000). Predictor-selection. Another look on model-selection and estimation. Technical report, Technion-Israel Institute of Technology.
• Hannan, E. J. and Quinn, B. G. (1979). The determination of the order of an autoregression. J. Roy. Statist. Soc. Ser. B 41 190-195.
• Linhart, H. and Zucchini, W. (1986). Model Selection. Wiley, New York.
• Mallows, C. L. (1973). Some comments on $C_p$. Technometrics 15 661-675.
• Nishii, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression. Ann. Statist. 12 758-765.
• Oliker, V. I. (1978). On the relationship between the sample size and the number of variables in a linear regression model. Comm. Statist. A 7 509-516.
• Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. World Books, Singapore.
• Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461-464.
• Shao, J. (1997). An asymptotic theory for linear model selection. Statist. Sinica 7 221-264.
• Shibata, R. (1981). An optimal selection of regression variables. Biometrika 68 45-54.
• Shibata, R. (1986). Selection of the number of regression variables: a minimax choice of generalized FPE. Ann. Inst. Statist. Math. 38 459-474.
• Shibata, R. (1989). Statistical aspects of model selection. From Data to Model (J. C. Willems, ed.) 215-240. Springer, New York.
• Speed, T. and Yu, B. (1993). Model selection and prediction: normal regression. Ann. Inst. Statist. Math. 45 35-54.
• Stone, C. J. (1981). Admissible selection of an accurate and parsimonious normal linear regression model. Ann. Statist. 9 475-485.
• Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. Ser. B 36 111-147.
• Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion. J. Roy. Statist. Soc. Ser. B 39 44-47.
• Thompson, M. L. (1978). Selection of variables in multiple regression. Internat. Statist. Rev. 46 1-49 and 129-146.