Open Access
December 2017
A weight-relaxed model averaging approach for high-dimensional generalized linear models
Tomohiro Ando, Ker-chau Li
Ann. Statist. 45(6): 2654-2679 (December 2017). DOI: 10.1214/17-AOS1538

Abstract

Model averaging has long been proposed as a powerful alternative to model selection in regression analysis. However, how well it performs in high-dimensional regression is still poorly understood. Recently, Ando and Li [J. Amer. Statist. Assoc. 109 (2014) 254–265] introduced a new method of model averaging that allows the number of predictors to increase as the sample size increases. One notable feature of Ando and Li’s method is the relaxation on the total model weights so that weak signals can be efficiently combined from high-dimensional linear models. It is natural to ask if Ando and Li’s method and results can be extended to nonlinear models. Because all candidate models should be treated as working models, the existence of a theoretical target of the quasi-maximum likelihood estimator under model misspecification needs to be established first. In this paper, we consider generalized linear models as our candidate models. We establish a general result to show the existence of pseudo-true regression parameters under model misspecification. We derive proper conditions for the leave-one-out cross-validation weight selection to achieve asymptotic optimality. Technically, the pseudo-true target parameters between working models are not linearly linked. To overcome the encountered difficulties, we employ a novel strategy of decomposing and bounding the bias and variance terms in our proof. We conduct simulations to illustrate the merits of our model averaging procedure over several existing methods, including the lasso and group lasso methods, the Akaike and Bayesian information criterion model-averaging methods and some other state-of-the-art regularization methods.
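
To make the idea of leave-one-out cross-validation weight selection with relaxed weights concrete, the following is a minimal sketch and not the authors' procedure. It covers only the Gaussian linear special case (identity link) rather than a general generalized linear model. It assumes that each candidate model uses a single predictor with no intercept, that "relaxed" means each weight lies in [0, 1] without the weights being required to sum to one, and that the criterion is the leave-one-out squared prediction error. The function names, the simulated data, and the coordinate-descent solver are all hypothetical choices made for illustration.

```python
import random


def loo_predictions(x, y):
    """Leave-one-out fitted values for a no-intercept regression of y on x.

    For each observation i, the slope is estimated without (x_i, y_i) and
    then used to predict y_i. (Assumed candidate-model form, for illustration.)
    """
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    preds = []
    for xi, yi in zip(x, y):
        denom = sxx - xi * xi
        beta = (sxy - xi * yi) / denom if denom > 0 else 0.0
        preds.append(beta * xi)
    return preds


def cv_loss(weights, loo, y):
    """Mean squared leave-one-out error of the weighted average prediction."""
    total = 0.0
    for i in range(len(y)):
        fit = sum(w * p[i] for w, p in zip(weights, loo))
        total += (y[i] - fit) ** 2
    return total / len(y)


def relaxed_weights(loo, y, sweeps=200):
    """Minimize the CV loss over the box [0, 1]^M by coordinate descent.

    The weights are NOT constrained to sum to one (the assumed relaxation).
    """
    w = [0.0] * len(loo)
    resid = list(y)  # y minus the current weighted average, starts at y
    for _ in range(sweeps):
        for k, pk in enumerate(loo):
            norm = sum(v * v for v in pk)
            if norm == 0.0:
                continue
            # Exact one-dimensional minimizer for weight k, clipped to [0, 1].
            step = sum(r * v for r, v in zip(resid, pk)) / norm
            new = min(1.0, max(0.0, w[k] + step))
            delta = new - w[k]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, pk)]
                w[k] = new
    return w


# Simulated data (hypothetical): 8 predictors, the first 4 carry weak signals.
rng = random.Random(0)
n, m = 60, 8
X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]  # X[k] is predictor k
y = [sum(0.3 * X[k][i] for k in range(4)) + rng.gauss(0, 1) for i in range(n)]

loo = [loo_predictions(X[k], y) for k in range(m)]
w = relaxed_weights(loo, y)
print("weights:", [round(v, 3) for v in w])
print("CV loss:", round(cv_loss(w, loo, y), 4))
```

Because the weights are confined to a box rather than the simplex, several weak-signal models can each receive a sizable weight at the same time. Extending this sketch to a non-Gaussian generalized linear model would require replacing the squared-error criterion and the closed-form leave-one-out fit, which the sketch does not attempt.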

Citation

Tomohiro Ando, Ker-chau Li. "A weight-relaxed model averaging approach for high-dimensional generalized linear models." Ann. Statist. 45 (6) 2654-2679, December 2017. https://doi.org/10.1214/17-AOS1538

Information

Received: 1 December 2015; Revised: 1 September 2016; Published: December 2017
First available in Project Euclid: 15 December 2017

zbMATH: 06838146
MathSciNet: MR3737905
Digital Object Identifier: 10.1214/17-AOS1538

Subjects:
Primary: 62J12
Secondary: 62F99

Keywords: asymptotic optimality, high-dimensional regression models, model averaging, model misspecification

Rights: Copyright © 2017 Institute of Mathematical Statistics

Vol. 45 • No. 6 • December 2017