## Annals of Statistics

- Ann. Statist.
- Volume 45, Number 6 (2017), 2654-2679.

### A weight-relaxed model averaging approach for high-dimensional generalized linear models

#### Abstract

Model averaging has long been proposed as a powerful alternative to model selection in regression analysis. However, how well it performs in high-dimensional regression is still poorly understood. Recently, Ando and Li [*J. Amer. Statist. Assoc.* **109** (2014) 254–265] introduced a new method of model averaging that allows the number of predictors to increase as the sample size increases. One notable feature of Ando and Li’s method is the relaxation of the constraint on the total model weights, so that weak signals can be efficiently combined from high-dimensional linear models. It is natural to ask whether Ando and Li’s method and results can be extended to nonlinear models. Because all candidate models should be treated as working models, the existence of a theoretical target of the quasi-maximum likelihood estimator under model misspecification needs to be established first. In this paper, we consider generalized linear models as our candidate models. We establish a general result to show the existence of pseudo-true regression parameters under model misspecification. We derive proper conditions for the leave-one-out cross-validation weight selection to achieve asymptotic optimality. Technically, the pseudo-true target parameters of different working models are not linearly linked. To overcome the resulting difficulties, we employ a novel strategy of decomposing and bounding the bias and variance terms in our proof. We conduct simulations to illustrate the merits of our model averaging procedure over several existing methods, including the lasso and group lasso methods, the Akaike and Bayesian information criterion model-averaging methods and some other state-of-the-art regularization methods.
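To make the idea in the abstract concrete, the following is a minimal sketch of leave-one-out cross-validation weight selection for averaged generalized linear models. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the candidates are one-predictor logistic working models without intercept, the averaged quantity is the weighted sum of the candidates' linear predictors, each weight is restricted to [0, 1] with no sum-to-one constraint, and the weights are found by a simple coordinate-wise grid search. All function names and the simulated data are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def log1pexp(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def fit_logit_1d(x, y, iters=25):
    """Newton's method for a one-predictor logistic working model (no intercept)."""
    b = 0.0
    for _ in range(iters):
        p = [sigmoid(xi * b) for xi in x]
        grad = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        hess = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p)) + 1e-8
        b += max(-5.0, min(5.0, grad / hess))  # clamped step: a safeguard only
    return b


def loo_linear_predictors(X, y):
    """theta[i][k]: linear predictor of model k for case i, fitted without case i."""
    n, m = len(X), len(X[0])
    theta = [[0.0] * m for _ in range(n)]
    for k in range(m):
        col = [row[k] for row in X]
        for i in range(n):
            b = fit_logit_1d(col[:i] + col[i + 1:], y[:i] + y[i + 1:])
            theta[i][k] = col[i] * b
    return theta


def cv_loglik(w, theta, y):
    """Leave-one-out Bernoulli log-likelihood of the weighted linear predictor."""
    total = 0.0
    for th_i, yi in zip(theta, y):
        eta = sum(wk * t for wk, t in zip(w, th_i))
        total += yi * eta - log1pexp(eta)
    return total


def select_weights(theta, y, grid_size=21, sweeps=20):
    """Coordinate-wise grid search over [0, 1]^m; weights need not sum to one."""
    m = len(theta[0])
    grid = [g / (grid_size - 1) for g in range(grid_size)]
    w = [0.0] * m
    best = cv_loglik(w, theta, y)
    for _ in range(sweeps):
        improved = False
        for k in range(m):
            for g in grid:
                trial = w[:k] + [g] + w[k + 1:]
                val = cv_loglik(trial, theta, y)
                if val > best + 1e-12:  # accept only strict improvements
                    w, best, improved = trial, val, True
        if not improved:
            break
    return w, best


# Hypothetical simulated data: four predictors carrying weak to moderate signals.
rng = random.Random(0)
n, m = 40, 4
true_beta = [0.8, 0.6, 0.0, 0.4]
X = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
y = [1 if rng.random() < sigmoid(sum(b * x for b, x in zip(true_beta, row))) else 0
     for row in X]

theta = loo_linear_predictors(X, y)
weights, cv_value = select_weights(theta, y)
print("selected weights:", weights)
print("leave-one-out log-likelihood:", cv_value)
```

Because the weights are only boxed into [0, 1], several candidates can each receive a large weight, which is one way to read the abstract's remark about combining weak signals. The grid search is chosen for transparency; a constrained optimizer would be a natural replacement in practice.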

#### Article information

**Source**

Ann. Statist., Volume 45, Number 6 (2017), 2654-2679.

**Dates**

Received: December 2015

Revised: September 2016

First available in Project Euclid: 15 December 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1513328586

**Digital Object Identifier**

doi:10.1214/17-AOS1538

**Mathematical Reviews number (MathSciNet)**

MR3737905

**Zentralblatt MATH identifier**

06838146

**Subjects**

Primary: 62J12: Generalized linear models

Secondary: 62F99: None of the above, but in this section

**Keywords**

Asymptotic optimality; high-dimensional regression models; model averaging; model misspecification

#### Citation

Ando, Tomohiro; Li, Ker-chau. A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann. Statist. 45 (2017), no. 6, 2654–2679. doi:10.1214/17-AOS1538. https://projecteuclid.org/euclid.aos/1513328586

#### Supplemental materials

- Supplementary material. Due to space constraints, the proofs of claims (4.8) and (4.9), the proof of Lemma 3, and further simulation studies are relegated to the supplementary document. The supplementary document also contains Theorem 3 and Lemma 4. Digital Object Identifier: doi:10.1214/17-AOS1538SUPP