Penalized estimating equations for generalized linear models with multiple imputation

Yang Li; Haoyu Yang; Haochen Yu; Hanwen Huang; Ye Shen

doi:10.1214/22-AOAS1721

Abstract

Missing values among variables present a challenge for variable selection in the generalized linear model. Common strategies that delete observations with missing information may cause serious information loss. Multiple imputation has been widely used in recent years because it provides unbiased statistical results given a correctly specified imputation model and accounts for the uncertainty of the missing data. However, variable selection methods for the generalized linear model with multiply imputed data have not yet been studied widely. In this study, we introduce penalized estimating equations for generalized linear models with multiple imputation (PEE–MI), which incorporate the correlation of multiply imputed observations into the objective function. The theoretical performance of the proposed PEE–MI depends on the penalty function adopted. We use the adaptive least absolute shrinkage and selection operator (adaptive LASSO) as an illustrative example. Simulations show that PEE–MI outperforms the alternatives. The proposed method is shown to select clinically relevant variables when applied to a database of laboratory-diagnosed A/H7N9 patients in Zhejiang Province, China.
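
As a rough illustration of the adaptive LASSO step named in the abstract, the sketch below fits a weighted L1-penalized logistic regression to stacked multiply imputed data. It is not the PEE–MI estimator: it treats the imputed copies as independent, whereas the abstract states that PEE–MI incorporates the correlation among them. The simulated data, the crude normal-draw imputation, the tuning value `lam=0.05`, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the (weighted) L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_logistic(X, y, lam=0.0, weights=None, n_iter=2000):
    """Logistic regression with penalty lam * sum_j weights[j] * |beta_j|,
    fitted by proximal gradient descent (no intercept)."""
    n, p = X.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    # Step 1/L, where L = sigma_max(X)^2 / (4n) bounds the curvature of the mean loss.
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (mu - y) / n  # gradient of the mean negative log-likelihood
        beta = soft_threshold(beta - step * grad, step * lam * w)
    return beta

# Simulated data (assumption): 6 covariates, only the first two matter.
rng = np.random.default_rng(0)
n, p, M = 200, 6, 5
X_full = rng.normal(size=(n, p))
beta_true = np.array([1.5, -1.5, 0.0, 0.0, 0.0, 0.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X_full @ beta_true)))).astype(float)

# Make roughly 30% of the first covariate missing completely at random.
missing = rng.random(n) < 0.3
obs = X_full[~missing, 0]

# Crude placeholder imputation: M independent draws from a normal fitted to observed values.
imputed = []
for _ in range(M):
    X_m = X_full.copy()
    X_m[missing, 0] = rng.normal(obs.mean(), obs.std(), size=missing.sum())
    imputed.append(X_m)

# Stack the M completed data sets; this treats the copies as independent.
X_stack = np.vstack(imputed)
y_stack = np.tile(y, M)

# Adaptive LASSO: weights from an unpenalized initial fit, then a weighted L1 fit.
beta_init = fit_logistic(X_stack, y_stack)
adaptive_w = 1.0 / (np.abs(beta_init) + 1e-8)
beta_hat = fit_logistic(X_stack, y_stack, lam=0.05, weights=adaptive_w)
selected = np.flatnonzero(beta_hat)  # indices of covariates kept in the model
```

Here `selected` holds the indices of the covariates with nonzero coefficients. In practice the penalty level would be tuned (for example by cross-validation or an information criterion) rather than fixed, and a proper imputation model would replace the placeholder draws.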

Funding Statement

Dr. Li’s work is supported by the Natural Science Foundation of China (72271237) and the Platform of Public Health and Disease Control and Prevention, Major Innovation and Planning Interdisciplinary Platform for the “Double-First Class” Initiative, Renmin University of China.

Acknowledgments

The authors thank Ms. Lin Li for her productive discussion.

Citation

Yang Li, Haoyu Yang, Haochen Yu, Hanwen Huang, Ye Shen. "Penalized estimating equations for generalized linear models with multiple imputation." Ann. Appl. Stat. 17 (3) 2345-2363, September 2023. https://doi.org/10.1214/22-AOAS1721

Information

Received: 1 February 2022; Revised: 1 September 2022; Published: September 2023

First available in Project Euclid: 7 September 2023

MathSciNet: MR4637670

Digital Object Identifier: 10.1214/22-AOAS1721

Keywords: missing data, multiple imputation, penalized estimating equations, variable selection
