Open Access
Post-model-selection inference in linear regression models: An integrated review
Dongliang Zhang, Abbas Khalili, Masoud Asgharian
Statist. Surv. 16: 86-136 (2022). DOI: 10.1214/22-SS135

Abstract

Research on statistical inference after data-driven model selection can be traced as far back as Koopmans (1949). The intensive research on modern model selection methods for high-dimensional data over the past three decades has revived interest in statistical inference after model selection. In recent years, there has been a surge of articles on this topic, and a rather vast literature now exists. Our manuscript aims to present a holistic review of post-model-selection inference in linear regression models, while also incorporating perspectives from high-dimensional inference in these models. We first give a simulated example motivating the need for valid statistical inference after model selection. We then provide theoretical insights explaining the phenomena observed in the example. This is done through a literature survey on the post-selection sampling distribution of regression parameter estimators and the properties of coverage probabilities of naïve confidence intervals. We then review recent uncertainty assessment methods, categorized according to two types of estimation targets, namely the population- and projection-based regression coefficients. We also discuss possible pros and cons of the confidence intervals constructed by different methods.
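The undercoverage of naïve confidence intervals mentioned above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulated example: the function name `naive_coverage`, the sample sizes, the all-zero coefficient vector, the known unit noise variance, and the "pick the largest |z|-statistic" selection rule are all assumptions chosen only to keep the illustration short.

```python
import numpy as np

def naive_coverage(n=50, p=10, n_rep=2000, seed=0):
    """Monte Carlo coverage of a naive 95% interval after selection.

    Assumed setup (illustrative only): y is pure N(0, 1) noise, so every
    regression coefficient is exactly zero, and the noise variance is
    treated as known. Each replicate fits p no-intercept simple
    regressions, selects the predictor with the largest |z|-statistic,
    and reports the textbook interval for it as if it had been chosen
    before seeing the data.
    """
    rng = np.random.default_rng(seed)
    z = 1.959963984540054  # 97.5% quantile of the standard normal
    hit_selected = 0       # naive interval for the selected predictor
    hit_fixed = 0          # same interval for a predictor fixed in advance
    for _ in range(n_rep):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)           # true coefficients are all 0
        col_ss = np.sum(X ** 2, axis=0)
        beta_hat = X.T @ y / col_ss          # marginal least-squares slopes
        se = 1.0 / np.sqrt(col_ss)           # standard errors with sigma = 1
        j = int(np.argmax(np.abs(beta_hat / se)))  # data-driven selection
        hit_selected += int(abs(beta_hat[j]) <= z * se[j])
        hit_fixed += int(abs(beta_hat[0]) <= z * se[0])
    return hit_selected / n_rep, hit_fixed / n_rep

if __name__ == "__main__":
    selected, fixed = naive_coverage()
    print(f"coverage after selection: {selected:.3f}")
    print(f"coverage for a pre-specified predictor: {fixed:.3f}")
```

Under these assumptions the interval for a pre-specified predictor should cover zero at roughly the nominal 95% rate, while the interval for the data-selected predictor should cover noticeably less often, because selection favours estimates that happen to lie far from zero.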

Funding Statement

Abbas Khalili is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2020-05011), and Masoud Asgharian is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2018-05618).

Acknowledgments

The authors would like to thank the co-editor Professor Richard Lockhart, an associate editor, and three referees for their thoughtful and constructive comments. This work is based on the master's thesis of Dongliang Zhang, written in the Department of Mathematics and Statistics at McGill University. Dongliang Zhang also thanks Professors Martin Lindquist and Mei-Cheng Wang, his PhD advisors at Johns Hopkins University, for their support during the completion of this work.

Citation

Dongliang Zhang, Abbas Khalili, Masoud Asgharian. "Post-model-selection inference in linear regression models: An integrated review." Statist. Surv. 16: 86-136, 2022. https://doi.org/10.1214/22-SS135

Information

Received: 1 April 2021; Published: 2022
First available in Project Euclid: 7 March 2022

Digital Object Identifier: 10.1214/22-SS135

Subjects:
Primary: 62F25
Secondary: 62J07

Keywords: high-dimensional linear models, model selection, population- and projection-based regression coefficients, post-selection inference

JOURNAL ARTICLE
51 PAGES

