Open Access
Bootstrapping and sample splitting for high-dimensional, assumption-lean inference
Alessandro Rinaldo, Larry Wasserman, Max G’Sell
Ann. Statist. 47(6): 3438-3469 (December 2019). DOI: 10.1214/18-AOS1784

Abstract

Several new methods have recently been proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and the rest for inference. In this paper, we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-lean approach to inference, and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension, which we then use to assess the accuracy of regression inference. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. Finally, we elucidate an inference-prediction trade-off: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions.
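The general idea of sample splitting followed by a bootstrap can be illustrated with a minimal Python sketch. It is not the paper's exact procedure. The function name `split_and_bootstrap`, the correlation-based selection rule, the pairs bootstrap, and the coordinate-wise percentile intervals are all assumptions chosen for illustration. The intervals are not simultaneous over the selected coefficients.

```python
import numpy as np


def split_and_bootstrap(X, y, k=2, n_boot=500, alpha=0.05, seed=0):
    """Illustrative sketch: select on one half, infer on the other half.

    Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    sel_idx, inf_idx = perm[: n // 2], perm[n // 2:]

    # Selection half. Assumed rule: keep the k columns whose absolute
    # sample correlation with y is largest.
    Xs, ys = X[sel_idx], y[sel_idx]
    Xc = Xs - Xs.mean(axis=0)
    yc = ys - ys.mean()
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.sort(np.argsort(scores)[-k:])

    # Inference half. Least squares on the selected columns, with an
    # intercept that is dropped from the returned coefficients.
    Xi = np.column_stack([np.ones(len(inf_idx)), X[inf_idx][:, selected]])
    yi = y[inf_idx]

    def ols(A, b):
        return np.linalg.lstsq(A, b, rcond=None)[0][1:]

    beta_hat = ols(Xi, yi)

    # Pairs bootstrap on the inference half only: resample rows with
    # replacement and refit. The selected model stays fixed.
    m = len(yi)
    boots = np.empty((n_boot, k))
    for b in range(n_boot):
        rows = rng.integers(0, m, size=m)
        boots[b] = ols(Xi[rows], yi[rows])

    # Coordinate-wise percentile intervals (not simultaneous).
    lower = np.quantile(boots, alpha / 2, axis=0)
    upper = np.quantile(boots, 1 - alpha / 2, axis=0)
    return selected, beta_hat, lower, upper
```

Calling `split_and_bootstrap(X, y, k=2)` on an `(n, p)` design matrix and a length-`n` response returns the selected column indices, the fitted coefficients, and the interval endpoints. Because the bootstrap only touches the held-out half, the model choice does not contaminate the intervals. The cost is that each stage sees only half of the data.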

Citation


Alessandro Rinaldo, Larry Wasserman, Max G’Sell. "Bootstrapping and sample splitting for high-dimensional, assumption-lean inference." Ann. Statist. 47(6): 3438-3469, December 2019. https://doi.org/10.1214/18-AOS1784

Information

Received: 1 April 2018; Revised: 1 November 2018; Published: December 2019
First available in Project Euclid: 31 October 2019

Digital Object Identifier: 10.1214/18-AOS1784

Subjects:
Primary: 62F35, 62F40
Secondary: 62G09, 62G20, 62J05

Keywords: assumption-lean, bootstrap, regression, sample splitting

Rights: Copyright © 2019 Institute of Mathematical Statistics
