Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process

Hannes Leeb

doi:10.3150/08-BEJ127

Bernoulli 14(3): 661-690 (August 2008). DOI: 10.3150/08-BEJ127

Abstract

In regression with random design, we study the problem of selecting a model that performs well for out-of-sample prediction. We do not assume that any of the candidate models under consideration are correct. Our analysis is based on explicit finite-sample results. Our main findings differ from those of other analyses that are based on traditional large-sample limit approximations because we consider a situation where the sample size is small relative to the complexity of the data-generating process, in the sense that the number of parameters in a ‘good’ model is of the same order as sample size. Also, we allow for the case where the number of candidate models is (much) larger than sample size.
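To make the setting concrete, the following is a minimal sketch, not the paper's procedure, of scoring nested linear submodels by generalized cross validation and an S_p-type criterion when the number of regressors exceeds the sample size and no candidate model is exactly correct. The formulas used (GCV as RSS/n divided by (1 - p/n)^2, and S_p as RSS/((n - p)(n - p - 1))) are the commonly cited forms and are an assumption here. The simulated design, the coefficient decay, and the helper names `fit_rss`, `gcv`, `s_p`, and `select_nested` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_rss(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def gcv(rss, n, p):
    """Generalized cross validation score for a linear model with p parameters."""
    return (rss / n) / (1.0 - p / n) ** 2

def s_p(rss, n, p):
    """S_p-type criterion in its commonly cited form (assumed here); needs p < n - 1."""
    return rss / ((n - p) * (n - p - 1))

def select_nested(X, y, max_p):
    """Score the nested models using the first p columns, p = 1..max_p."""
    n = X.shape[0]
    scores = {}
    for p in range(1, max_p + 1):
        _, rss = fit_rss(X[:, :p], y)
        scores[p] = (gcv(rss, n, p), s_p(rss, n, p))
    best_gcv = min(scores, key=lambda p: scores[p][0])
    best_sp = min(scores, key=lambda p: scores[p][1])
    return best_gcv, best_sp, scores

# Hypothetical data: random Gaussian design with more regressors than observations,
# and slowly decaying coefficients so that every finite submodel is misspecified.
rng = np.random.default_rng(0)
n, d = 60, 200
X = rng.standard_normal((n, d))
beta_true = 1.0 / np.arange(1, d + 1)
y = X @ beta_true + rng.standard_normal(n)

best_gcv, best_sp, scores = select_nested(X, y, max_p=40)
print("model size chosen by GCV:", best_gcv)
print("model size chosen by S_p:", best_sp)
```

Both scores inflate the in-sample residual sum of squares by a factor that grows as p approaches n, which is the regime the abstract describes, where a good model has a number of parameters of the same order as the sample size.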

Citation

Hannes Leeb. "Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process." Bernoulli 14(3): 661-690, August 2008. https://doi.org/10.3150/08-BEJ127

Information

Published: August 2008

First available in Project Euclid: 25 August 2008

zbMATH: 1155.62029

MathSciNet: MR2537807

Digital Object Identifier: 10.3150/08-BEJ127

Keywords: generalized cross validation, large number of parameters and small sample size, model selection, nonparametric regression, out-of-sample prediction, S_p criterion
