The Annals of Statistics

Volume 45, Number 5 (2017), 2274-2298.

### Nonasymptotic analysis of semiparametric regression models with high-dimensional parametric coefficients

#### Abstract

We consider a two-step projection-based Lasso procedure for estimating a partially linear regression model in which the number of coefficients in the linear component can exceed the sample size and these coefficients belong to $l_{q}$-“balls” for $q\in[0,1]$. Our theoretical results regarding the properties of the estimators are nonasymptotic. In particular, we establish a new nonasymptotic “oracle” result: although the error of the nonparametric projection *per se* (with respect to the prediction norm) has the scaling $t_{n}$ in the first step, it contributes only a scaling $t_{n}^{2}$ to the $l_{2}$-error of the second-step estimator for the linear coefficients. This new “oracle” result holds for a large family of nonparametric least squares procedures and regularized nonparametric least squares procedures for the first-step estimation, and the driver behind it lies in the projection strategy. We specialize our analysis to the estimation of a semiparametric sample selection model and provide a simple method with theoretical guarantees for choosing the regularization parameter in practice.
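The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation; it assumes the standard partially linear form $y_i = x_i^{\top}\beta + g(z_i) + \varepsilon_i$ with a scalar $z_i$, uses a polynomial series least squares fit as one possible first-step projection, and solves the second-step Lasso by plain coordinate descent. All function names, the `degree` parameter, and the value of `lam` are hypothetical choices made for illustration.

```python
import numpy as np

def series_projection_residuals(v, z, degree=4):
    """Step 1 (assumed form): residuals of v after a least-squares
    projection onto a polynomial basis in the scalar covariate z.
    v may be a vector (n,) or a matrix (n, p); columns are handled jointly."""
    B = np.vander(z, degree + 1, increasing=True)   # n x (degree + 1) basis
    coef, *_ = np.linalg.lstsq(B, v, rcond=None)
    return v - B @ coef

def soft_threshold(a, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||_2^2 + lam * ||b||_1
    by cyclic coordinate descent (a generic solver, chosen for brevity)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()                      # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * (new_bj - b[j])
            b[j] = new_bj
    return b

def two_step_lasso(y, X, z, lam, degree=4):
    """Step 1: project y and every column of X on z and keep the residuals.
    Step 2: run the Lasso of the residualized y on the residualized X."""
    y_res = series_projection_residuals(y, z, degree)
    X_res = series_projection_residuals(X, z, degree)
    return lasso_cd(X_res, y_res, lam)
```

Calling `two_step_lasso(y, X, z, lam)` on data with more columns in `X` than observations returns a sparse coefficient vector for the linear part. The choice of `lam` here is left to the user; the data-driven rule for the regularization parameter mentioned in the abstract is not reproduced in this sketch.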

Received: October 2015

Revised: November 2016

Subjects

Primary: 62J02: General nonlinear regression

Secondary: 62N01: Censored data models; 62N02: Estimation; 62G08: Nonparametric regression; 62J12: Generalized linear models

Keywords

High-dimensional statistics; Lasso; nonasymptotic analysis; partially linear models; sample selection

