Statistical inference in sparse high-dimensional additive models

Karl Gregory; Enno Mammen; Martin Wahl

doi:10.1214/20-AOS2011

Abstract

In this paper, we discuss the estimation of a nonparametric component $f_{1}$ of a nonparametric additive model $Y=f_{1}(X_{1})+\cdots +f_{q}(X_{q})+\epsilon$. We allow the number $q$ of additive components to grow to infinity and we make sparsity assumptions about the number of nonzero additive components. We compare this estimation problem with that of estimating $f_{1}$ in the oracle model $Z=f_{1}(X_{1})+\epsilon$, for which the additive components $f_{2},\dots ,f_{q}$ are known. We construct a two-step presmoothing-and-resmoothing estimator of $f_{1}$ and state finite-sample bounds for the difference between our estimator and a corresponding smoothing estimator $\hat{f}_{1}^{\text{(oracle)}}$ in the oracle model. In an asymptotic setting, these bounds can be used to show asymptotic equivalence of our estimator and the oracle estimator; the paper thus shows that, asymptotically, under strong enough sparsity conditions, knowledge of $f_{2},\dots ,f_{q}$ has no effect on estimation accuracy. Our first step is to estimate $f_{1}$ with an undersmoothed estimator based on near-orthogonal projections with a group Lasso bias correction. In the second step, we construct pseudo responses $\hat{Y}$ by evaluating this undersmoothed estimator of $f_{1}$ at the design points and then apply the smoothing method of the oracle estimator $\hat{f}_{1}^{\text{(oracle)}}$ to the nonparametric regression problem with “responses” $\hat{Y}$ and covariates $X_{1}$. Our mathematical exposition centers primarily on establishing properties of the presmoothing estimator. We present simulation results demonstrating close-to-oracle performance of our estimator in practical applications.
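The two-step structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: it assumes a low-dimensional setting ($q=3$, one component identically zero) and replaces the near-orthogonal projection with group Lasso bias correction by an ordinary least-squares fit on a fine piecewise-constant basis, which plays the role of the undersmoothed first step. The Nadaraya-Watson smoother, the test functions, the sample size, the bin count and the bandwidth are all illustrative assumptions, as are the helper names `hist_basis` and `nw_smooth`.

```python
import numpy as np

def hist_basis(x, n_bins):
    """Indicator (piecewise-constant) basis for x in [0, 1]."""
    idx = np.minimum((x * n_bins).astype(int), n_bins - 1)
    basis = np.zeros((x.size, n_bins))
    basis[np.arange(x.size), idx] = 1.0
    return basis

def nw_smooth(x_eval, x, y, h):
    """Nadaraya-Watson smoother with a Gaussian kernel and bandwidth h."""
    weights = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return weights @ y / weights.sum(axis=1)

def f1(x):
    return np.sin(2 * np.pi * x)

def f2(x):
    # Centered so that it has mean zero under Uniform(0, 1).
    return (x - 0.5) ** 2 - 1.0 / 12.0

# Simulated additive model Y = f1(X1) + f2(X2) + f3(X3) + eps with f3 = 0.
rng = np.random.default_rng(0)
n, q, n_bins, h = 2000, 3, 50, 0.05
X = rng.uniform(size=(n, q))
eps = 0.5 * rng.standard_normal(n)
Y = f1(X[:, 0]) + f2(X[:, 1]) + eps

# Step 1 (presmoothing, simplified): undersmoothed additive fit by plain
# least squares on many bins. ASSUMPTION: this stands in for the paper's
# near-orthogonal projection with group Lasso bias correction.
B1 = hist_basis(X[:, 0], n_bins)
others = [hist_basis(X[:, j], n_bins) for j in range(1, q)]
others = [B - B.mean(axis=0) for B in others]  # center nuisance components
design = np.hstack([B1] + others)
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
Y_hat = B1 @ coef[:n_bins]  # pseudo responses at the design points

# Step 2 (resmoothing): apply the oracle's smoother to (X1, Y_hat).
grid = np.linspace(0.1, 0.9, 81)
f1_two_step = nw_smooth(grid, X[:, 0], Y_hat, h)

# Oracle estimator: same smoother applied to Z = f1(X1) + eps.
Z = Y - f2(X[:, 1])
f1_oracle = nw_smooth(grid, X[:, 0], Z, h)

gap_to_oracle = np.max(np.abs(f1_two_step - f1_oracle))
gap_to_truth = np.max(np.abs(f1_two_step - f1(grid)))
print(f"max |two-step - oracle| = {gap_to_oracle:.3f}")
print(f"max |two-step - truth|  = {gap_to_truth:.3f}")
```

In this toy setting the quantity of interest is `gap_to_oracle`, the sup-distance on the grid between the two-step estimate and the oracle estimate. The bin width (0.02) is chosen smaller than the resmoothing bandwidth (0.05) so that the first step is undersmoothed relative to the second, mirroring the role the abstract assigns to the presmoothing estimator.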

Funding Statement

Financial support by Deutsche Forschungsgemeinschaft (DFG) through the Research Training Group 1953 is gratefully acknowledged.

Citation


Karl Gregory, Enno Mammen, Martin Wahl. "Statistical inference in sparse high-dimensional additive models." Ann. Statist. 49 (3) 1514–1536, June 2021. https://doi.org/10.1214/20-AOS2011

Information

Received: 1 October 2019; Revised: 1 June 2020; Published: June 2021

First available in Project Euclid: 9 August 2021

MathSciNet: MR4302573

zbMATH: 1473.62135

Digital Object Identifier: 10.1214/20-AOS2011

Subjects:

Primary: 62G08

Secondary: 62G20

Keywords: Additive models, bias correction, Lasso, near-orthogonality, nonparametric curve estimation
