August 2021 Second-order Stein: SURE for SURE and other applications in high-dimensional inference
Pierre C. Bellec, Cun-Hui Zhang
Ann. Statist. 49(4): 1864-1903 (August 2021). DOI: 10.1214/20-AOS2005

Abstract

Stein’s formula states that a random variable of the form $z^\top f(z) - \operatorname{div} f(z)$ is mean-zero for all functions $f$ with integrable gradient. Here, $\operatorname{div} f$ is the divergence of the function $f$ and $z$ is a standard normal vector. This paper proposes a second-order Stein formula that characterizes the variance of such random variables for all functions $f(z)$ with square integrable gradient, and demonstrates the usefulness of this second-order Stein formula in various applications.
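As a rough numerical illustration of the two statements above, the sketch below simulates the Stein residual for a componentwise `tanh` map. The choice of `f`, the dimension, the sample size and the helper name `stein_residual` are all assumptions made for this example. The variance identity it checks, Var = E[||f(z)||^2 + trace((∇f(z))^2)], is our reading of the second-order formula and should be verified against the paper itself.

```python
import numpy as np

def stein_residual(f, div_f, z):
    """Return z^T f(z) - div f(z) for every row of z (one sample per row)."""
    return np.sum(z * f(z), axis=1) - div_f(z)

rng = np.random.default_rng(0)
n, reps = 5, 200_000                      # illustrative sizes, not from the paper
z = rng.standard_normal((reps, n))

# Hypothetical test function: f(z) = tanh(z) applied componentwise.
f = np.tanh
# Its Jacobian is diagonal with entries 1 - tanh(z_i)^2.
jac_diag = lambda z: 1.0 - np.tanh(z) ** 2
div_f = lambda z: np.sum(jac_diag(z), axis=1)

residual = stein_residual(f, div_f, z)

# First-order Stein: the residual should average to zero.
mean_residual = residual.mean()
standard_error = residual.std(ddof=1) / np.sqrt(reps)

# Second-order identity (as we read it): the variance should match
# E[ ||f(z)||^2 + trace((grad f(z))^2) ]; for a diagonal Jacobian the
# trace term is the sum of squared diagonal entries.
empirical_variance = residual.var(ddof=1)
second_order_rhs = np.mean(np.sum(f(z) ** 2 + jac_diag(z) ** 2, axis=1))

print(f"mean residual      = {mean_residual:.4f} (s.e. {standard_error:.4f})")
print(f"empirical variance = {empirical_variance:.4f}")
print(f"second-order RHS   = {second_order_rhs:.4f}")
```

The two printed variance figures should agree up to Monte Carlo error; a separable `f` is used only because its divergence and Jacobian are available in closed form.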

In the Gaussian sequence model, a remarkable consequence of Stein’s formula is Stein’s Unbiased Risk Estimate (SURE), an unbiased estimate of the mean squared risk for almost any given estimator $\hat{\mu}$ of the unknown mean vector. A first application of the second-order Stein formula is an Unbiased Risk Estimate for SURE itself (SURE for SURE): an unbiased estimate providing information about the squared distance between SURE and the squared estimation error of $\hat{\mu}$. SURE for SURE has a simple form as a function of the data and is applicable to all $\hat{\mu}$ with square integrable gradient, for example, the Lasso and the Elastic Net.
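To make SURE concrete, here is a minimal sketch for soft thresholding, which coincides with the Lasso when the design matrix is the identity. It uses the classical form SURE = ||y - μ̂||^2 + 2σ^2 div μ̂ - nσ^2 and compares it with the actual squared error over repeated draws. The mean vector, threshold and sample sizes are arbitrary choices for illustration, and the code does not implement the paper's SURE for SURE estimate.

```python
import numpy as np

def soft_threshold(y, lam):
    """Componentwise soft thresholding at level lam."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def sure_soft_threshold(y, lam, sigma=1.0):
    """Classical SURE for soft thresholding when y ~ N(mu, sigma^2 I)."""
    n = y.shape[-1]
    mu_hat = soft_threshold(y, lam)
    # The divergence of soft thresholding is the number of nonzero coordinates.
    divergence = np.count_nonzero(mu_hat, axis=-1)
    residual_sum_of_squares = np.sum((y - mu_hat) ** 2, axis=-1)
    return residual_sum_of_squares + 2 * sigma**2 * divergence - n * sigma**2

rng = np.random.default_rng(1)
n, reps, lam = 50, 20_000, 1.0            # illustrative values only
mu = np.concatenate([np.full(5, 3.0), np.zeros(n - 5)])  # hypothetical sparse mean
y = mu + rng.standard_normal((reps, n))   # one replication per row

sure = sure_soft_threshold(y, lam)
loss = np.sum((soft_threshold(y, lam) - mu) ** 2, axis=1)

gap = sure - loss                         # mean-zero if SURE is unbiased
gap_standard_error = gap.std(ddof=1) / np.sqrt(reps)
print(f"average SURE  = {sure.mean():.3f}")
print(f"average loss  = {loss.mean():.3f}")
print(f"average gap   = {gap.mean():.3f} (s.e. {gap_standard_error:.3f})")
print(f"mean squared gap = {np.mean(gap ** 2):.3f}")
```

The last printed quantity, the mean squared gap between SURE and the loss, is the kind of quantity that SURE for SURE is designed to estimate from the data alone; here it is computed by brute force using the true mean, which is unavailable in practice.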

In addition to SURE for SURE, the following statistical applications are developed: (1) upper bounds on the risk of SURE when the estimation target is the mean squared error; (2) confidence regions based on SURE and using the second-order Stein formula; (3) oracle inequalities satisfied by SURE-tuned estimates under a mild Lipschitz assumption; (4) an upper bound on the variance of the size of the model selected by the Lasso, and more generally an upper bound on the variance of the empirical degrees-of-freedom of convex penalized estimators; (5) explicit expressions of SURE for SURE for the Lasso and the Elastic Net; (6) in the linear model, a general semiparametric scheme to de-bias a differentiable initial estimator for the statistical inference of a low-dimensional projection of the unknown regression coefficient vector, with a characterization of the variance after debiasing; and (7) an accuracy analysis of a Gaussian Monte Carlo scheme to approximate the divergence of functions $f:\mathbb{R}^n \to \mathbb{R}^n$.
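Item (7) concerns approximating a divergence with Gaussian probes. The sketch below shows one generic estimator of this kind: it averages w^T (f(x + εw) - f(x)) / ε over standard normal directions w. This is an assumed, commonly used construction and not necessarily the exact scheme analyzed in the paper; the function name `mc_divergence`, the step size `eps` and the linear test map are all hypothetical.

```python
import numpy as np

def mc_divergence(f, x, num_probes=2000, eps=1e-4, rng=None):
    """Estimate div f(x) with Gaussian probes and forward finite differences.

    Relies on E[w^T J w] = trace(J) for w ~ N(0, I), where J is the Jacobian
    of f at x, and approximates J w by (f(x + eps * w) - f(x)) / eps.
    """
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    total = 0.0
    for _ in range(num_probes):
        w = rng.standard_normal(x.shape)
        total += w @ (f(x + eps * w) - fx) / eps
    return total / num_probes

# Hypothetical linear map f(x) = A x, whose divergence is trace(A) at every x.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, -1.0, 3.0],
              [1.0, 0.0, 0.5]])
f = lambda x: A @ x
x0 = np.array([0.3, -1.2, 2.0])

estimate = mc_divergence(f, x0, num_probes=20_000, rng=np.random.default_rng(2))
exact = np.trace(A)
print(f"Monte Carlo estimate = {estimate:.3f}, exact divergence = {exact:.3f}")
```

The estimator needs only evaluations of `f`, which is why probe-based schemes are attractive when the Jacobian is unavailable; its error shrinks at the usual Monte Carlo rate in the number of probes.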

Funding Statement

The first author was supported in part by NSF Grant DMS-1811976 and NSF CAREER award DMS-1945428. The second author was supported in part by NSF Grants DMS-1513378, IIS-1407939, DMS-1721495, IIS-1741390 and CCF-1934924.

Acknowledgments

The first author thanks Julien Reygner and Yair Shenfeld for pointing out useful references regarding Section 2.4.

Citation


Pierre C. Bellec, Cun-Hui Zhang. "Second-order Stein: SURE for SURE and other applications in high-dimensional inference." Ann. Statist. 49 (4) 1864-1903, August 2021. https://doi.org/10.1214/20-AOS2005

Information

Received: 1 July 2019; Revised: 1 July 2020; Published: August 2021
First available in Project Euclid: 29 September 2021

MathSciNet: MR4319234
zbMATH: 1486.62209
Digital Object Identifier: 10.1214/20-AOS2005

Subjects:
Primary: 62H10 , 62H12 , 62J07
Secondary: 62F35 , 62G15

Keywords: debiased estimation , Elastic net , Lasso , Model selection , regression , risk estimate , Stein’s formula , SURE , SURE for SURE , Variance estimate , variance of model size

Rights: Copyright © 2021 Institute of Mathematical Statistics

JOURNAL ARTICLE
40 PAGES


Vol.49 • No. 4 • August 2021