Stein’s formula states that a random variable of the form $z^\top f(z) - \operatorname{div} f(z)$ is mean-zero for all functions $f$ with integrable gradient. Here, $\operatorname{div} f$ is the divergence of the function $f$ and $z$ is a standard normal vector. This paper proposes a second-order Stein formula that characterizes the variance of such random variables for all functions $f(z)$ with square integrable gradient, and demonstrates the usefulness of this second-order Stein formula in various applications.
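As a rough numerical illustration of the two formulas, the sketch below draws standard normal vectors and evaluates $z^\top f(z) - \operatorname{div} f(z)$ for the coordinatewise map $f = \tanh$, chosen here only because its Jacobian is diagonal and easy to write down. The variance identity it checks, $\operatorname{Var}[z^\top f(z) - \operatorname{div} f(z)] = \mathbb{E}[\|f(z)\|^2 + \operatorname{tr}((\nabla f(z))^2)]$, is not spelled out in this abstract and should be read as an assumption about the form of the second-order formula. The function name `stein_check` and all parameter values are hypothetical.

```python
import numpy as np


def stein_check(n=5, num_samples=200_000, seed=0):
    """Monte Carlo check of the first- and second-order Stein identities.

    Uses f(z) = tanh(z) coordinatewise (an illustrative choice, not taken
    from the paper), so the Jacobian of f is diagonal.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((num_samples, n))

    f = np.tanh(z)              # f(z), one row per sample
    jac_diag = 1.0 - f**2       # diagonal of the Jacobian of tanh
    div_f = jac_diag.sum(axis=1)

    # The random variable z^T f(z) - div f(z), one value per sample.
    x = (z * f).sum(axis=1) - div_f

    # Assumed second-order right-hand side: ||f(z)||^2 + tr((grad f(z))^2).
    # For a diagonal Jacobian the trace is the sum of squared diagonal entries.
    rhs = (f**2).sum(axis=1) + (jac_diag**2).sum(axis=1)

    return x.mean(), x.var(), rhs.mean()


if __name__ == "__main__":
    mean_x, var_x, rhs_mean = stein_check()
    print("empirical mean (should be near 0):", mean_x)
    print("empirical variance:", var_x)
    print("Monte Carlo value of the assumed right-hand side:", rhs_mean)
```

The empirical mean should be close to zero, and the empirical variance should be close to the Monte Carlo value of the right-hand side, up to simulation error.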
In the Gaussian sequence model, a remarkable consequence of Stein’s formula is Stein’s Unbiased Risk Estimate (SURE), an unbiased estimate of the mean squared risk for almost any given estimator $\hat\mu$ of the unknown mean vector. A first application of the second-order Stein formula is an Unbiased Risk Estimate for SURE itself (SURE for SURE): an unbiased estimate providing information about the squared distance between SURE and the squared estimation error of $\hat\mu$. SURE for SURE has a simple form as a function of the data and is applicable to all $\hat\mu$ with square integrable gradient, for example, the Lasso and the Elastic Net.
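To make SURE concrete, here is a minimal sketch for the soft-thresholding estimator in the Gaussian sequence model with unit noise variance, where SURE takes the classical form $\|y - \hat\mu\|^2 + 2\operatorname{div}\hat\mu(y) - n$ and the divergence of soft-thresholding is the number of coordinates above the threshold. The mean vector, threshold, and function names are illustrative assumptions, and the sketch covers only SURE itself, not the paper's SURE for SURE.

```python
import numpy as np


def soft_threshold(y, lam):
    """Coordinatewise soft-thresholding at level lam."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


def sure_soft_threshold(y, lam):
    """SURE for soft-thresholding, unit noise variance, along the last axis."""
    n = y.shape[-1]
    mu_hat = soft_threshold(y, lam)
    # Divergence of soft-thresholding: number of coordinates with |y_i| > lam.
    div = np.count_nonzero(np.abs(y) > lam, axis=-1)
    return ((y - mu_hat) ** 2).sum(axis=-1) + 2.0 * div - n


def compare_sure_to_loss(n=50, reps=10_000, lam=1.0, seed=0):
    """Average SURE and average squared error over independent replications."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(n)
    mu[:5] = 3.0                      # hypothetical sparse mean vector
    y = mu + rng.standard_normal((reps, n))
    sure = sure_soft_threshold(y, lam)
    loss = ((soft_threshold(y, lam) - mu) ** 2).sum(axis=-1)
    return sure.mean(), loss.mean()


if __name__ == "__main__":
    avg_sure, avg_loss = compare_sure_to_loss()
    print("average SURE:", avg_sure)
    print("average squared error:", avg_loss)
```

Averaged over many replications the two quantities should roughly agree, which is the unbiasedness property. On a single data set SURE and the squared error can differ noticeably, and that gap is what SURE for SURE is meant to quantify.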
In addition to SURE for SURE, the following statistical applications are developed: (1) upper bounds on the risk of SURE when the estimation target is the mean squared error; (2) confidence regions based on SURE and using the second-order Stein formula; (3) oracle inequalities satisfied by SURE-tuned estimates under a mild Lipschitz assumption; (4) an upper bound on the variance of the size of the model selected by the Lasso, and more generally an upper bound on the variance of the empirical degrees-of-freedom of convex penalized estimators; (5) explicit expressions of SURE for SURE for the Lasso and the Elastic Net; (6) in the linear model, a general semiparametric scheme to de-bias a differentiable initial estimator for the statistical inference of a low-dimensional projection of the unknown regression coefficient vector, with a characterization of the variance after debiasing; and (7) an accuracy analysis of a Gaussian Monte Carlo scheme to approximate the divergence of functions $f:\mathbb{R}^n \to \mathbb{R}^n$.
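For item (7), one common way to approximate a divergence with Gaussian draws is a randomized finite difference: average $w^\top\{f(y + a w) - f(y)\}/a$ over independent standard normal directions $w$ and a small step $a$. The sketch below implements that generic estimator; it is not claimed to be the exact scheme analyzed in the paper, and the function name `mc_divergence`, the step size, and the number of draws are hypothetical choices.

```python
import numpy as np


def mc_divergence(f, y, num_draws=5000, step=1e-3, seed=0):
    """Approximate div f(y) by Gaussian finite-difference probing.

    f maps a 1-D array of length n to a 1-D array of length n.
    The estimate is the average of w^T (f(y + step*w) - f(y)) / step
    over standard normal directions w.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    base = f(y)                       # f(y), reused for every direction
    total = 0.0
    for _ in range(num_draws):
        w = rng.standard_normal(y.size)
        total += w @ (f(y + step * w) - base)
    return total / (num_draws * step)


if __name__ == "__main__":
    y = np.linspace(-2.0, 2.0, 20)
    estimate = mc_divergence(np.tanh, y)
    exact = np.sum(1.0 - np.tanh(y) ** 2)   # closed-form divergence of tanh
    print("Monte Carlo divergence:", estimate)
    print("exact divergence:", exact)
```

The estimator needs only evaluations of $f$, which is convenient when the Jacobian is unavailable in closed form. Its accuracy depends on both the step size and the number of draws.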
The first author was supported in part by NSF Grant DMS-1811976 and NSF CAREER award DMS-1945428. The second author was supported in part by NSF Grants DMS-1513378, IIS-1407939, DMS-1721495, IIS-1741390 and CCF-1934924.
The first author thanks Julien Reygner and Yair Shenfeld for pointing out useful references regarding Section 2.4.
"Second-order Stein: SURE for SURE and other applications in high-dimensional inference." Ann. Statist. 49 (4) 1864 - 1903, August 2021. https://doi.org/10.1214/20-AOS2005