Bernoulli

The Dantzig selector and sparsity oracle inequalities

Vladimir Koltchinskii

Source: Bernoulli Volume 15, Number 3 (2009), 799-828.

Abstract

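Let

$$Y_j = f_*(X_j) + \xi_j, \qquad j = 1, \dots, n,$$

where $X, X_1, \dots, X_n$ are i.i.d. random variables in a measurable space $(S,\mathcal{A})$ with distribution $\Pi$ and $\xi, \xi_1, \dots, \xi_n$ are i.i.d. random variables with ${\mathbb{E}}\xi=0$ independent of $(X_1, \dots, X_n)$. Given a dictionary $h_1, \dots, h_N : S \mapsto \mathbb{R}$, let $f_\lambda := \sum_{j=1}^N \lambda_j h_j$, $\lambda = (\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N$. Given $\varepsilon > 0$, define

$$\hat\Lambda_\varepsilon := \Bigl\{ \lambda \in \mathbb{R}^N : \max_{1 \le k \le N} \Bigl| n^{-1} \sum_{j=1}^n \bigl(f_\lambda(X_j) - Y_j\bigr) h_k(X_j) \Bigr| \le \varepsilon \Bigr\}$$

and

$$\hat\lambda := \hat\lambda^{\varepsilon} \in \operatorname{Argmin}_{\lambda \in \hat\Lambda_\varepsilon} \|\lambda\|_{\ell_1}.$$

In the case where $f_* := f_{\lambda^*}$, $\lambda^* \in \mathbb{R}^N$, Candes and Tao [Ann. Statist. 35 (2007) 2313–2351] suggested using $\hat\lambda$ as an estimator of $\lambda^*$. They called this estimator “the Dantzig selector”. We study the properties of $f_{\hat\lambda}$ as an estimator of $f_*$ for regression models with random design, extending some of the results of Candes and Tao (and providing alternative proofs of these results).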
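
As an illustration only, the constraint that defines the feasible set above can be checked numerically for a candidate coefficient vector. The sketch below is not taken from the paper. The function names (`max_abs_correlation`, `in_constraint_set`, `l1_norm`) and the toy dictionary and data are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch (not from the paper): evaluate the Dantzig constraint
#   max_k | n^{-1} sum_j (f_lam(X_j) - Y_j) h_k(X_j) | <= eps
# for a candidate coefficient vector lam. All names here are hypothetical.

def max_abs_correlation(lam, xs, ys, dictionary):
    """Largest absolute empirical correlation between residuals and dictionary."""
    n = len(xs)
    # Residuals f_lam(X_j) - Y_j, where f_lam = sum_i lam_i * h_i.
    residuals = [
        sum(l * h(x) for l, h in zip(lam, dictionary)) - y
        for x, y in zip(xs, ys)
    ]
    return max(
        abs(sum(r * h(x) for r, x in zip(residuals, xs)) / n)
        for h in dictionary
    )


def in_constraint_set(lam, xs, ys, dictionary, eps):
    """True if lam satisfies the constraint at level eps."""
    return max_abs_correlation(lam, xs, ys, dictionary) <= eps


def l1_norm(lam):
    """The l1 norm that the estimator minimizes over the constraint set."""
    return sum(abs(l) for l in lam)


# Toy setup (assumed): dictionary {1, x, x^2}, noiseless responses from 2x.
dictionary = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [2.0 * x for x in xs]

print(in_constraint_set([0.0, 2.0, 0.0], xs, ys, dictionary, eps=0.1))
print(in_constraint_set([0.0, 0.0, 0.0], xs, ys, dictionary, eps=0.1))
```

In this toy setup the sparse vector with a single nonzero coefficient fits the data exactly and so satisfies the constraint, while the zero vector does not. The constraint is linear in the coefficient vector and the objective is the ℓ1 norm, so the minimization itself can be recast as a linear program and passed to a general-purpose LP solver. That step is omitted here.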

Keywords: Dantzig selector; oracle inequalities; regression; sparsity

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.bj/1251463282
Digital Object Identifier: doi:10.3150/09-BEJ187

References

Bickel, P., Ritov, Y. and Tsybakov, A. (2009). Simultaneous analysis of LASSO and Dantzig selector. Ann. Statist. To appear.
Bobkov, S. and Houdré, C. (1997). Isoperimetric constants for product probability measures. Ann. Probab. 25 184–205.
Bunea, F., Tsybakov, A. and Wegkamp, M. (2007). Sparsity oracle inequalities for the LASSO. Electron. J. Statist. 1 169–194.
Candes, E., Romberg, J. and Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory 52 489–509.
Candes, E. and Tao, T. (2005). Decoding by linear programming. IEEE Trans. Inform. Theory 51 4203–4215.
Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist. 35 2313–2351.
de la Pena, V. and Giné, E. (1998). Decoupling: From Dependence to Independence. New York: Springer.
Donoho, D.L. (2006a). For most large underdetermined systems of linear equations the minimal 1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59 797–829.
Donoho, D.L. (2006b). For most large underdetermined systems of equations the minimal 1-norm near-solution approximates the sparsest near-solution. Commun. Pure Appl. Math. 59 907–934.
Koltchinskii, V. (2005). Model selection and aggregation in sparse classification problems. In Oberwolfach Reports: Meeting on Statistical and Probabilistic Methods of Model Selection, October 2005. European Mathematical Society Publishing House.
Koltchinskii, V. (2009). Sparsity in penalized empirical risk minimization. Ann. Inst. H. Poincaré Probab. Statist. 45 7–57.
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. New York: Springer.
Mendelson, S., Pajor, A. and Tomczak-Jaegermann, N. (2007). Reconstruction and subgaussian operators in asymptotic geometric analysis. Geom. Funct. Anal. 17 1248–1282.
Rudelson, M. and Vershynin, R. (2005). Geometric approach to error correcting codes and reconstruction of signals. Int. Math. Res. Not. 64 4019–4041.
van de Geer, S. (2008). High-dimensional generalized linear models and the Lasso. Ann. Statist. 36 614–645.
van der Vaart, A. and Wellner, J. (1996). Weak Convergence and Empirical Processes. New York: Springer.

© 2009 Bernoulli Society for Mathematical Statistics and Probability