Electronic Journal of Statistics

Detection boundary in sparse regression

Yuri I. Ingster, Alexandre B. Tsybakov, and Nicolas Verzelen

Full-text: Open access

Abstract

We study the problem of detecting a p-dimensional sparse vector of parameters in the linear regression model with Gaussian noise. We establish the detection boundary, i.e., the necessary and sufficient conditions for successful detection as both the sample size n and the dimension p tend to infinity. Testing procedures that achieve this boundary are also exhibited. Our results encompass the high-dimensional setting (p ≫ n). The main message is that, under some conditions, the detection boundary phenomenon previously established for the Gaussian sequence model extends to high-dimensional linear regression. Finally, we establish the detection boundaries when the variance of the noise is unknown. Interestingly, the rate of the detection boundary in the high-dimensional setting with unknown variance can differ from the rate for the case of known variance.

Article information

Source
Electron. J. Statist. Volume 4 (2010), 1476-1526.

Dates
First available in Project Euclid: 22 December 2010

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1293028087

Digital Object Identifier
doi:10.1214/10-EJS589

Mathematical Reviews number (MathSciNet)
MR2747131

Zentralblatt MATH identifier
1329.62314

Subjects
Primary: 62J05: Linear regression
Secondary: 62G10: Hypothesis testing; 62H20: Measures of association (correlation, canonical correlation, etc.); 62G05: Estimation; 62G08: Nonparametric regression; 62C20: Minimax procedures; 62G20: Asymptotic properties

Keywords
High-dimensional regression; detection boundary; sparse vectors; sparsity; minimax hypothesis testing

Citation

Ingster, Yuri I.; Tsybakov, Alexandre B.; Verzelen, Nicolas. Detection boundary in sparse regression. Electron. J. Statist. 4 (2010), 1476-1526. doi:10.1214/10-EJS589. https://projecteuclid.org/euclid.ejs/1293028087


