## The Annals of Statistics

### Adaptive estimation of the sparsity in the Gaussian vector model

#### Abstract

Consider the Gaussian vector model with mean value $\theta$. We study the twin problems of estimating the number $\Vert \theta \Vert_{0}$ of nonzero components of $\theta$ and testing whether $\Vert \theta \Vert_{0}$ is smaller than some value. For testing, we establish the minimax separation distances for this model and introduce a minimax adaptive test. Extensions to the case of unknown variance are also discussed. Rewriting the estimation of $\Vert \theta \Vert_{0}$ as a multiple testing problem over all hypotheses $\{\Vert \theta \Vert_{0}\leq q\}$, we derive a new way of assessing the optimality of a sparsity estimator and exhibit an estimator that is optimal in this sense. This general approach provides a roadmap for estimating the complexity of the signal in various statistical models.
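For readers who want the setup spelled out, the following is a minimal formalization under assumed notation; the symbols $n$, $Y$, $\sigma$, $\varepsilon$, $H_{q}$, $T_{q}$ and $\hat{k}$ do not appear in the abstract and are introduced here only for illustration. The observations and the sparsity could be written as

$$
Y_{i} = \theta_{i} + \sigma \varepsilon_{i}, \qquad \varepsilon_{i} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1), \qquad i = 1,\dots,n, \qquad \Vert \theta \Vert_{0} = \bigl|\{ i : \theta_{i} \neq 0 \}\bigr|,
$$

and the hypotheses mentioned above form a nested family:

$$
H_{q} : \Vert \theta \Vert_{0} \leq q, \qquad q = 0, 1, \dots, n, \qquad H_{0} \subset H_{1} \subset \dots \subset H_{n}.
$$

One natural way to turn such a family of tests into an estimator, given here as a sketch and not necessarily the construction used in the paper, is to report the smallest sparsity level that is not rejected. If $T_{q} \in \{0,1\}$ denotes a hypothetical test of $H_{q}$, with $T_{q} = 1$ meaning rejection and with $H_{n}$ (which always holds) never rejected, this reads

$$
\hat{k} = \min \{ q \in \{0,\dots,n\} : T_{q} = 0 \}.
$$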

#### Article information

Source
Ann. Statist., Volume 47, Number 1 (2019), 93-126.

Dates
Revised: September 2017
First available in Project Euclid: 30 November 2018

https://projecteuclid.org/euclid.aos/1543568583

Digital Object Identifier
doi:10.1214/17-AOS1680

Mathematical Reviews number (MathSciNet)
MR3909928

Zentralblatt MATH identifier
07036196

Subjects
Primary: 62C20: Minimax procedures; 62G10: Hypothesis testing

#### Citation

Carpentier, Alexandra; Verzelen, Nicolas. Adaptive estimation of the sparsity in the Gaussian vector model. Ann. Statist. 47 (2019), no. 1, 93–126. doi:10.1214/17-AOS1680. https://projecteuclid.org/euclid.aos/1543568583

#### Supplemental materials

• Supplement to “Adaptive estimation of the sparsity in the Gaussian vector model”. Proofs of the results.