## Electronic Journal of Statistics

### Multidimensional linear functional estimation in sparse Gaussian models and robust estimation of the mean

#### Abstract

We consider two problems of estimation in high-dimensional Gaussian models. The first is that of estimating a linear functional of the means of $n$ independent $p$-dimensional Gaussian vectors, under the assumption that at most $s$ of the means are nonzero. We show that, up to a logarithmic factor, the minimax rate of estimation in squared Euclidean norm lies between $(s^{2}\wedge n)+sp$ and $(s^{2}\wedge np)+sp$. Since the estimator attaining the upper bound is computationally demanding, we investigate suitable versions of group thresholding estimators that remain efficiently computable even when the dimension and the sample size are very large. An interesting new phenomenon revealed by this investigation is that group thresholding yields a substantial improvement in the rate compared to element-wise thresholding: the error of group thresholding is of order $s^{2}\sqrt{p}+sp$, whereas element-wise thresholding incurs an error of order $s^{2}p+sp$. To the best of our knowledge, this is the first known setting in which leveraging the group structure leads to a polynomial improvement in the rate.
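To fix ideas, the two thresholding schemes contrasted above can be sketched as follows. This is only an illustrative toy implementation in NumPy, not the estimators analyzed in the paper: the threshold levels `lam` and `tau` are generic placeholder choices (a universal log-level for entries, a $\sqrt{p}$-level for row norms), not the calibrated constants from the analysis.

```python
import numpy as np

def elementwise_threshold_sum(Y, lam):
    """Estimate the sum of the means by keeping only entries of Y
    exceeding lam in absolute value (element-wise thresholding)."""
    return np.where(np.abs(Y) > lam, Y, 0.0).sum(axis=0)

def group_threshold_sum(Y, tau):
    """Estimate the sum of the means by keeping only rows of Y whose
    Euclidean norm exceeds tau (group thresholding)."""
    keep = np.linalg.norm(Y, axis=1) > tau
    return Y[keep].sum(axis=0)

rng = np.random.default_rng(0)
n, p, s = 1000, 50, 10
theta = np.zeros((n, p))
theta[:s] = 3.0                              # only s of the n means are nonzero
Y = theta + rng.standard_normal((n, p))      # Gaussian observations

lam = np.sqrt(2 * np.log(n * p))             # generic universal entry-wise level
tau = np.sqrt(p) + np.sqrt(2 * np.log(n))    # noise-row norms concentrate near sqrt(p)

est_elem = elementwise_threshold_sum(Y, lam)
est_group = group_threshold_sum(Y, tau)
```

The point of the group version is that a vector with a moderately large mean can exceed the norm threshold even when every one of its coordinates stays below the entry-wise level, so whole informative rows survive that element-wise thresholding would mostly zero out.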

The second problem studied in this work is the estimation of the common $p$-dimensional mean of the inliers among $n$ independent Gaussian vectors. We show that there is a strong analogy between this problem and the first one. Exploiting this analogy, we propose new strategies of robust estimation that are computationally tractable and achieve better rates of convergence than the other computationally tractable estimators studied in the literature that are robust to the presence of outliers in the data. However, this tractability comes at the price of losing minimax-rate optimality in some regimes.
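The analogy can be made concrete with a toy two-step scheme: start from a crude robust pilot, flag as outliers the observations whose residual norm is large (the thresholding step from the first problem), and average the rest. This is a minimal sketch under assumed placeholder constants, not the estimator proposed in the paper; the coordinate-wise median pilot and the level `tau` are illustrative choices.

```python
import numpy as np

def robust_mean(Y, tau):
    """Toy two-step robust mean: coordinate-wise median pilot, then
    average over the points whose residual norm stays below tau
    (the presumed inliers)."""
    pilot = np.median(Y, axis=0)                  # robust but rate-suboptimal pilot
    resid = np.linalg.norm(Y - pilot, axis=1)     # group-wise residual norms
    inliers = resid <= tau
    return Y[inliers].mean(axis=0)

rng = np.random.default_rng(1)
n, p, s = 500, 30, 25
mu = np.ones(p)
Y = mu + rng.standard_normal((n, p))
Y[:s] += 10.0                                     # s contaminated rows shifted away
tau = np.sqrt(p) + 3 * np.sqrt(np.log(n))         # inlier residual norms concentrate near sqrt(p)

est = robust_mean(Y, tau)
```

Screening whole observations by their residual norm mirrors the group thresholding of the first problem, with outliers playing the role of the nonzero means.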

#### Article information

**Source**
Electron. J. Statist., Volume 13, Number 2 (2019), 2830–2864.

**Dates**
First available in Project Euclid: 29 August 2019

https://projecteuclid.org/euclid.ejs/1567065621

**Digital Object Identifier**
doi:10.1214/19-EJS1590

**Subjects**
Primary: 62J05: Linear regression
Secondary: 62G05: Estimation

#### Citation

Collier, Olivier; Dalalyan, Arnak S. Multidimensional linear functional estimation in sparse Gaussian models and robust estimation of the mean. Electron. J. Statist. 13 (2019), no. 2, 2830–2864. doi:10.1214/19-EJS1590. https://projecteuclid.org/euclid.ejs/1567065621
