Electronic Journal of Statistics

Confidence regions and minimax rates in outlier-robust estimation on the probability simplex

Amir-Hossein Bateni and Arnak S. Dalalyan

Abstract

We consider the problem of estimating the mean of a distribution supported by the $k$-dimensional probability simplex in the setting where an $\varepsilon$ fraction of observations is subject to adversarial corruption. A simple example is the problem of estimating the distribution of a discrete random variable. Assuming that the discrete variable takes $k$ values, the unknown parameter $\boldsymbol{\theta}$ is a $k$-dimensional vector belonging to the probability simplex. We first describe various settings of contamination and discuss the relations between these settings. We then establish minimax rates when the quality of estimation is measured by the total-variation distance, the Hellinger distance, or the $\mathbb{L}^{2}$-distance between two probability measures. We also provide confidence regions for the unknown mean that shrink at the minimax rate. Our analysis reveals that the minimax rates associated with these three distances are all different, but they are all attained by the sample average. Furthermore, we show that the latter is adaptive to the possible sparsity of the unknown vector. Numerical experiments illustrating our theoretical findings are reported.
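To make the setting concrete, the following minimal Python sketch (not from the paper) simulates Huber-type $\varepsilon$-contamination, one of the contamination settings the abstract alludes to, for a discrete distribution on the simplex, and measures the error of the sample average, the estimator the abstract identifies as minimax-rate-optimal, in the three distances considered. The specific values of k, n, eps and the choice of the contamination distribution q are arbitrary assumptions made here for illustration.

    # Illustrative sketch only (assumptions flagged in the lead-in above),
    # not the authors' code or experimental setup.
    import numpy as np

    rng = np.random.default_rng(0)

    k, n, eps = 10, 5000, 0.1                 # dimension, sample size, outlier fraction
    theta = rng.dirichlet(np.ones(k))         # unknown mean on the probability simplex
    q = np.eye(k)[0]                          # contamination distribution (point mass; arbitrary)

    # Draw n observations; an eps fraction comes from the contamination q.
    outlier = rng.random(n) < eps
    clean = rng.choice(k, size=n, p=theta)
    bad = rng.choice(k, size=n, p=q)
    labels = np.where(outlier, bad, clean)

    # Sample average of the one-hot encoded observations: per the abstract,
    # this estimator attains the minimax rate for all three distances below.
    theta_hat = np.bincount(labels, minlength=k) / n

    tv = 0.5 * np.abs(theta_hat - theta).sum()                              # total variation
    hellinger = np.sqrt(0.5 * ((np.sqrt(theta_hat) - np.sqrt(theta))**2).sum())
    l2 = np.linalg.norm(theta_hat - theta)

    print(f"TV: {tv:.4f}  Hellinger: {hellinger:.4f}  L2: {l2:.4f}")

Running the sketch with eps = 0 versus eps = 0.1 shows the bias that the corrupted fraction introduces into the sample average, which is what the minimax rates quantify.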

Article information

Source
Electron. J. Statist., Volume 14, Number 2 (2020), 2653-2677.

Dates
Received: January 2020
First available in Project Euclid: 18 July 2020

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1595037616

Digital Object Identifier
doi:10.1214/20-EJS1731

Subjects
Primary: 62F35: Robustness and adaptive procedures
Secondary: 62H12: Estimation

Keywords
Robust estimation; discrete models; confidence regions

Rights
Creative Commons Attribution 4.0 International License.

Citation

Bateni, Amir-Hossein; Dalalyan, Arnak S. Confidence regions and minimax rates in outlier-robust estimation on the probability simplex. Electron. J. Statist. 14 (2020), no. 2, 2653–2677. doi:10.1214/20-EJS1731. https://projecteuclid.org/euclid.ejs/1595037616
