Estimating a high-dimensional sparse covariance matrix from a limited number of samples is a fundamental task in contemporary data analysis. Most proposals to date, however, are not robust to outliers or heavy tails. Toward bridging this gap, in this work we consider estimating a sparse shape matrix from $n$ samples following a possibly heavy-tailed elliptical distribution. We propose estimators based on thresholding either Tyler’s M-estimator or its regularized variant. We prove that in the joint limit as the dimension $p$ and the sample size $n$ tend to infinity with $p/n\to\gamma>0$, our estimators are minimax rate optimal. Results on simulated data support our theoretical analysis.
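The thresholding idea described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes centered samples with $n>p$, computes Tyler's M-estimator by the standard fixed-point iteration normalized to trace $p$, and then hard-thresholds the off-diagonal entries. The function names `tyler_m_estimator` and `hard_threshold`, the constant `c`, and the threshold level $c\sqrt{\log p/n}$ (borrowed from common practice in covariance thresholding) are assumptions made for illustration only.

```python
import numpy as np

def tyler_m_estimator(X, n_iter=100, tol=1e-8):
    """Fixed-point iteration for Tyler's M-estimator, normalized to trace p.

    X is an (n, p) array of centered samples; n > p is assumed.
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        # Quadratic forms x_i^T sigma^{-1} x_i, one per sample.
        d = np.einsum("ij,ij->i", X @ np.linalg.inv(sigma), X)
        # Weighted sum of outer products x_i x_i^T / d_i.
        new = (p / n) * (X / d[:, None]).T @ X
        new = (new + new.T) / 2      # remove floating-point asymmetry
        new *= p / np.trace(new)     # fix the scale: trace equals p
        converged = np.linalg.norm(new - sigma, "fro") < tol
        sigma = new
        if converged:
            break
    return sigma

def hard_threshold(S, t):
    """Zero every off-diagonal entry with |S_ij| <= t; keep the diagonal."""
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

# Hypothetical usage: heavy-tailed elliptical samples with identity shape.
rng = np.random.default_rng(0)
n, p = 200, 10
Z = rng.standard_normal((n, p))
X = Z * rng.standard_cauchy(n)[:, None]   # random per-sample radial scaling
c = 1.0                                   # assumed tuning constant
t = c * np.sqrt(np.log(p) / n)
sigma_hat = tyler_m_estimator(X)
sparse_hat = hard_threshold(sigma_hat, t)
```

Because each sample enters only through $x_i x_i^{\top}/(x_i^{\top}\Sigma^{-1}x_i)$, rescaling a sample does not change the estimate, which is why the Cauchy-distributed radial factor in the usage example does not destabilize the iteration.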