## The Annals of Statistics

### Large covariance estimation through elliptical factor models

#### Abstract

We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. To better understand how POET works, we establish a set of high-level sufficient conditions under which the procedure achieves optimal rates of convergence under different matrix norms. This framework allows us to recover existing results for sub-Gaussian data in a more transparent way that depends only on the concentration properties of the sample covariance matrix. As a new theoretical contribution, the framework allows us, for the first time, to exploit the conditional sparsity covariance structure for heavy-tailed data. In particular, for the elliptical distribution, we propose a robust estimator based on the marginal and spatial Kendall’s tau that satisfies these conditions. In addition, we study the conditional graphical model under the same framework. The technical tools developed in this paper are of general interest to high-dimensional principal component analysis. Thorough numerical results are also provided to back up the developed theory.
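To make the POET idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It takes a pilot covariance estimate, keeps the leading principal components as the low-rank factor part, and soft-thresholds the off-diagonal entries of the remaining principal orthogonal complement. The function name `poet`, the parameters `n_factors` and `threshold`, the use of a single constant threshold, and the choice of the sample covariance as the pilot estimate are all assumptions made for illustration; the robust Kendall's tau based pilot estimator proposed in the paper is not reproduced here.

```python
import numpy as np


def poet(sigma_hat, n_factors, threshold):
    """Low-rank plus thresholded-residual estimate from a pilot covariance.

    Hypothetical helper for illustration: n_factors is assumed known and
    a single constant soft threshold is applied to every off-diagonal entry.
    """
    # Eigendecomposition of the symmetric pilot estimate, sorted descending.
    eigvals, eigvecs = np.linalg.eigh(sigma_hat)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # The leading n_factors components form the low-rank (factor) part.
    lead_vecs = eigvecs[:, :n_factors]
    low_rank = (lead_vecs * eigvals[:n_factors]) @ lead_vecs.T

    # Principal orthogonal complement: what remains after removing the factors.
    residual = sigma_hat - low_rank

    # Soft-threshold the off-diagonal entries; leave the diagonal unchanged.
    shrunk = np.sign(residual) * np.maximum(np.abs(residual) - threshold, 0.0)
    np.fill_diagonal(shrunk, np.diag(residual))

    return low_rank + shrunk


# Simulated data from a three-factor model (illustrative sizes only).
rng = np.random.default_rng(0)
n, p, k = 200, 30, 3
loadings = rng.normal(size=(p, k))
factors = rng.normal(size=(n, k))
noise = rng.normal(size=(n, p))
data = factors @ loadings.T + noise

# Assumption: the sample covariance serves as the pilot estimate here.
sample_cov = np.cov(data, rowvar=False)
estimate = poet(sample_cov, n_factors=k, threshold=0.1)
```

Because the diagonal of the residual is left untouched, the estimate keeps the marginal variances of the pilot matrix, and setting `threshold=0.0` returns the pilot matrix itself. In this sketch, a heavy-tailed variant would only swap the pilot matrix passed to `poet`, which mirrors the abstract's point that the analysis depends on the properties of the input estimator.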

#### Article information

Source
Ann. Statist., Volume 46, Number 4 (2018), 1383-1414.

Dates
Revised: January 2017
First available in Project Euclid: 27 June 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1530086420

Digital Object Identifier
doi:10.1214/17-AOS1588

Mathematical Reviews number (MathSciNet)
MR3819104

Zentralblatt MATH identifier
06936465

Subjects
Primary: 62H25: Factor analysis and principal components; correspondence analysis
Secondary: 62H12: Estimation

#### Citation

Fan, Jianqing; Liu, Han; Wang, Weichen. Large covariance estimation through elliptical factor models. Ann. Statist. 46 (2018), no. 4, 1383–1414. doi:10.1214/17-AOS1588. https://projecteuclid.org/euclid.aos/1530086420

#### Supplemental materials

• Technical proofs. This supplementary material contains all the remaining proofs and technical lemmas and the comparison of relative error norms.