Annals of Statistics

High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation

Noureddine El Karoui



We first study the properties of solutions of quadratic programs with linear equality constraints whose parameters are estimated from data in the high-dimensional setting where p, the number of variables in the problem, is of the same order of magnitude as n, the number of observations used to estimate the parameters. The Markowitz problem in Finance is a subcase of our study. Assuming normality and independence of the observations we relate the efficient frontier computed empirically to the “true” efficient frontier. Our computations show that there is a separation of the errors induced by estimating the mean of the observations and estimating the covariance matrix. In particular, the price paid for estimating the covariance matrix is an underestimation of the variance by a factor roughly equal to 1−p/n. Therefore the risk of the optimal population solution is underestimated when we estimate it by solving a similar quadratic program with estimated parameters.
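The 1−p/n factor can be checked numerically in the simplest special case. The sketch below is not the paper's general setting. It assumes a minimum-variance problem with the single constraint that the weights sum to one, a population covariance equal to the identity, and mean-zero Gaussian data, so only the covariance is estimated. The function names `risk_ratio` and `mean_risk_ratio` are hypothetical.

```python
import numpy as np

def risk_ratio(n, p, rng):
    """Plug-in optimal variance divided by the population optimal variance.

    Assumptions for this sketch: Sigma = I_p, i.i.d. Gaussian rows, and the
    quadratic program  min w' Sigma w  subject to  w' 1 = 1.
    """
    X = rng.standard_normal((n, p))        # n observations of p variables
    S = np.cov(X, rowvar=False)            # sample covariance matrix (p x p)
    ones = np.ones(p)
    # The optimal value of  min w' S w  s.t.  w' 1 = 1  is  1 / (1' S^{-1} 1).
    est_var = 1.0 / (ones @ np.linalg.solve(S, ones))
    true_var = 1.0 / p                     # same formula with Sigma = I_p
    return est_var / true_var

def mean_risk_ratio(n=400, p=200, reps=20, seed=0):
    """Average the ratio over independent replications."""
    rng = np.random.default_rng(seed)
    return float(np.mean([risk_ratio(n, p, rng) for _ in range(reps)]))

if __name__ == "__main__":
    n, p = 400, 200
    print("average ratio:", mean_risk_ratio(n, p))
    print("1 - p/n      :", 1 - p / n)
```

Under these assumptions the average ratio should be close to 1−p/n: the variance reported by the empirical program is markedly smaller than the variance of the population optimum.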

We also characterize the statistical behavior of linear functionals of the empirical optimal vector and show that they are biased estimators of the corresponding population quantities.

We investigate the robustness of our Gaussian results by extending the study to certain elliptical models and models where our n observations are correlated (in “time”). We show a lack of robustness of the Gaussian results, but are still able to get results concerning first order properties of the quantities of interest, even in the case of relatively heavy-tailed data (we require two moments). Risk underestimation is still present in the elliptical case and more pronounced than in the Gaussian case.

We discuss properties of the nonparametric and parametric bootstrap in this context. We show several results, including the interesting fact that standard applications of the bootstrap generally yield inconsistent estimates of bias.

We propose some strategies to correct these problems and practically validate them in some simulations. Throughout this paper, we will assume that p, n and n−p tend to infinity, and p<n.

Finally, we extend our study to the case of problems with more general linear constraints, including, in particular, inequality constraints.

Article information

Ann. Statist., Volume 38, Number 6 (2010), 3487-3566.

First available in Project Euclid: 30 November 2010


Primary: 62H10: Distribution of statistics
Secondary: 90C20: Quadratic programming

Covariance matrices; convex optimization; quadratic programs; multivariate statistical analysis; high-dimensional inference; concentration of measure; random matrix theory; Markowitz problem; Wishart matrices; elliptical distributions


El Karoui, Noureddine. High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation. Ann. Statist. 38 (2010), no. 6, 3487–3566. doi:10.1214/10-AOS795.


