Open Access
December 2010 High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation
Noureddine El Karoui
Ann. Statist. 38(6): 3487-3566 (December 2010). DOI: 10.1214/10-AOS795


We first study the properties of solutions of quadratic programs with linear equality constraints whose parameters are estimated from data in the high-dimensional setting where p, the number of variables in the problem, is of the same order of magnitude as n, the number of observations used to estimate the parameters. The Markowitz problem in Finance is a subcase of our study. Assuming normality and independence of the observations we relate the efficient frontier computed empirically to the “true” efficient frontier. Our computations show that there is a separation of the errors induced by estimating the mean of the observations and estimating the covariance matrix. In particular, the price paid for estimating the covariance matrix is an underestimation of the variance by a factor roughly equal to 1−p/n. Therefore the risk of the optimal population solution is underestimated when we estimate it by solving a similar quadratic program with estimated parameters.

We also characterize the statistical behavior of linear functionals of the empirical optimal vector and show that they are biased estimators of the corresponding population quantities.

We investigate the robustness of our Gaussian results by extending the study to certain elliptical models and models where our n observations are correlated (in “time”). We show a lack of robustness of the Gaussian results, but are still able to get results concerning first order properties of the quantities of interest, even in the case of relatively heavy-tailed data (we require two moments). Risk underestimation is still present in the elliptical case and more pronounced than in the Gaussian case.

We discuss properties of the nonparametric and parametric bootstrap in this context. We show several results, including the interesting fact that standard applications of the bootstrap generally yield inconsistent estimates of bias.

We propose some strategies to correct these problems and practically validate them in some simulations. Throughout this paper, we will assume that p, n and np tend to infinity, and p<n.

Finally, we extend our study to the case of problems with more general linear constraints, including, in particular, inequality constraints.


Download Citation

Noureddine El Karoui. "High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation." Ann. Statist. 38 (6) 3487 - 3566, December 2010.


Published: December 2010
First available in Project Euclid: 30 November 2010

zbMATH: 1274.62365
MathSciNet: MR2766860
Digital Object Identifier: 10.1214/10-AOS795

Primary: 62H10
Secondary: 90C20

Keywords: concentration of measure , Convex optimization , Covariance matrices , elliptical distributions , high-dimensional inference , Markowitz problem , multivariate statistical analysis , quadratic programs , Random matrix theory , Wishart matrices

Rights: Copyright © 2010 Institute of Mathematical Statistics

Vol.38 • No. 6 • December 2010
Back to Top