The Annals of Statistics

Vast volatility matrix estimation for high-frequency financial data

Yazhen Wang and Jian Zou



High-frequency data observed on the prices of financial assets are commonly modeled by diffusion processes with micro-structure noise, and realized volatility-based methods are often used to estimate integrated volatility. For problems involving a large number of assets, the objects to be estimated are volatility matrices of large size. The existing volatility estimators work well for a small number of assets but perform poorly when the number of assets is very large. In fact, they are inconsistent when both the number, p, of assets and the average sample size, n, of the price data on the p assets go to infinity. This paper proposes a new type of estimator for the integrated volatility matrix and establishes asymptotic theory for the proposed estimators in a framework that allows both n and p to go to infinity. The theory shows that the proposed estimators achieve high convergence rates under a sparsity assumption on the integrated volatility matrix. Numerical studies demonstrate that the proposed estimators perform well for large p and for complex price and volatility models. The proposed method is applied to real high-frequency financial data.
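To make the role of sparsity concrete, the following is a minimal sketch of entrywise hard thresholding applied to a plain realized covariance matrix. It is an illustration only and is not the estimator proposed in the paper: it assumes synchronous, noise-free log prices, and the function names `realized_covariance` and `hard_threshold`, the toy prices, and the threshold level are all hypothetical choices made for this example.

```python
def realized_covariance(log_prices):
    """Sum of outer products of log returns.

    log_prices: list of p lists, each holding n + 1 synchronous log prices.
    Assumes no micro-structure noise (an idealization for this sketch).
    """
    returns = [[s[k + 1] - s[k] for k in range(len(s) - 1)] for s in log_prices]
    p = len(returns)
    return [
        [sum(a * b for a, b in zip(returns[i], returns[j])) for j in range(p)]
        for i in range(p)
    ]


def hard_threshold(matrix, level):
    """Keep entries with absolute value >= level; set all others to zero."""
    return [[x if abs(x) >= level else 0.0 for x in row] for row in matrix]


# Toy data: p = 3 assets, n = 3 returns each (hypothetical values).
log_prices = [
    [0.0, 0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.1],
    [0.0, 0.1, 0.2, 0.1],
]
rc = realized_covariance(log_prices)
sparse_rc = hard_threshold(rc, level=0.02)  # level chosen by hand here
```

In this toy example the weak cross term between the first and third assets falls below the level and is set to zero, while the larger entries are kept. In practice the threshold level would need to be tied to n and p rather than fixed by hand.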

Article information

Ann. Statist., Volume 38, Number 2 (2010), 943-978.

First available in Project Euclid: 19 February 2010


Primary: 62H12: Estimation
Secondary: 62G05: Estimation; 62M05: Markov processes: estimation; 62P20: Applications to economics [See also 91Bxx]

Keywords: Convergence rate; diffusion; integrated volatility; matrix norm; micro-structure noise; realized volatility; regularization; sparsity; threshold


Wang, Yazhen; Zou, Jian. Vast volatility matrix estimation for high-frequency financial data. Ann. Statist. 38 (2010), no. 2, 943–978. doi:10.1214/09-AOS730.


