The Annals of Statistics

On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations

Ningning Xia and Xinghua Zheng


In practice, observations are often contaminated by noise, making the resulting sample covariance matrix a signal-plus-noise sample covariance matrix. Aiming to make inferences about the spectral distribution of the population covariance matrix under such a situation, we establish an asymptotic relationship that describes how the limiting spectral distribution of (signal) sample covariance matrices depends on that of signal-plus-noise-type sample covariance matrices. As an application, we consider inferences about the spectral distribution of integrated covolatility (ICV) matrices of high-dimensional diffusion processes based on high-frequency data with microstructure noise. The (slightly modified) pre-averaging estimator is a signal-plus-noise sample covariance matrix, and the aforementioned result, together with a (generalized) connection between the spectral distribution of signal sample covariance matrices and that of the population covariance matrix, enables us to propose a two-step procedure to consistently estimate the spectral distribution of ICV for a class of diffusion processes. An alternative approach is further proposed, which possesses several desirable properties: it is more robust, it eliminates the effects of microstructure noise, and the asymptotic relationship that enables consistent estimation of the spectral distribution of ICV is the standard Marčenko–Pastur equation. The performance of the two approaches is examined via simulation studies under both synchronous and asynchronous observation settings.
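The effect of additive noise on the spectrum of a sample covariance matrix can be illustrated with a toy simulation. The sketch below is not the procedure proposed in the article. It assumes i.i.d. standard Gaussian signal vectors with identity population covariance and independent Gaussian noise, rather than diffusion processes observed at high frequency. The names `sample_cov`, `spectral_moments`, `p`, `n`, and `noise_sd` are hypothetical. It compares the first two moments of the empirical spectral distribution with and without noise. In this setting the Marčenko–Pastur law with ratio y = p/n has mean 1 and second moment 1 + y, and noise of variance σ² moves the mean to about 1 + σ².

```python
import random


def sample_cov(rows):
    """Return (1/n) * sum_i x_i x_i^T for n observation vectors (mean assumed zero)."""
    n, p = len(rows), len(rows[0])
    s = [[0.0] * p for _ in range(p)]
    for x in rows:
        for j in range(p):
            xj, sj = x[j], s[j]
            for k in range(p):
                sj[k] += xj * x[k]
    return [[v / n for v in row] for row in s]


def spectral_moments(s):
    """First two moments of the empirical spectral distribution of a symmetric matrix."""
    p = len(s)
    m1 = sum(s[j][j] for j in range(p)) / p        # (1/p) tr(S): mean eigenvalue
    m2 = sum(v * v for row in s for v in row) / p  # (1/p) tr(S^2): mean squared eigenvalue
    return m1, m2


rng = random.Random(0)   # fixed seed so the run is reproducible
p, n = 40, 160           # assumed dimensions; ratio y = p / n = 0.25
noise_sd = 0.5           # assumed noise standard deviation (variance 0.25)

# Signal: n i.i.d. vectors with identity population covariance.
signal = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
# Observations: signal plus independent additive noise.
noisy = [[x + rng.gauss(0.0, noise_sd) for x in row] for row in signal]

m1_signal, m2_signal = spectral_moments(sample_cov(signal))
m1_noisy, m2_noisy = spectral_moments(sample_cov(noisy))

print("signal only: mean eigenvalue %.3f, mean squared eigenvalue %.3f" % (m1_signal, m2_signal))
print("signal+noise: mean eigenvalue %.3f, mean squared eigenvalue %.3f" % (m1_noisy, m2_noisy))
```

Even in this simplified setting the spectrum of the noisy sample covariance matrix is shifted away from that of the signal sample covariance matrix, which is why a relationship between the two limiting spectral distributions is needed before inferring the population spectrum.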

Article information

Ann. Statist., Volume 46, Number 2 (2018), 500-525.

Received: August 2015
Revised: January 2017
First available in Project Euclid: 3 April 2018

Primary: 62H12: Estimation
Secondary: 62G99: None of the above, but in this section; 60F15: Strong theorems

Keywords: High-dimension; high-frequency; integrated covariance matrices; Marčenko–Pastur equation; microstructure noise


Xia, Ningning; Zheng, Xinghua. On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations. Ann. Statist. 46 (2018), no. 2, 500–525. doi:10.1214/17-AOS1558.


  • Aït-Sahalia, Y., Fan, J. and Li, Y. (2013). The leverage effect puzzle: Disentangling sources of bias at high frequency. J. Financ. Econ. 109 224–249.
  • Aït-Sahalia, Y., Fan, J. and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. J. Amer. Statist. Assoc. 105 1504–1517.
  • Andersen, T. G. and Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. Internat. Econom. Rev. 39 885–905.
  • Andersen, T. G., Bollerslev, T., Diebold, F. X. and Labys, P. (2001). The distribution of realized exchange rate volatility. J. Amer. Statist. Assoc. 96 42–55.
  • Bai, Z., Chen, J. and Yao, J. (2010). On estimation of the population spectral distribution from a high-dimensional sample covariance matrix. Aust. N. Z. J. Stat. 52 423–437.
  • Bai, Z. and Silverstein, J. W. (2010). Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, New York.
  • Barndorff-Nielsen, O. E. and Shephard, N. (2002). Econometric analysis of realized volatility and its use in estimating stochastic volatility models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 253–280.
  • Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A. and Shephard, N. (2008). Designing realized kernels to measure the ex post variation of equity prices in the presence of noise. Econometrica 76 1481–1536.
  • Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A. and Shephard, N. (2011). Multivariate realised kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. J. Econometrics 162 149–169.
  • Christensen, K., Kinnebrock, S. and Podolskij, M. (2010). Pre-averaging estimators of the ex-post covariance matrix in noisy diffusion models with non-synchronous data. J. Econometrics 159 116–133.
  • Dozier, R. B. and Silverstein, J. W. (2007a). On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices. J. Multivariate Anal. 98 678–694.
  • El Karoui, N. (2008). Spectrum estimation for large dimensional covariance matrices using random matrix theory. Ann. Statist. 36 2757–2790.
  • El Karoui, N. (2009). Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond. Ann. Appl. Probab. 19 2362–2405.
  • El Karoui, N. (2010a). High-dimensionality effects in the Markowitz problem and other quadratic programs with linear constraints: Risk underestimation. Ann. Statist. 38 3487–3566.
  • El Karoui, N. (2010b). On information plus noise kernel random matrices. Ann. Statist. 38 3191–3216.
  • El Karoui, N. (2013). On the realized risk of high-dimensional Markowitz portfolios. SIAM J. Financial Math. 4 737–783.
  • Gloter, A. and Jacod, J. (2001). Diffusions with measurement errors. II. Optimal estimators. ESAIM Probab. Stat. 5 243–260.
  • Hachem, W., Loubaton, P., Mestre, X., Najim, J. and Vallet, P. (2012). Large information plus noise random matrix models and consistent subspace estimation in large sensor networks. Random Matrices Theory Appl. 1 1150006, 51.
  • Hansen, P. R. and Lunde, A. (2006). Realized variance and market microstructure noise. J. Bus. Econom. Statist. 24 127–218.
  • Jacod, J., Li, Y. and Zheng, X. (2017). Statistical properties of microstructure noise. Econometrica. To appear. Available at SSRN,
  • Jacod, J. and Protter, P. (1998). Asymptotic error distributions for the Euler method for stochastic differential equations. Ann. Probab. 26 267–307.
  • Jacod, J., Li, Y., Mykland, P. A., Podolskij, M. and Vetter, M. (2009). Microstructure noise in the continuous case: The pre-averaging approach. Stochastic Process. Appl. 119 2249–2276.
  • Ledoit, O. and Wolf, M. (2015). Spectrum estimation: A unified framework for covariance matrix estimation and PCA in large dimensions. J. Multivariate Anal. 139 360–384.
  • Liu, L. Y., Patton, A. J. and Sheppard, K. (2015). Does anything beat 5-minute RV? A comparison of realized measures across multiple asset classes. J. Econometrics 187 293–311.
  • Marčenko, V. A. and Pastur, L. A. (1967). Distribution of eigenvalues in certain sets of random matrices. Mat. Sb. 72 507–536.
  • McNeil, A. J., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton Univ. Press, Princeton, NJ.
  • Mestre, X. (2008). Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates. IEEE Trans. Inform. Theory 54 5113–5129.
  • Mykland, P. A. and Zhang, L. (2006). ANOVA for diffusions and Itô processes. Ann. Statist. 34 1931–1963.
  • Podolskij, M. and Vetter, M. (2009). Estimation of volatility functionals in the simultaneous presence of microstructure noise and jumps. Bernoulli 15 634–658.
  • Ubukata, M. and Oya, K. (2009). Estimation and testing for dependence in market microstructure noise. J. Financ. Econom. 7 106–151.
  • Wang, C. D. and Mykland, P. A. (2014). The estimation of leverage effect with high-frequency data. J. Amer. Statist. Assoc. 109 197–215.
  • Xia, N. and Zheng, X. (2018). Supplement to “On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations.” DOI:10.1214/17-AOS1558SUPP.
  • Xiu, D. (2010). Quasi-maximum likelihood estimation of volatility with high frequency data. J. Econometrics 159 235–250.
  • Zhang, L. (2006). Efficient estimation of stochastic volatility using noisy observations: A multi-scale approach. Bernoulli 12 1019–1043.
  • Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise. J. Econometrics 160 33–47.
  • Zhang, L., Mykland, P. A. and Aït-Sahalia, Y. (2005). A tale of two time scales: Determining integrated volatility with noisy high-frequency data. J. Amer. Statist. Assoc. 100 1394–1411.
  • Zheng, X. and Li, Y. (2011). On the estimation of integrated covariance matrices of high dimensional diffusion processes. Ann. Statist. 39 3121–3151.

Supplemental materials

  • Supplement to “On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations”. Due to space constraints, the proofs of Theorems 2.1, 2.2 and 2.3 are given in the supplementary article Xia and Zheng (2018).