• Bernoulli
  • Volume 24, Number 4B (2018), 3657-3682.

Large volatility matrix estimation with factor-based diffusion model for high-frequency financial data

Donggyu Kim, Yi Liu, and Yazhen Wang


Large volatility matrices arise in many areas of financial practice, and estimating them from high-frequency financial data runs into the “curse of dimensionality”. A common remedy is to impose a sparsity assumption on the volatility matrix so that consistent estimators can be obtained. However, because of common factors, assets are highly correlated with each other, and assuming that volatility matrices are sparse is not reasonable in financial applications. This paper incorporates factor influence into the asset pricing model and investigates large volatility matrix estimation under the resulting factor price model combined with a sparsity assumption. We model asset prices as governed by common factors, with assets that have similar characteristics sharing the same association with the factors. We then impose a sparsity condition on the part of the volatility matrix that remains after accounting for the factor contribution. Under the proposed factor-based model and sparsity assumption, we develop an estimation scheme called “blocking and regularizing”. We study the asymptotic properties of the proposed estimator and examine its finite-sample performance in extensive numerical studies that support the theoretical results.
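The idea of a factor part plus a sparse remainder can be illustrated with a minimal sketch. This is not the paper's “blocking and regularizing” scheme, whose details the abstract does not give. It assumes noise-free, synchronously observed log prices, extracts the factor part by principal components, and hard-thresholds the remainder. The function names `realized_covariance` and `factor_plus_sparse`, and the parameters `n_factors` and `threshold`, are hypothetical.

```python
import numpy as np


def realized_covariance(log_prices):
    """Realized covariance: sum of outer products of log-return vectors.

    Assumes synchronous, noise-free observations (rows = times, columns = assets).
    """
    returns = np.diff(log_prices, axis=0)  # shape (n - 1, p)
    return returns.T @ returns             # shape (p, p)


def factor_plus_sparse(cov, n_factors, threshold):
    """Split cov into a rank-n_factors part and a hard-thresholded remainder.

    Illustrative only: the factor part is taken from the leading principal
    components, which is an assumption and not the paper's construction.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_factors]      # indices of the largest ones
    factor_part = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T
    residual = cov - factor_part
    # Hard thresholding: zero out small entries of the remainder.
    sparse_part = np.where(np.abs(residual) >= threshold, residual, 0.0)
    # Keep the diagonal (idiosyncratic variances) regardless of the threshold.
    np.fill_diagonal(sparse_part, np.diag(residual))
    return factor_part, sparse_part


# Toy data: 10 assets driven by one common factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
n_obs, n_assets = 1000, 10
loadings = rng.uniform(0.5, 1.5, size=n_assets)
factor_returns = rng.normal(scale=0.01, size=(n_obs, 1))
idio_returns = rng.normal(scale=0.005, size=(n_obs, n_assets))
log_prices = np.cumsum(factor_returns * loadings + idio_returns, axis=0)

cov = realized_covariance(log_prices)
factor_part, sparse_part = factor_plus_sparse(cov, n_factors=1, threshold=1e-3)
estimate = factor_part + sparse_part
print(estimate.shape)
```

In this toy setup the full matrix is dense because every asset loads on the common factor, while the remainder after removing the leading component is close to diagonal, which is the situation in which thresholding only the remainder is reasonable.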

Article information

Bernoulli, Volume 24, Number 4B (2018), 3657-3682.

Received: October 2016
Revised: July 2017
First available in Project Euclid: 18 April 2018

Keywords: adaptive threshold; diffusion; factor model; integrated volatility; kernel realized volatility; multiple-scale realized volatility; pre-averaging realized volatility; regularization; sparsity


Kim, Donggyu; Liu, Yi; Wang, Yazhen. Large volatility matrix estimation with factor-based diffusion model for high-frequency financial data. Bernoulli 24 (2018), no. 4B, 3657-3682. doi:10.3150/17-BEJ974.
