A high-dimensional $r$-factor model for an $n$-dimensional vector time series is characterised by a large eigengap, increasing with $n$, between the $r$-th and the $(r+1)$-th largest eigenvalues of the covariance matrix. Consequently, Principal Component (PC) analysis is the most popular estimation method for factor models, and its consistency when $r$ is correctly estimated is well-established in the literature. However, popular factor number estimators often suffer from the lack of an obvious eigengap in the empirical eigenvalues and tend to over-estimate $r$ due, for example, to the existence of non-pervasive factors affecting only a subset of the series. We show that the errors in the PC estimators resulting from the over-estimation of $r$ are non-negligible, which in turn leads to the violation of the conditions required for factor-based large covariance estimation. To remedy this, we propose new estimators of the factor model based on scaling the entries of the sample eigenvectors. We show both theoretically and numerically that the proposed estimators successfully control for the over-estimation error, and we investigate their performance when applied to risk minimisation of a portfolio of financial time series.
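The eigengap and the effect of over-estimating $r$ on the PC estimator can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian data-generating process, the values of `n`, `T` and `r`, the function names, and the eigenvalue-ratio rule used to read off the factor number. The proposed scaled-eigenvector estimators are not reproduced here.

```python
import numpy as np

def simulate_factor_model(n=100, T=500, r=3, seed=0):
    """Simulate x_t = Lambda f_t + e_t with r pervasive factors (illustrative design)."""
    rng = np.random.default_rng(seed)
    loadings = rng.standard_normal((n, r))   # dense loadings: every series loads on every factor
    factors = rng.standard_normal((T, r))
    noise = rng.standard_normal((T, n))      # idiosyncratic component with unit variance
    chi = factors @ loadings.T               # true common component, T x n
    return chi + noise, chi

def pc_common_component(X, k):
    """PC estimate of the common component from the k leading sample eigenvectors."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    W = eigvecs[:, :k]
    return Xc @ W @ W.T, eigvals

X, chi = simulate_factor_model()
chi_c = chi - chi.mean(axis=0)

# Correctly specified factor number (r = 3) versus an over-estimate (k = 5).
chi_hat_true, eigvals = pc_common_component(X, k=3)
chi_hat_over, _ = pc_common_component(X, k=5)

# The eigengap shows up as a large ratio between the r-th and (r+1)-th eigenvalues.
ratios = eigvals[:10] / eigvals[1:11]
r_hat = int(np.argmax(ratios)) + 1

# Mean squared error of each estimate against the true common component.
err_true = np.mean((chi_hat_true - chi_c) ** 2)
err_over = np.mean((chi_hat_over - chi_c) ** 2)
print(r_hat, err_true, err_over)
```

In this idealised design all factors are pervasive, so the eigengap is clear and the ratio rule has an easy job; the abstract's point is that real data often lack such a gap. The comparison of `err_true` and `err_over` only shows that the extra eigenvectors add estimation error in this toy setting and does not reproduce the paper's theoretical results.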
"Consistent estimation of high-dimensional factor models when the factor number is over-estimated." Electron. J. Statist. 14 (2) 2892 - 2921, 2020. https://doi.org/10.1214/20-EJS1741