Electronic Journal of Statistics

On the dimension effect of regularized linear discriminant analysis

Cheng Wang and Binyan Jiang

Full-text: Open access

Abstract

This paper studies the dimension effect of linear discriminant analysis (LDA) and regularized linear discriminant analysis (RLDA) classifiers for large-dimensional data in which the observation dimension $p$ is of the same order as the sample size $n$. More specifically, building on properties of the Wishart distribution and recent results in random matrix theory, we derive explicit expressions for the asymptotic misclassification errors of LDA and RLDA, from which we gain insight into how, and in what sense, the dimension affects classification performance. Motivated by these results, we propose adjusted classifiers that correct the bias induced by unequal sample sizes. The bias-corrected LDA and RLDA classifiers are shown to have smaller misclassification rates than LDA and RLDA, respectively. Several interesting examples are discussed in detail, and the theoretical results on the dimension effect are illustrated via extensive simulation studies.
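As a concrete reference point, the sketch below (ours, not the authors' code) implements one standard form of the two-class RLDA rule in Python/NumPy, with the pooled sample covariance $S$ regularized as $(S+\lambda I_p)^{-1}$. The function names, the choice of $\lambda$, and the zero cut-off are illustrative assumptions only; the paper's bias correction for unequal sample sizes would enter as an adjustment to that cut-off.

    import numpy as np

    def rlda_fit(X1, X2, lam):
        """Fit a regularized LDA rule from two training samples X1 (n1 x p), X2 (n2 x p)."""
        n1, p = X1.shape
        n2 = X2.shape[0]
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        # Pooled sample covariance matrix of the two classes.
        S = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (n1 + n2 - 2)
        # Discriminant direction (S + lam * I_p)^{-1} (m1 - m2); lam = 0 recovers
        # plain LDA whenever S is invertible (which fails once p > n1 + n2 - 2).
        w = np.linalg.solve(S + lam * np.eye(p), m1 - m2)
        return w, (m1 + m2) / 2.0

    def rlda_predict(Z, w, midpoint):
        """Assign each row of Z to class 1 if its discriminant score is positive, else class 2."""
        return np.where((Z - midpoint) @ w > 0, 1, 2)

In the regime studied here, with $p$ of the same order as $n = n_1 + n_2$, the ridge term is what keeps the rule well defined and is precisely where the dimension effect analyzed in the paper enters.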

Article information

Source
Electron. J. Statist., Volume 12, Number 2 (2018), 2709-2742.

Dates
Received: May 2018
First available in Project Euclid: 15 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1536976838

Digital Object Identifier
doi:10.1214/18-EJS1469

Keywords
Dimension effect; linear discriminant analysis; random matrix theory; regularized linear discriminant analysis

Rights
Creative Commons Attribution 4.0 International License.

Citation

Wang, Cheng; Jiang, Binyan. On the dimension effect of regularized linear discriminant analysis. Electron. J. Statist. 12 (2018), no. 2, 2709-2742. doi:10.1214/18-EJS1469. https://projecteuclid.org/euclid.ejs/1536976838

