Electronic Journal of Statistics

Reconstruction of a high-dimensional low-rank matrix

Kazuyoshi Yata and Makoto Aoshima

Abstract

We consider the problem of recovering a low-rank signal matrix in high-dimensional settings. The main issue is how to estimate the signal matrix in the presence of heavy noise. We introduce the power spiked model to describe the structure of the singular values of a large data matrix. We first consider conventional PCA for recovering the signal matrix and show that the resulting estimator has consistency properties under severe conditions. Conventional PCA is heavily affected by the noise. To reduce the noise, we apply the noise-reduction (NR) methodology and propose a new estimator of the signal matrix. We show that the proposed NR estimator has the consistency properties under mild conditions and effectively improves on the error rate of conventional PCA. Finally, we demonstrate the reconstruction procedures on a microarray data set.
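
As a rough illustration of the conventional PCA baseline mentioned in the abstract, the following sketch simulates a low-rank signal matrix plus Gaussian noise and reconstructs the signal by a truncated SVD. It is not the authors' code and does not implement their noise-reduction (NR) estimator, whose details are not given here. The function names, the dimensions, the singular values, and the i.i.d. Gaussian noise model are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_low_rank_plus_noise(d=2000, n=20, singular_values=(400.0, 200.0), seed=0):
    """Return (A, X) where A is a low-rank d x n signal and X = A + noise.

    Hypothetical setup: orthonormal singular vectors drawn at random and
    i.i.d. standard Gaussian noise. These are illustrative assumptions only.
    """
    rng = np.random.default_rng(seed)
    rank = len(singular_values)
    # Orthonormal left/right singular vectors via QR of Gaussian matrices.
    U, _ = np.linalg.qr(rng.standard_normal((d, rank)))
    V, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    A = (U * np.asarray(singular_values)) @ V.T
    X = A + rng.standard_normal((d, n))
    return A, X


def truncated_svd_reconstruct(X, rank):
    """Rank-`rank` approximation of X built from its leading singular triplets."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def relative_error(A_hat, A):
    """Frobenius-norm error of A_hat relative to the size of A."""
    return np.linalg.norm(A_hat - A) / np.linalg.norm(A)


if __name__ == "__main__":
    A, X = simulate_low_rank_plus_noise()
    A_hat = truncated_svd_reconstruct(X, rank=2)
    print("relative error of raw data matrix:", relative_error(X, A))
    print("relative error of truncated SVD:  ", relative_error(A_hat, A))
```

In this toy setting the signal singular values are large relative to the noise, so the truncated SVD removes most of the noise. When the signal is weaker relative to the dimension, the leading singular values and vectors of the data matrix are themselves distorted by noise, which is the situation the abstract's NR methodology is meant to address.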

Article information

Source
Electron. J. Statist. Volume 10, Number 1 (2016), 895-917.

Dates
Received: October 2015
First available in Project Euclid: 8 April 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1460141647

Digital Object Identifier
doi:10.1214/16-EJS1128

Mathematical Reviews number (MathSciNet)
MR3486420

Zentralblatt MATH identifier
1341.62170

Subjects
Primary: 62H25: Factor analysis and principal components; correspondence analysis
Secondary: 62F12: Asymptotic properties of estimators

Keywords
Eigenstructure; HDLSS; noise-reduction methodology; PCA; singular value decomposition

Citation

Yata, Kazuyoshi; Aoshima, Makoto. Reconstruction of a high-dimensional low-rank matrix. Electron. J. Statist. 10 (2016), no. 1, 895–917. doi:10.1214/16-EJS1128. https://projecteuclid.org/euclid.ejs/1460141647


References

  • [1] Bhattacharjee, A., Richards, W. G., Staunton, J., Li, C., Monti, S., Vasa, P., Ladd, C., Beheshti, J., Bueno, R., Gillette, M., Loda, M., Weber, G., Mark, E. J., Lander, E. S., Wong, W., Johnson, B. E., Golub, T. R., Sugarbaker, D. J. and Meyerson, M. (2001). Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA 98 13790–13795.
  • [2] Ishii, A., Yata, K. and Aoshima, M. (2016). Asymptotic properties of the first principal component and equality tests of covariance matrices in high-dimension, low-sample-size context. J. Statist. Plann. Inference 170 186–199.
  • [3] Jung, S. and Marron, J. S. (2009). PCA consistency in high dimension, low sample size context. Ann. Statist. 37 4104–4130.
  • [4] Murayama, W., Yata, K. and Aoshima, M. (2015). Reconstruction of a signal matrix for high-dimension, low-sample-size data. RIMS Koukyuroku 1954 23–31.
  • [5] Negahban, S. and Wainwright, M. J. (2011). Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Statist. 39 1069–1097.
  • [6] Rohde, A. and Tsybakov, A. (2011). Estimation of high-dimensional low-rank matrices. Ann. Statist. 39 887–930.
  • [7] Shabalin, A. and Nobel, A. (2013). Reconstruction of a low-rank matrix in the presence of Gaussian noise. J. Multivariate Anal. 118 67–76.
  • [8] Shen, D., Shen, H. and Marron, J. S. (2013). Consistency of sparse PCA in high dimension, low sample size contexts. J. Multivariate Anal. 115 317–333.
  • [9] Yang, K., Cai, Z., Li, J. and Lin, G. (2006). A stable gene selection in microarray data analysis. BMC Bioinformatics 7 228.
  • [10] Yata, K. and Aoshima, M. (2009). PCA consistency for non-Gaussian data in high dimension, low sample size context. Comm. Statist. Theory Methods, Special Issue Honoring Zacks, S. (ed. Mukhopadhyay, N.) 38 2634–2652.
  • [11] Yata, K. and Aoshima, M. (2012). Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations. J. Multivariate Anal. 105 193–215.
  • [12] Yata, K. and Aoshima, M. (2013). PCA consistency for the power spiked model in high-dimensional settings. J. Multivariate Anal. 122 334–354.
  • [13] Zhou, Y.-H. and Marron, J. S. (2015). High dimension low sample size asymptotics of robust PCA. Electron. J. Stat. 9 204–218.