The Annals of Statistics

Minimax estimation in sparse canonical correlation analysis

Chao Gao, Zongming Ma, Zhao Ren, and Harrison H. Zhou

Abstract

Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. This paper considers the problem of estimating the leading canonical correlation directions in high-dimensional settings. Recently, under the assumption that the leading canonical correlation directions are sparse, various procedures have been proposed for many high-dimensional applications involving massive data sets. However, little theoretical justification has been available in the literature. In this paper, we establish rate-optimal nonasymptotic minimax estimation with respect to an appropriate loss function for a wide range of model spaces. Two interesting phenomena are observed. First, the minimax rates are not affected by the presence of nuisance parameters, namely the covariance matrices of the two sets of random variables, though they need to be estimated in the canonical correlation analysis problem. Second, we allow the presence of the residual canonical correlation directions. However, they do not influence the minimax rates under a mild condition on the eigengap. A generalized sin-theta theorem and an empirical process bound for Gaussian quadratic forms under rank constraint are used to establish the minimax upper bounds, which may be of independent interest.
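
For readers unfamiliar with the object being estimated, the following is a minimal sketch of the textbook population definition of the leading canonical correlation directions, computed by whitening the cross-covariance and taking its top singular pair. It is not the estimator analyzed in the paper. The function names (`inv_sqrt`, `leading_cca`), the dimensions, the sparse directions `a0` and `b0`, and the correlation value 0.8 are all assumptions chosen for illustration.

```python
import numpy as np


def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def leading_cca(Sx, Sy, Sxy):
    """Leading canonical pair (a, b) and correlation rho.

    Maximizes a' Sxy b subject to a' Sx a = b' Sy b = 1 by taking the
    top singular pair of Sx^{-1/2} Sxy Sy^{-1/2}.
    """
    Wx, Wy = inv_sqrt(Sx), inv_sqrt(Sy)
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0], s[0]


# Illustrative (hypothetical) population model with sparse leading directions.
p, q, lam = 6, 5, 0.8
Sx = np.diag([2.0, 1.0, 1.0, 1.0, 1.0, 1.0])   # nuisance covariance of X
Sy = np.eye(q)                                  # nuisance covariance of Y
a0 = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
a0 /= np.sqrt(a0 @ Sx @ a0)                     # normalize so a0' Sx a0 = 1
b0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])        # already satisfies b0' Sy b0 = 1
Sxy = lam * Sx @ np.outer(a0, b0) @ Sy          # rank-one cross-covariance

a, b, rho = leading_cca(Sx, Sy, Sxy)
# The directions are identified only up to a joint sign flip of (a, b).
```

In this toy model the recovered pair matches `(a0, b0)` up to sign and `rho` equals `lam`. In practice the three covariance blocks are unknown and must be estimated from data, which is the high-dimensional difficulty the abstract refers to.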

Article information

Source
Ann. Statist., Volume 43, Number 5 (2015), 2168-2197.

Dates
Received: May 2014
Revised: February 2015
First available in Project Euclid: 16 September 2015

Permanent link to this document
https://projecteuclid.org/euclid.aos/1442364149

Digital Object Identifier
doi:10.1214/15-AOS1332

Mathematical Reviews number (MathSciNet)
MR3396982

Zentralblatt MATH identifier
1327.62340

Subjects
Primary: 62H12: Estimation
Secondary: 62C20: Minimax procedures

Keywords
Covariance matrix; minimax rates; model selection; nuisance parameter; sin-theta theorem; sparse CCA (SCCA)

Citation

Gao, Chao; Ma, Zongming; Ren, Zhao; Zhou, Harrison H. Minimax estimation in sparse canonical correlation analysis. Ann. Statist. 43 (2015), no. 5, 2168–2197. doi:10.1214/15-AOS1332. https://projecteuclid.org/euclid.aos/1442364149


References

  • [1] Amini, A. A. and Wainwright, M. J. (2009). High-dimensional analysis of semidefinite relaxations for sparse principal components. Ann. Statist. 37 2877–2921.
  • [2] Anderson, T. W. (1999). Asymptotic theory for canonical correlation analysis. J. Multivariate Anal. 70 1–29.
  • [3] Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley, Hoboken, NJ.
  • [4] Avants, B. B., Cook, P. A., Ungar, L., Gee, J. C. and Grossman, M. (2010). Dementia induces correlated reductions in white matter integrity and cortical thickness: A multivariate neuroimaging study with sparse canonical correlation analysis. NeuroImage 50 1004–1016.
  • [5] Bao, Z., Hu, G., Pan, G. and Zhou, W. (2014). Canonical correlation coefficients of high-dimensional normal vectors: Finite rank case. Preprint. Available at arXiv:1407.7194.
  • [6] Berthet, Q. and Rigollet, P. (2013). Complexity theoretic lower bounds for sparse principal component detection. J. Mach. Learn. Res. 30 1–21.
  • [7] Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics 169. Springer, New York.
  • [8] Birgé, L. (1983). Approximation dans les espaces métriques et théorie de l’estimation. Z. Wahrsch. Verw. Gebiete 65 181–237.
  • [9] Birnbaum, A., Johnstone, I. M., Nadler, B. and Paul, D. (2013). Minimax bounds for sparse PCA with noisy high-dimensional data. Ann. Statist. 41 1055–1084.
  • [10] Cai, T., Ma, Z. and Wu, Y. (2015). Optimal estimation and rank detection for sparse spiked covariance matrices. Probab. Theory Related Fields 161 781–815.
  • [11] Cai, T. T., Ma, Z. and Wu, Y. (2013). Sparse PCA: Optimal rates and adaptive estimation. Ann. Statist. 41 3074–3110.
  • [12] Cancer Genome Atlas Network (2012). Comprehensive molecular portraits of human breast tumours. Nature 490 61–70.
  • [13] Chen, M., Gao, C., Ren, Z. and Zhou, H. H. (2013). Sparse CCA via precision adjusted iterative thresholding. Preprint. Available at arXiv:1311.6186.
  • [14] Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7 1–46.
  • [15] Gao, C., Ma, Z., Ren, Z. and Zhou, H. H. (2015). Supplement to “Minimax estimation in sparse canonical correlation analysis.” DOI:10.1214/15-AOS1332SUPP.
  • [16] Gao, C., Ma, Z. and Zhou, H. H. (2014). Sparse CCA: Adaptive estimation and computational barriers. Preprint. Available at arXiv:1409.8565.
  • [17] Gao, C. and Zhou, H. H. (2015). Rate-optimal posterior contraction for sparse PCA. Ann. Statist. 43 785–818.
  • [18] Golub, G. H. and Van Loan, C. F. (1996). Matrix Computations, 3rd ed. Johns Hopkins Univ. Press, Baltimore, MD.
  • [19] Hardoon, D. R. and Shawe-Taylor, J. (2011). Sparse canonical correlation analysis. Mach. Learn. 83 331–353.
  • [20] Hotelling, H. (1936). Relations between two sets of variates. Biometrika 28 321–377.
  • [21] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
  • [22] Johnstone, I. M. (2008). Multivariate analysis and Jacobi ensembles: Largest eigenvalue, Tracy–Widom limits and rates of convergence. Ann. Statist. 36 2638–2716.
  • [23] Johnstone, I. M. and Lu, A. Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc. 104 682–693.
  • [24] LeCam, L. (1973). Convergence of estimates under dimensionality restrictions. Ann. Statist. 1 38–53.
  • [25] Lê Cao, K.-A., Martin, P. G. P., Robert-Granié, C. and Besse, P. (2009). Sparse canonical methods for biological data integration: Application to a cross-platform study. BMC Bioinformatics 10 1–34.
  • [26] Ma, Z. (2013). Sparse principal component analysis and iterative thresholding. Ann. Statist. 41 772–801.
  • [27] Ma, Z. and Wu, Y. (2013). Volume ratio, sparsity, and minimaxity under unitarily invariant norms. Preprint. Available at arXiv:1306.3609.
  • [28] Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis. Academic Press, London.
  • [29] Parkhomenko, E., Tritchler, D. and Beyene, J. (2009). Sparse canonical correlation analysis with application to genomic data integration. Stat. Appl. Genet. Mol. Biol. 8 Art. 1, 36.
  • [30] Stewart, G. W. and Sun, J. G. (1990). Matrix Perturbation Theory. Academic Press, Boston, MA.
  • [31] Tao, T. (2012). Topics in Random Matrix Theory. Graduate Studies in Mathematics 132. Amer. Math. Soc., Providence, RI.
  • [32] Tipping, M. E. and Bishop, C. M. (1999). Probabilistic principal component analysis. J. R. Stat. Soc. Ser. B. Stat. Methodol. 61 611–622.
  • [33] Vu, V. Q. and Lei, J. (2013). Minimax sparse principal subspace estimation in high dimensions. Ann. Statist. 41 2905–2947.
  • [34] Waaijenborg, S. and Zwinderman, A. H. (2009). Sparse canonical correlation analysis for identifying, connecting and completing gene-expression networks. BMC Bioinformatics 10 315.
  • [35] Wang, T., Berthet, Q. and Samworth, R. J. (2014). Statistical and computational trade-offs in estimation of sparse principal components. Preprint. Available at arXiv:1408.5369.
  • [36] Wang, Y. X. R., Jiang, K., Feldman, L. J., Bickel, P. J. and Huang, H. (2014). Inferring gene association networks using sparse canonical correlation analysis. Preprint. Available at arXiv:1401.6504.
  • [37] Wedin, P.-Å. (1972). Perturbation bounds in connection with singular value decomposition. BIT 12 99–111.
  • [38] Wiesel, A., Kliger, M. and Hero, A. O. III (2008). A greedy approach to sparse canonical correlation analysis. Preprint. Available at arXiv:0801.2748.
  • [39] Witten, D. M., Tibshirani, R. and Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10 515–534.
  • [40] Yang, D., Ma, Z. and Buja, A. (2011). A sparse SVD method for high-dimensional data. Preprint. Available at arXiv:1112.2433.
  • [41] Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.

Supplemental materials

  • Supplement to “Minimax estimation in sparse canonical correlation analysis”. The supplement [15] contains an Appendix to the current paper in which we prove Theorems 3–5 and Lemmas 3 and 7–11.