The Annals of Statistics

Minimax estimation in sparse canonical correlation analysis

Chao Gao, Zongming Ma, Zhao Ren, and Harrison H. Zhou

Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. This paper considers the problem of estimating the leading canonical correlation directions in high-dimensional settings. Recently, under the assumption that the leading canonical correlation directions are sparse, various procedures have been proposed for many high-dimensional applications involving massive data sets. However, little theoretical justification has been available in the literature. In this paper, we establish rate-optimal nonasymptotic minimax estimation with respect to an appropriate loss function for a wide range of model spaces. Two interesting phenomena are observed. First, the minimax rates are not affected by the presence of nuisance parameters, namely the covariance matrices of the two sets of random variables, though they need to be estimated in the canonical correlation analysis problem. Second, we allow the presence of the residual canonical correlation directions. However, they do not influence the minimax rates under a mild condition on the eigengap. A generalized sin-theta theorem and an empirical process bound for Gaussian quadratic forms under rank constraint are used to establish the minimax upper bounds, which may be of independent interest.
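
For readers unfamiliar with the terminology, the leading canonical correlation directions mentioned in the abstract can be written out using the standard textbook formulation of canonical correlation analysis. The notation below (X, Y, p, m, and the Sigma matrices) is assumed here for illustration and is not taken from the abstract; it is a sketch of the usual population definition, not a statement of the paper's exact model or loss function.

```latex
% Assumed notation (illustrative): X in R^p and Y in R^m are the two sets of
% variables, with Sigma_x = Cov(X), Sigma_y = Cov(Y), Sigma_xy = Cov(X, Y).
\[
(u_1, v_1) \in \operatorname*{arg\,max}_{u \in \mathbb{R}^p,\; v \in \mathbb{R}^m}
  u^\top \Sigma_{xy} v
\quad \text{subject to} \quad
u^\top \Sigma_x u = v^\top \Sigma_y v = 1 .
\]
% Equivalent characterization: if a_1 and b_1 are the leading left and right
% singular vectors of the whitened cross-covariance matrix, then
\[
u_1 = \Sigma_x^{-1/2} a_1, \qquad
v_1 = \Sigma_y^{-1/2} b_1, \qquad
\Sigma_x^{-1/2} \Sigma_{xy} \Sigma_y^{-1/2} \, b_1 = \lambda_1 a_1 ,
\]
% where lambda_1 is the first canonical correlation.
```

In this formulation, the covariance matrices of X and Y enter only through the normalization constraints, which is why the abstract refers to them as nuisance parameters. The sparsity assumption corresponds to the directions u_1 and v_1 (and further leading pairs) having only a few nonzero coordinates.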

Article information

Ann. Statist., Volume 43, Number 5 (2015), 2168-2197.

Received: May 2014
Revised: February 2015
First available in Project Euclid: 16 September 2015

Primary: 62H12: Estimation
Secondary: 62C20: Minimax procedures

Keywords: covariance matrix; minimax rates; model selection; nuisance parameter; sin-theta theorem; sparse CCA (SCCA)


Gao, Chao; Ma, Zongming; Ren, Zhao; Zhou, Harrison H. Minimax estimation in sparse canonical correlation analysis. Ann. Statist. 43 (2015), no. 5, 2168-2197. doi:10.1214/15-AOS1332.

Supplemental materials

  • Supplement to “Minimax estimation in sparse canonical correlation analysis”. The supplement [15] contains an Appendix to the current paper in which we prove Theorems 3–5 and Lemmas 3 and 7–11.