The Annals of Statistics
- Ann. Statist.
- Volume 45, Number 5 (2017), 2074-2101.
Sparse CCA: Adaptive estimation and computational barriers
Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originating from genomics, imaging and other fields. This paper considers adaptive minimax and computationally tractable estimation of leading sparse canonical coefficient vectors in high dimensions. Under a Gaussian canonical pair model, we first establish separate minimax estimation rates for canonical coefficient vectors of each set of random variables under no structural assumption on marginal covariance matrices. Second, we propose a computationally feasible estimator to attain the optimal rates adaptively under an additional sample size condition. Finally, we show that a sample size condition of this kind is needed for any randomized polynomial-time estimator to be consistent, assuming hardness of certain instances of the planted clique detection problem. As a byproduct, we obtain the first computational lower bounds for sparse PCA under the Gaussian single spiked covariance model.
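For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the classical (non-sparse) leading canonical coefficient vectors computed from sample covariances. It is not the adaptive sparse estimator proposed in the paper. The function name `leading_canonical_pair`, its parameters, and the small `ridge` term added for numerical stability are all assumptions made for illustration.

```python
import numpy as np


def leading_canonical_pair(X, Y, ridge=1e-8):
    """Classical leading canonical pair from samples (illustrative only).

    X has shape (n, p) and Y has shape (n, q). Returns (theta, eta, rho),
    where theta and eta are the leading canonical coefficient vectors and
    rho is the leading sample canonical correlation.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariance blocks; the ridge term is an assumption that keeps
    # the Cholesky factorizations well defined when n is not much larger
    # than p or q.
    Sx = Xc.T @ Xc / n + ridge * np.eye(X.shape[1])
    Sy = Yc.T @ Yc / n + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n

    # Whiten each block: Sx = Lx Lx^T and Sy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sx)
    Ly = np.linalg.cholesky(Sy)

    # M = Lx^{-1} Sxy Ly^{-T}; its top singular value is the leading
    # canonical correlation.
    A = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, A.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the top singular vectors back to the original coordinates, so
    # that theta^T Sx theta = 1 and eta^T Sy eta = 1.
    theta = np.linalg.solve(Lx.T, U[:, 0])
    eta = np.linalg.solve(Ly.T, Vt[0])
    return theta, eta, s[0]
```

This plain approach inverts the full marginal covariance estimates, so it is only sensible when the sample size comfortably exceeds both dimensions; the high dimensional sparse setting studied in the paper is exactly where it breaks down.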
Received: August 2015
Revised: September 2016
First available in Project Euclid: 31 October 2017
Gao, Chao; Ma, Zongming; Zhou, Harrison H. Sparse CCA: Adaptive estimation and computational barriers. Ann. Statist. 45 (2017), no. 5, 2074--2101. doi:10.1214/16-AOS1519. https://projecteuclid.org/euclid.aos/1509436828
- Supplement to “Sparse CCA: Adaptive estimation and computational barriers”. The supplement presents additional proofs and technical details, implementation details of (18), and numerical studies.