Open Access
Minimax estimation in sparse canonical correlation analysis
October 2015
Chao Gao, Zongming Ma, Zhao Ren, Harrison H. Zhou
Ann. Statist. 43(5): 2168-2197 (October 2015). DOI: 10.1214/15-AOS1332

Abstract

Canonical correlation analysis is a widely used multivariate statistical technique for exploring the relation between two sets of variables. This paper considers the problem of estimating the leading canonical correlation directions in high-dimensional settings. Recently, under the assumption that the leading canonical correlation directions are sparse, various procedures have been proposed for many high-dimensional applications involving massive data sets. However, little theoretical justification has been available in the literature. In this paper, we establish rate-optimal nonasymptotic minimax estimation with respect to an appropriate loss function for a wide range of model spaces. Two interesting phenomena are observed. First, the minimax rates are not affected by the presence of nuisance parameters, namely the covariance matrices of the two sets of random variables, though they need to be estimated in the canonical correlation analysis problem. Second, we allow the presence of residual canonical correlation directions. However, they do not influence the minimax rates under a mild condition on the eigengap. A generalized sin-theta theorem and an empirical process bound for Gaussian quadratic forms under a rank constraint are used to establish the minimax upper bounds, and these tools may be of independent interest.
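
For readers unfamiliar with the object being estimated, the following is a minimal sketch of classical (non-sparse, low-dimensional) sample CCA. It is not the estimator studied in the paper. It only illustrates what "leading canonical correlation directions" means and why the two covariance matrices enter as nuisance parameters. The function name `leading_canonical_directions` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def leading_canonical_directions(X, Y):
    """Classical sample CCA (illustrative only): leading pair (u, v) and correlation rho."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # center each set of variables
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / n              # sample covariance of X (nuisance parameter)
    Syy = Yc.T @ Yc / n              # sample covariance of Y (nuisance parameter)
    Sxy = Xc.T @ Yc / n              # sample cross-covariance

    # Whiten with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)     # Lx^{-1} Sxy
    M = np.linalg.solve(Ly, M.T).T   # Lx^{-1} Sxy Ly^{-T}

    # The top singular triple of M gives the leading canonical pair.
    A, s, Bt = np.linalg.svd(M)
    u = np.linalg.solve(Lx.T, A[:, 0])   # satisfies u^T Sxx u = 1
    v = np.linalg.solve(Ly.T, Bt[0, :])  # satisfies v^T Syy v = 1
    return u, v, s[0]

# Simulated example: one shared latent factor loads on the first coordinate of each set,
# so the population leading directions are sparse.
rng = np.random.default_rng(0)
n, p, q = 2000, 5, 4
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, 0] += z
Y[:, 0] += z

u, v, rho = leading_canonical_directions(X, Y)
```

This direct approach requires inverting both sample covariance matrices, so it breaks down when the dimensions exceed the sample size, which is the high-dimensional regime in which sparsity assumptions on the directions are introduced.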

Citation

Chao Gao, Zongming Ma, Zhao Ren, Harrison H. Zhou. "Minimax estimation in sparse canonical correlation analysis." Ann. Statist. 43 (5) 2168-2197, October 2015. https://doi.org/10.1214/15-AOS1332

Information

Received: 1 May 2014; Revised: 1 February 2015; Published: October 2015
First available in Project Euclid: 16 September 2015

zbMATH: 1327.62340
MathSciNet: MR3396982
Digital Object Identifier: 10.1214/15-AOS1332

Subjects:
Primary: 62H12
Secondary: 62C20

Keywords: covariance matrix, minimax rates, model selection, nuisance parameter, sin-theta theorem, sparse CCA (SCCA)

Rights: Copyright © 2015 Institute of Mathematical Statistics

JOURNAL ARTICLE
30 PAGES
