October 2016
Statistical and computational trade-offs in estimation of sparse principal components
Tengyao Wang, Quentin Berthet, Richard J. Samworth
Ann. Statist. 44(5): 1896-1930 (October 2016). DOI: 10.1214/15-AOS1369


In recent years, sparse principal component analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators has been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or sub-Gaussian classes. In this paper, we show that, under a widely believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem. More precisely, working with new, larger classes satisfying a restricted covariance concentration condition, we show that there is an effective sample size regime in which no randomised polynomial time algorithm can achieve the minimax optimal rate. We also study the theoretical performance of a (polynomial time) variant of the well-known semidefinite relaxation estimator, revealing a subtle interplay between statistical and computational efficiency.
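
To make the estimation problem concrete, the following is a minimal sketch, not the estimator analysed in the paper. It assumes a Gaussian spiked covariance model with covariance I + theta * v v^T, a known sparsity level k, and a simple diagonal-thresholding heuristic chosen only because it is short and runs in polynomial time. The function names, the parameter values and the sine-of-angle loss used here are illustrative assumptions.

```python
import numpy as np

def sample_spiked(n, p, k, theta, rng):
    """Draw n samples from N(0, I_p + theta * v v^T), v a k-sparse unit vector (assumed model)."""
    v = np.zeros(p)
    v[:k] = 1.0 / np.sqrt(k)
    # Each row is z + sqrt(theta) * g * v, which has covariance I + theta * v v^T.
    Z = rng.standard_normal((n, p))
    g = rng.standard_normal((n, 1))
    X = Z + np.sqrt(theta) * g * v
    return X, v

def diagonal_thresholding(X, k):
    """Keep the k coordinates with the largest sample variance, then take the
    leading eigenvector of the sample covariance restricted to them."""
    n, p = X.shape
    S = X.T @ X / n
    support = np.argsort(np.diag(S))[-k:]
    # eigh returns eigenvalues in ascending order, so the last column is the leading eigenvector.
    _, U = np.linalg.eigh(S[np.ix_(support, support)])
    v_hat = np.zeros(p)
    v_hat[support] = U[:, -1]
    return v_hat

def sin_angle_loss(u, v):
    """Sine of the angle between unit vectors u and v (insensitive to sign)."""
    return np.sqrt(max(0.0, 1.0 - float(u @ v) ** 2))

rng = np.random.default_rng(0)
X, v = sample_spiked(n=2000, p=200, k=5, theta=4.0, rng=rng)
v_hat = diagonal_thresholding(X, k=5)
loss = sin_angle_loss(v_hat, v)
print(f"sine-of-angle loss: {loss:.3f}")
```

In this easy regime (large n relative to k and p) the heuristic recovers the sparse direction well. The abstract concerns harder regimes of the effective sample size, in which no randomised polynomial time algorithm can achieve the minimax optimal rate.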



Tengyao Wang, Quentin Berthet, Richard J. Samworth. "Statistical and computational trade-offs in estimation of sparse principal components." Ann. Statist. 44 (5) 1896-1930, October 2016.


Received: 1 May 2015; Revised: 1 July 2015; Published: October 2016
First available in Project Euclid: 12 September 2016

zbMATH: 1349.62254
MathSciNet: MR3546438
Digital Object Identifier: 10.1214/15-AOS1369

Primary: 62H25, 68Q17

Keywords: Computational lower bounds, planted clique problem, polynomial time algorithm, sparse principal component analysis

Rights: Copyright © 2016 Institute of Mathematical Statistics

