Open Access
December 2013
Minimax sparse principal subspace estimation in high dimensions
Vincent Q. Vu, Jing Lei
Ann. Statist. 41(6): 2905-2947 (December 2013). DOI: 10.1214/13-AOS1151

Abstract

We study sparse principal components analysis in high dimensions, where $p$ (the number of variables) can be much larger than $n$ (the number of observations), and analyze the problem of estimating the subspace spanned by the principal eigenvectors of the population covariance matrix. We introduce two complementary notions of $\ell_{q}$ subspace sparsity: row sparsity and column sparsity. We prove nonasymptotic lower and upper bounds on the minimax subspace estimation error for $0\leq q\leq1$. The bounds are optimal for row sparse subspaces and nearly optimal for column sparse subspaces. They apply to general classes of covariance matrices, and they show that $\ell_{q}$ constrained estimates can achieve optimal minimax rates without restrictive spiked covariance conditions. Interestingly, the form of the rates matches known results for sparse regression when the effective noise variance is defined appropriately. Our proof employs a novel variational $\sin\Theta$ theorem that may be useful in other regularized spectral estimation problems.
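
As a rough illustration of the quantities the abstract refers to, the sketch below computes a squared Frobenius $\sin\Theta$ distance between two subspaces and a row-wise $\ell_{q}$ sparsity measure of an orthonormal basis. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`sin_theta_sq`, `row_sparsity`, `top_subspace`), the simulated spiked covariance, and the use of ordinary (non-sparse) PCA as the estimator. The paper's exact definitions of the loss and the sparsity classes may differ in detail.

```python
import numpy as np

def sin_theta_sq(U, V):
    """Squared Frobenius sin-Theta distance between the column spans of U and V.

    Assumes U and V both have d orthonormal columns, so that
    ||sin Theta||_F^2 = d - ||U^T V||_F^2.
    """
    d = U.shape[1]
    return d - np.linalg.norm(U.T @ V, "fro") ** 2

def row_sparsity(V, q, tol=1e-12):
    """Row-wise l_q measure of V: sum over rows of (row l2 norm)^q.

    For q = 0 this is taken to be the number of nonzero rows.
    """
    row_norms = np.linalg.norm(V, axis=1)
    if q == 0:
        return int(np.count_nonzero(row_norms > tol))
    return float(np.sum(row_norms ** q))

def top_subspace(S, d):
    """Orthonormal basis of the leading d-dimensional eigenspace of symmetric S."""
    _, vecs = np.linalg.eigh(S)  # eigenvalues are returned in ascending order
    return vecs[:, -d:]

# Hypothetical simulation: a d-dimensional principal subspace supported on s rows.
rng = np.random.default_rng(0)
p, n, d, s = 50, 200, 2, 5
Q, _ = np.linalg.qr(rng.standard_normal((s, d)))
V = np.zeros((p, d))
V[:s] = Q                           # only the first s rows are nonzero
Sigma = np.eye(p) + 4.0 * V @ V.T   # illustrative covariance, not from the paper

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n                     # sample covariance (mean known to be zero)
V_hat = top_subspace(S, d)          # plain PCA, not a sparse estimator

print("nonzero rows of V:", row_sparsity(V, 0))
print("row l_1 measure of V:", row_sparsity(V, 1))
print("squared sin-Theta error of PCA:", sin_theta_sq(V, V_hat))
```

The error printed on the last line is the kind of subspace estimation loss whose minimax behavior the abstract describes; a sparsity-constrained estimator would be expected to do better than plain PCA when $p$ is large relative to $n$, but that comparison is not shown here.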

Citation


Vincent Q. Vu, Jing Lei. "Minimax sparse principal subspace estimation in high dimensions." Ann. Statist. 41(6): 2905-2947, December 2013. https://doi.org/10.1214/13-AOS1151

Information

Published: December 2013
First available in Project Euclid: 1 January 2014

zbMATH: 1288.62103
MathSciNet: MR3161452
Digital Object Identifier: 10.1214/13-AOS1151

Subjects:
Primary: 62H25
Secondary: 62C20, 62H12

Keywords: high-dimensional statistics, minimax bounds, principal components analysis, random matrices, sparsity, subspace estimation

Rights: Copyright © 2013 Institute of Mathematical Statistics

Vol. 41 • No. 6 • December 2013