Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics

T. Tony Cai; Anru Zhang

doi:10.1214/17-AOS1541

Ann. Statist. 46(1): 60-89 (February 2018). DOI: 10.1214/17-AOS1541

Abstract

Perturbation bounds for singular spaces, in particular Wedin’s $\sin \Theta$ theorem, are a fundamental tool in many fields including high-dimensional statistics, machine learning and applied mathematics. In this paper, we establish separate perturbation bounds, measured in both spectral and Frobenius $\sin \Theta$ distances, for the left and right singular subspaces. Lower bounds, which show that the individual perturbation bounds are rate-optimal, are also given.

The new perturbation bounds are applicable to a wide range of problems. In this paper, we consider in detail applications to low-rank matrix denoising and singular space estimation, high-dimensional clustering and canonical correlation analysis (CCA). In particular, separate matching upper and lower bounds are obtained for estimating the left and right singular spaces. To the best of our knowledge, this is the first result that gives different optimal rates for the left and right singular spaces under the same perturbation.
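As an illustration of the quantities the abstract refers to, the following is a minimal NumPy sketch (not taken from the paper) that computes the spectral and Frobenius $\sin \Theta$ distances between the leading singular subspaces of a low-rank matrix and of a noisy copy, separately for the left and right sides. The function names, the dimensions, the signal singular values and the Gaussian noise model are all assumptions chosen for illustration only.

```python
import numpy as np


def sin_theta_distances(U, U_hat):
    """Return (spectral, Frobenius) sin-Theta distances between the column
    spaces of U and U_hat, both assumed to have orthonormal columns."""
    # The cosines of the principal angles are the singular values of U^T U_hat.
    cosines = np.linalg.svd(U.T @ U_hat, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off above 1
    sines = np.sqrt(1.0 - cosines**2)
    return sines.max(), np.linalg.norm(sines)


def leading_singular_subspaces(M, r):
    """Return orthonormal bases of the leading r left and right singular subspaces."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r], Vt[:r].T


# Hypothetical setup: a tall rank-3 signal matrix plus i.i.d. Gaussian noise.
rng = np.random.default_rng(0)
p1, p2, r = 200, 20, 3
U0, _ = np.linalg.qr(rng.standard_normal((p1, r)))
V0, _ = np.linalg.qr(rng.standard_normal((p2, r)))
X = U0 @ np.diag([60.0, 50.0, 40.0]) @ V0.T   # unperturbed matrix
Z = rng.standard_normal((p1, p2))             # perturbation

U, V = leading_singular_subspaces(X, r)
U_hat, V_hat = leading_singular_subspaces(X + Z, r)

left_spec, left_frob = sin_theta_distances(U, U_hat)
right_spec, right_frob = sin_theta_distances(V, V_hat)

print(f"left  singular subspace: spectral={left_spec:.3f}, Frobenius={left_frob:.3f}")
print(f"right singular subspace: spectral={right_spec:.3f}, Frobenius={right_frob:.3f}")
```

Under the same perturbation the two sides generally yield different distances, which is the kind of asymmetry that separate left and right bounds are meant to capture.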

Citation

T. Tony Cai, Anru Zhang. "Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics." Ann. Statist. 46 (1) 60-89, February 2018. https://doi.org/10.1214/17-AOS1541

Information

Received: 1 May 2016; Revised: 1 November 2016; Published: February 2018

First available in Project Euclid: 22 February 2018

zbMATH: 06865105

MathSciNet: MR3766946

Digital Object Identifier: 10.1214/17-AOS1541

Subjects:

Primary: 62C20, 62H12

Secondary: 62H25

Keywords: $\sin \Theta$ distances, canonical correlation analysis, clustering, high-dimensional statistics, low-rank matrix denoising, perturbation bound, singular value decomposition, spectral method