The Annals of Statistics

Robust recovery of multiple subspaces by geometric lp minimization

Gilad Lerman and Teng Zhang

Full-text: Open access


We assume i.i.d. data sampled from a mixture distribution with K components supported on fixed d-dimensional linear subspaces, together with an additional outlier component. For p > 0, we study the simultaneous recovery of the K fixed subspaces by minimizing the lp-averaged distances of the sampled data points to the nearest of K candidate subspaces. Under some conditions, we show that if 0 < p ≤ 1, then all underlying subspaces are precisely recovered by lp minimization with overwhelming probability. On the other hand, if K > 1 and p > 1, then the underlying subspaces cannot be recovered, or even nearly recovered, by lp minimization. These results partially explain the successes and failures of the basic approach of lp energy minimization for modeling data by multiple subspaces.
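The lp energy in question can be written down directly: for data points x_1, …, x_n and candidate d-dimensional subspaces L_1, …, L_K, it averages min_k dist(x_i, L_k)^p over the points. The sketch below (a reader's illustration, not the authors' algorithm or analysis; function names `lp_energy` and `k_flats` are made up) evaluates this energy and pairs it with the simple alternating assign-and-refit heuristic in the spirit of the k-plane clustering and K-flats methods cited in the references; note that the l2 (PCA) refit step is only a surrogate for the lp objective when p ≠ 2.

```python
import numpy as np

def lp_energy(X, bases, p):
    """l_p-averaged energy: mean over points of (distance to nearest subspace)^p.

    X: (n, D) data; bases: list of K orthonormal (D, d) bases; the distance
    of x to span(B) is ||x - B B^T x||.
    """
    dists = np.stack([np.linalg.norm(X - X @ B @ B.T, axis=1) for B in bases])
    return np.mean(np.min(dists, axis=0) ** p)  # min over the K subspaces

def k_flats(X, K, d, p=1.0, iters=50, seed=0):
    """Alternating heuristic: assign each point to its nearest subspace, then
    refit each subspace by PCA (l2) on its cluster; for p != 2 this refit is
    only a surrogate for the true l_p minimization."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    # random orthonormal initial bases
    bases = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    for _ in range(iters):
        dists = np.stack([np.linalg.norm(X - X @ B @ B.T, axis=1) for B in bases])
        labels = np.argmin(dists, axis=0)
        for k in range(K):
            pts = X[labels == k]
            if len(pts) >= d:
                # top-d right singular vectors span the best-fit l2 subspace
                _, _, Vt = np.linalg.svd(pts, full_matrices=False)
                bases[k] = Vt[:d].T
    return bases
```

On clean data drawn exactly from the union of subspaces, `lp_energy` evaluated at the true bases is zero, which is the sense in which exact recovery corresponds to the global minimizer of the energy.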

Article information

Ann. Statist., Volume 39, Number 5 (2011), 2686-2715.

First available in Project Euclid: 22 December 2011

Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
62G35: Robustness
68Q32: Computational learning theory [See also 68T05]

Keywords: Detection, clustering, multiple subspaces, hybrid linear modeling, optimization on the Grassmannian, robustness, geometric probability, high-dimensional data


Lerman, Gilad; Zhang, Teng. Robust recovery of multiple subspaces by geometric lp minimization. Ann. Statist. 39 (2011), no. 5, 2686–2715. doi:10.1214/11-AOS914.


  • [1] Aldroubi, A., Cabrelli, C. and Molter, U. (2008). Optimal non-linear models for sparsity and sampling. J. Fourier Anal. Appl. 14 793–812.
  • [2] Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis, 2nd ed. Wiley, New York.
  • [3] Arias-Castro, E., Chen, G. and Lerman, G. (2011). Spectral clustering based on local linear approximations. Electron. J. Statist. 5 1537–1587.
  • [4] Bendich, P., Wang, B. and Mukherjee, S. (2010). Towards stratification learning through homology inference. Available at
  • [5] Bradley, P. S. and Mangasarian, O. L. (2000). k-plane clustering. J. Global Optim. 16 23–32.
  • [6] Candès, E. J., Li, X., Ma, Y. and Wright, J. (2009). Robust principal component analysis? Unpublished manuscript. Available at arXiv:0912.3599.
  • [7] Chen, G. and Lerman, G. (2009). Foundations of a multi-way spectral clustering framework for hybrid linear modeling. Found. Comput. Math. 9 517–558.
  • [8] Chen, G. and Lerman, G. (2009). Spectral curvature clustering (SCC). Int. J. Comput. Vis. 81 317–330.
  • [9] Costeira, J. and Kanade, T. (1998). A multibody factorization method for independently moving objects. Int. J. Comput. Vis. 29 159–179.
  • [10] Ho, J., Yang, M., Lim, J., Lee, K. and Kriegman, D. (2003). Clustering appearances of objects under varying illumination conditions. In Proceedings of International Conference on Computer Vision and Pattern Recognition 1 11–18. IEEE Computer Society, Madison, WI.
  • [11] Kanatani, K. (2001). Motion segmentation by subspace separation and model selection. In Proc. of 8th ICCV 3 586–591. IEEE, Vancouver, Canada.
  • [12] Lerman, G. and Zhang, T. (2010). lp-Recovery of the most significant subspace among multiple subspaces with outliers. Unpublished manuscript. Available at
  • [13] Ma, Y., Derksen, H., Hong, W. and Wright, J. (2007). Segmentation of multivariate mixed data via lossy coding and compression. IEEE Trans. Pattern Anal. Mach. Intell. 29 1546–1562.
  • [14] Ma, Y., Yang, A. Y., Derksen, H. and Fossum, R. (2008). Estimation of subspace arrangements with applications in modeling and segmenting mixed data. SIAM Rev. 50 413–458.
  • [15] Mattila, P. (1995). Geometry of Sets and Measures in Euclidean Spaces: Fractals and Rectifiability. Cambridge Studies in Advanced Mathematics 44. Cambridge Univ. Press, Cambridge.
  • [16] Pollard, D. (1981). Strong consistency of k-means clustering. Ann. Statist. 9 135–140.
  • [17] Pollard, D. (1982). A central limit theorem for k-means clustering. Ann. Probab. 10 919–926.
  • [18] Shawe-Taylor, J., Williams, C. K. I., Cristianini, N. and Kandola, J. (2005). On the eigenspectrum of the Gram matrix and the generalization error of kernel-PCA. IEEE Trans. Inform. Theory 51 2510–2522.
  • [19] Tao, T. (2011). Topics in random matrix theory. Available at
  • [20] Tipping, M. and Bishop, C. (1999). Mixtures of probabilistic principal component analysers. Neural Comput. 11 443–482.
  • [21] Torr, P. H. S. (1998). Geometric motion segmentation and model selection. R. Soc. Lond. Philos. Trans. Ser. A Math. Phys. Eng. Sci. 356 1321–1340.
  • [22] Tseng, P. (2000). Nearest q-flat to m points. J. Optim. Theory Appl. 105 249–252.
  • [23] Vidal, R., Ma, Y. and Sastry, S. (2005). Generalized principal component analysis (GPCA). IEEE Trans. Pattern Anal. Mach. Intell. 27 1945–1959.
  • [24] Yan, J. and Pollefeys, M. (2006). A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and nondegenerate. In ECCV 4 94–106.
  • [25] Zhang, T., Szlam, A. and Lerman, G. (2009). Median K-flats for hybrid linear modeling with many outliers. In Computer Vision Workshops (ICCV Workshops), IEEE 12th International Conference on Computer Vision 234–241. IEEE, Tokyo, Japan.
  • [26] Zhang, T., Szlam, A., Wang, Y. and Lerman, G. (2010). Hybrid linear modeling via local best-fit flats. Available at
  • [27] Zhang, T., Szlam, A., Wang, Y. and Lerman, G. (2010). Randomized hybrid linear modeling by local best-fit flats. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 1927–1934. IEEE, San Francisco, CA.