The Annals of Statistics

Geometric inference for general high-dimensional linear inverse problems

Abstract

This paper presents a unified geometric framework for the statistical analysis of a general ill-posed linear inverse model which includes as special cases noisy compressed sensing, sign vector recovery, trace regression, orthogonal matrix estimation and noisy matrix completion. We propose computationally feasible convex programs for statistical inference including estimation, confidence intervals and hypothesis testing. A theoretical framework is developed to characterize the local estimation rate of convergence and to provide statistical inference guarantees. Our results are built on local conic geometry and duality. The difficulty of statistical inference is captured by the geometric characterization of the local tangent cone through the Gaussian width and Sudakov estimate.

Article information

Source
Ann. Statist., Volume 44, Number 4 (2016), 1536–1563.

Dates
Revised: December 2015
First available in Project Euclid: 7 July 2016

https://projecteuclid.org/euclid.aos/1467894707

Digital Object Identifier
doi:10.1214/15-AOS1426

Mathematical Reviews number (MathSciNet)
MR3519932

Zentralblatt MATH identifier
1357.62235

Citation

Cai, T. Tony; Liang, Tengyuan; Rakhlin, Alexander. Geometric inference for general high-dimensional linear inverse problems. Ann. Statist. 44 (2016), no. 4, 1536–1563. doi:10.1214/15-AOS1426. https://projecteuclid.org/euclid.aos/1467894707

References

• [1] Amelunxen, D., Lotz, M., McCoy, M. B. and Tropp, J. A. (2013). Living on the edge: A geometric theory of phase transitions in convex optimization. Preprint. Available at arXiv:1303.6672.
• [2] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
• [3] Bühlmann, P. (2013). Statistical significance in high-dimensional linear models. Bernoulli 19 1212–1242.
• [4] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, Heidelberg.
• [5] Cai, T., Liang, T. and Rakhlin, A. (2016). Supplement to “Geometric inference for general high-dimensional linear inverse problems.” DOI:10.1214/15-AOS1426SUPP.
• [6] Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_{1}$ minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106 594–607.
• [7] Cai, T. T. and Low, M. G. (2004). An adaptation theory for nonparametric confidence intervals. Ann. Statist. 32 1805–1840.
• [8] Cai, T. T. and Low, M. G. (2004). Minimax estimation of linear functionals over nonconvex parameter spaces. Ann. Statist. 32 552–576.
• [9] Cai, T. T. and Zhou, W. (2013). Matrix completion via max-norm constrained optimization. Preprint. Available at arXiv:1303.0341.
• [10] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann. Statist. 35 2313–2351.
• [11] Candès, E. J. and Davenport, M. A. (2013). How well can we estimate a sparse vector? Appl. Comput. Harmon. Anal. 34 317–323.
• [12] Candès, E. J., Li, X., Ma, Y. and Wright, J. (2011). Robust principal component analysis? J. ACM 58 Art. 11, 37.
• [13] Candès, E. J. and Plan, Y. (2010). Matrix completion with noise. Proceedings of the IEEE 98 925–936.
• [14] Candès, E. J. and Plan, Y. (2011). Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inform. Theory 57 2342–2359.
• [15] Candès, E. J. and Recht, B. (2009). Exact matrix completion via convex optimization. Found. Comput. Math. 9 717–772.
• [16] Chandrasekaran, V., Recht, B., Parrilo, P. A. and Willsky, A. S. (2012). The convex geometry of linear inverse problems. Found. Comput. Math. 12 805–849.
• [17] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177–214.
• [18] Donoho, D. L. (1994). Statistical estimation and optimal recovery. Ann. Statist. 22 238–270.
• [19] Dudley, R. M. (1967). The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. J. Funct. Anal. 1 290–330.
• [20] Gordon, Y. (1988). On Milman’s inequality and random subspaces which escape through a mesh in ${\mathbb{R}}^{n}$. In Geometric Aspects of Functional Analysis (1986/87). Lecture Notes in Math. 1317 84–106. Springer, Berlin.
• [21] Gower, J. C. and Dijksterhuis, G. B. (2004). Procrustes Problems. Oxford Statistical Science Series 30. Oxford Univ. Press, Oxford.
• [22] Jagabathula, S. and Shah, D. (2011). Inferring rankings using constrained sensing. IEEE Trans. Inform. Theory 57 7288–7306.
• [23] Javanmard, A. and Montanari, A. (2014). Confidence intervals and hypothesis testing for high-dimensional regression. J. Mach. Learn. Res. 15 2869–2909.
• [24] Johnstone, I. M. and Silverman, B. W. (1990). Speed of estimation in positron emission tomography and related inverse problems. Ann. Statist. 18 251–280.
• [25] Khuri, S., Bäck, T. and Heitkötter, J. (1994). The zero/one multiple knapsack problem and genetic algorithms. In Proceedings of the 1994 ACM Symposium on Applied Computing 188–193. ACM, New York.
• [26] Koltchinskii, V. (2011). Von Neumann entropy penalization and low-rank matrix estimation. Ann. Statist. 39 2936–2973.
• [27] Koltchinskii, V., Lounici, K. and Tsybakov, A. B. (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist. 39 2302–2329.
• [28] Lecué, G. and Mendelson, S. (2013). Learning subgaussian classes: Upper and minimax bounds. Preprint. Available at arXiv:1305.4825.
• [29] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik und Ihrer Grenzgebiete (3) 23. Springer, Berlin.
• [30] Ma, Z. and Wu, Y. (2013). Volume ratio, sparsity, and minimaxity under unitarily invariant norms. Preprint. Available at arXiv:1306.3609.
• [31] Mangasarian, O. L. and Recht, B. (2011). Probability of unique integer solution to a system of linear equations. European J. Oper. Res. 214 27–30.
• [32] Negahban, S. N., Ravikumar, P., Wainwright, M. J. and Yu, B. (2012). A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers. Statist. Sci. 27 538–557.
• [33] O’Sullivan, F. (1986). A statistical perspective on ill-posed inverse problems. Statist. Sci. 1 502–527.
• [34] Oymak, S., Thrampoulidis, C. and Hassibi, B. (2013). Simple bounds for noisy linear inverse problems with exact side information. Preprint. Available at arXiv:1312.0641.
• [35] Pisier, G. (1989). The Volume of Convex Bodies and Banach Space Geometry. Cambridge Tracts in Mathematics 94. Cambridge Univ. Press, Cambridge.
• [36] Prokopyev, O. A., Huang, H. and Pardalos, P. M. (2005). On complexity of unconstrained hyperbolic 0–1 programming problems. Oper. Res. Lett. 33 312–318.
• [37] Recht, B., Fazel, M. and Parrilo, P. A. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52 471–501.
• [38] Rohde, A. and Tsybakov, A. B. (2011). Estimation of high-dimensional low-rank matrices. Ann. Statist. 39 887–930.
• [39] Talagrand, M. (1996). Majorizing measures: The generic chaining. Ann. Probab. 24 1049–1103.
• [40] ten Berge, J. M. F. (1977). Orthogonal Procrustes rotation for two or more matrices. Psychometrika 42 267–276.
• [41] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B. Stat. Methodol. 58 267–288.
• [42] Tikhonov, A. and Arsenin, V. Y. (1977). Methods for Solving Ill-Posed Problems. Wiley, New York.
• [43] Valiant, L. G. and Vazirani, V. V. (1986). NP is as easy as detecting unique solutions. Theoret. Comput. Sci. 47 85–93.
• [44] van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166–1202.
• [45] Vershynin, R. (2011). Lectures in geometric functional analysis. Available at http://www-personal.umich.edu/~romanv/papers/GFA-book/GFA-book.pdf.
• [46] Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.
• [47] Ye, F. and Zhang, C. (2010). Rate minimaxity of the Lasso and Dantzig selector for the $\ell_{q}$ loss in $\ell_{r}$ balls. J. Mach. Learn. Res. 11 3519–3540.
• [48] Zhang, C. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 76 217–242.

Supplemental materials

• “Geometric inference for general high-dimensional linear inverse problems”. Due to space constraints, we have relegated the remaining proofs to the Supplement [5], which includes detailed proofs of Lemmas 2–4, Theorem 6 and Corollaries 1–5.