Electronic Journal of Statistics

A sharp oracle inequality for Graph-Slope

Pierre C. Bellec, Joseph Salmon, and Samuel Vaiter

Abstract

Following the recent success of the analysis of the Slope estimator, we provide a sharp oracle inequality in terms of prediction error for Graph-Slope, a generalization of Slope to signals observed over a graph. In addition to improving upon the best results obtained so far for the Total Variation denoiser (also referred to as Graph-Lasso or Generalized Lasso), we propose an efficient algorithm to compute Graph-Slope. The proposed algorithm is obtained by applying the forward-backward method to the dual formulation of the Graph-Slope optimization problem. We also provide experiments showing the practical applicability of the method.
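
To make the last algorithmic statement concrete, the sketch below (not the authors' code; the problem formulation is assumed here rather than quoted from the paper) solves a Graph-Slope problem of the form min_beta 0.5*||y - beta||^2 + J_w(D beta), where D is the edge-vertex incidence matrix of the graph and J_w is the sorted-l1 (Slope) norm with nonincreasing nonnegative weights w. The dual problem is then min over u in the dual unit ball B* of J_w of 0.5*||y - D^T u||^2; forward-backward iterations alternate a gradient step on this smooth term with a projection onto B*, obtained from the Moreau decomposition P_{B*}(v) = v - prox_{J_w}(v), and the primal solution is recovered as beta = y - D^T u. The prox of J_w is computed with the standard sort-then-isotonic-regression routine used for Slope.

import numpy as np

def prox_sorted_l1(v, w):
    """Prox of the sorted-l1 norm J_w at v (w nonincreasing, nonnegative)."""
    sign, mag = np.sign(v), np.abs(v)
    order = np.argsort(mag)[::-1]            # sort magnitudes in decreasing order
    z = mag[order] - w
    # Nonincreasing isotonic regression of z (pool-adjacent-violators).
    sums, lens = [], []
    for zi in z:
        sums.append(zi)
        lens.append(1)
        while len(sums) > 1 and sums[-1] / lens[-1] > sums[-2] / lens[-2]:
            s, l = sums.pop(), lens.pop()
            sums[-1] += s
            lens[-1] += l
    fitted = np.concatenate([np.full(l, s / l) for s, l in zip(sums, lens)])
    out = np.zeros_like(v, dtype=float)
    out[order] = np.maximum(fitted, 0.0)     # positive part, mapped back to original order
    return sign * out

def graph_slope(y, D, w, n_iter=500):
    """Forward-backward on the dual; returns the primal estimate beta."""
    u = np.zeros(D.shape[0])                 # one dual variable per edge
    tau = 1.0 / np.linalg.norm(D, 2) ** 2    # step size <= 1 / ||D||_op^2
    for _ in range(n_iter):
        v = u - tau * D @ (D.T @ u - y)      # forward step on 0.5*||y - D^T u||^2
        u = v - prox_sorted_l1(v, w)         # backward step: projection onto B* via Moreau
    return y - D.T @ u                       # primal solution beta = y - D^T u

# Toy usage: path graph on 5 nodes (4 edges), noisy piecewise-constant signal.
D = np.array([[-1.,  1.,  0.,  0.,  0.],
              [ 0., -1.,  1.,  0.,  0.],
              [ 0.,  0., -1.,  1.,  0.],
              [ 0.,  0.,  0., -1.,  1.]])
y = np.array([1.0, 1.1, 0.9, 3.0, 3.1])
w = np.linspace(1.0, 0.5, D.shape[0])        # illustrative nonincreasing weights
print(graph_slope(y, D, w))

The plain forward-backward scheme keeps the sketch short; an accelerated (FISTA-type) variant or a sparse representation of D would be natural refinements on larger graphs.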

Article information

Source
Electron. J. Statist., Volume 11, Number 2 (2017), 4851-4870.

Dates
Received: June 2017
First available in Project Euclid: 30 November 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1512032447

Digital Object Identifier
doi:10.1214/17-EJS1364

Mathematical Reviews number (MathSciNet)
MR3732915

Zentralblatt MATH identifier
1382.62014

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62J07: Ridge regression; shrinkage estimators

Keywords
Denoising, graph signal, regularization, oracle inequality, convex optimization

Rights
Creative Commons Attribution 4.0 International License.

Citation

Bellec, Pierre C.; Salmon, Joseph; Vaiter, Samuel. A sharp oracle inequality for Graph-Slope. Electron. J. Statist. 11 (2017), no. 2, 4851--4870. doi:10.1214/17-EJS1364. https://projecteuclid.org/euclid.ejs/1512032447

