The Annals of Statistics

Faithful variable screening for high-dimensional convex regression

Min Xu, Minhua Chen, and John Lafferty


We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the relevant variables. Our approach is a two-stage quadratic programming method that estimates a sum of one-dimensional convex functions, followed by one-dimensional concave regression fits on the residuals. In contrast to previous methods for sparse additive models, the optimization is finite dimensional and requires no tuning parameters for smoothness. Under appropriate assumptions, we prove that the procedure is faithful in the population setting, yielding no false negatives. We give a finite sample statistical analysis, and introduce algorithms for efficiently carrying out the required quadratic programs. The approach leads to computational and statistical advantages over fitting a full model, and provides an effective, practical approach to variable screening in convex regression.
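The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes each one-dimensional convex function is piecewise linear on a fixed grid of quantile knots (a hinge basis with non-negative weights), it omits any sparsity regularization the paper may use, and the function names, `n_knots`, and the threshold `tol` are hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.optimize import lsq_linear


def fit_additive_convex(X, y, n_knots=10):
    """Least-squares fit of y ~ c + sum_j f_j(x_j), each f_j convex piecewise linear.

    Returns an (n, p) array whose column j holds the centred fitted f_j.
    """
    n, p = X.shape
    blocks, lower = [np.ones((n, 1))], [-np.inf]      # free intercept
    for j in range(p):
        knots = np.quantile(X[:, j], np.linspace(0.05, 0.95, n_knots))
        hinges = np.maximum(X[:, [j]] - knots[None, :], 0.0)
        B = np.hstack([X[:, [j]], hinges])            # slope term plus hinge terms
        blocks.append(B - B.mean(axis=0))             # centre for identifiability
        lower += [-np.inf] + [0.0] * n_knots          # free slope, hinge weights >= 0
    A = np.hstack(blocks)
    lower = np.array(lower)
    upper = np.full(lower.shape, np.inf)
    coef = lsq_linear(A, y, bounds=(lower, upper)).x  # bounded least squares (a QP)
    width = n_knots + 1
    return np.column_stack([
        A[:, 1 + j * width: 1 + (j + 1) * width]
        @ coef[1 + j * width: 1 + (j + 1) * width]
        for j in range(p)
    ])


def screen(X, y, tol=0.05, n_knots=10):
    """Stage 1: additive convex fit. Stage 2: per-variable concave fits on residuals."""
    F = fit_additive_convex(X, y, n_knots)
    resid = y - y.mean() - F.sum(axis=1)
    # A concave fit to the residual is minus a convex fit to the negated residual.
    G = np.column_stack([
        -fit_additive_convex(X[:, [j]], -resid, n_knots)[:, 0]
        for j in range(X.shape[1])
    ])
    # Hypothetical rule: keep a variable if either fitted component is non-negligible.
    strength = np.maximum(F.std(axis=0), G.std(axis=0))
    return np.flatnonzero(strength > tol), strength


# Synthetic example (assumed data, not from the paper): only x_0 and x_1 matter.
rng = np.random.default_rng(0)
n, p = 300, 5
X = rng.uniform(-1.0, 1.0, size=(n, p))
y = X[:, 0] ** 2 + np.exp(X[:, 1]) + 0.05 * rng.standard_normal(n)
selected, strength = screen(X, y)
print(selected, np.round(strength, 3))
```

Because the sketch has no sparsity penalty, irrelevant variables can pick up small fitted components from noise, so the choice of `tol` matters here in a way it may not in the paper's procedure.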

Article information

Ann. Statist., Volume 44, Number 6 (2016), 2624-2660.

Received: November 2014
Revised: December 2015
First available in Project Euclid: 23 November 2016

Primary: 62G08: Nonparametric regression
Secondary: 52A41: Convex functions and convex programs [See also 26B25, 90C25]

Keywords: nonparametric regression; convex regression; variable selection; quadratic programming; additive model


Xu, Min; Chen, Minhua; Lafferty, John. Faithful variable screening for high-dimensional convex regression. Ann. Statist. 44 (2016), no. 6, 2624–2660. doi:10.1214/15-AOS1425.


Supplemental materials

  • Supplement to “Faithful variable screening for high-dimensional convex regression”. The supplement provides detailed proofs of certain technical results, together with further explanation of the Gaussian example and simplifications when the density is a product.