The Annals of Statistics

Adaptive piecewise polynomial estimation via trend filtering

Ryan J. Tibshirani

Full-text: Open access


We study trend filtering, a recently proposed tool of Kim et al. [SIAM Rev. 51 (2009) 339–360] for nonparametric regression. The trend filtering estimate is defined as the minimizer of a penalized least squares criterion, in which the penalty term sums the absolute $k$th order discrete derivatives over the input points. Perhaps not surprisingly, trend filtering estimates appear to have the structure of $k$th degree spline functions, with adaptively chosen knot points (we say “appear” here as trend filtering estimates are not really functions over continuous domains, and are only defined over the discrete set of inputs). This brings to mind comparisons to other nonparametric regression tools that also produce adaptive splines; in particular, we compare trend filtering to smoothing splines, which penalize the sum of squared derivatives across input points, and to locally adaptive regression splines [Ann. Statist. 25 (1997) 387–413], which penalize the total variation of the $k$th derivative. Empirically, we discover that trend filtering estimates adapt to the local level of smoothness much better than smoothing splines, and further, they exhibit a remarkable similarity to locally adaptive regression splines. We also provide theoretical support for these empirical findings; most notably, we prove that (with the right choice of tuning parameter) the trend filtering estimate converges to the true underlying function at the minimax rate for functions whose $k$th derivative is of bounded variation. This is done via an asymptotic pairing of trend filtering and locally adaptive regression splines, which have already been shown to converge at the minimax rate [Ann. Statist. 25 (1997) 387–413]. At the core of this argument is a new result tying together the fitted values of two lasso problems that share the same outcome vector, but have different predictor matrices.
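The penalized least squares criterion described above can be sketched in a few lines of code. The sketch below solves the trend filtering problem, minimizing $\frac{1}{2}\|y - \beta\|_2^2 + \lambda \|D^{(k+1)}\beta\|_1$ with $D^{(k+1)}$ the $(k+1)$st order discrete difference operator, via a generic ADMM loop — an assumed solver choice for illustration, not the primal-dual interior-point method of Kim et al. The function names and default parameters are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the prox of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def trend_filter(y, k=1, lam=1.0, rho=1.0, n_iter=500):
    """Sketch of k-th order trend filtering via ADMM (illustrative solver).

    Minimizes (1/2)||y - beta||^2 + lam * ||D beta||_1, where D is the
    (k+1)-th order discrete difference operator on the input points.
    """
    n = len(y)
    # Rows of D take (k+1)-th order differences; for k = 1 each row is
    # (..., 1, -2, 1, ...), so the penalty encourages piecewise-linear fits.
    D = np.diff(np.eye(n), n=k + 1, axis=0)
    beta = y.copy()
    z = D @ beta
    u = np.zeros_like(z)
    A = np.eye(n) + rho * D.T @ D  # system matrix for the beta-update
    for _ in range(n_iter):
        beta = np.linalg.solve(A, y + rho * D.T @ (z - u))
        z = soft_threshold(D @ beta + u, lam / rho)  # prox step on D@beta
        u = u + D @ beta - z                         # dual update
    return beta
```

For $k = 1$ the result is the piecewise linear ("$\ell_1$ trend filtering") estimate of Kim et al.; larger $\lambda$ yields fewer knots. A production implementation would cache a banded factorization of the system matrix rather than calling a dense solver each iteration.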

Article information

Ann. Statist., Volume 42, Number 1 (2014), 285–323.

First available in Project Euclid: 19 March 2014

Primary: 62G08 (nonparametric regression); 62G20 (asymptotic properties)

Keywords: trend filtering; nonparametric regression; smoothing splines; locally adaptive regression splines; minimax convergence rate; lasso; stability


Tibshirani, Ryan J. Adaptive piecewise polynomial estimation via trend filtering. Ann. Statist. 42 (2014), no. 1, 285--323. doi:10.1214/13-AOS1189.


  • Cohen, A., Daubechies, I. and Vial, P. (1993). Wavelets on the interval and fast wavelet transforms. Appl. Comput. Harmon. Anal. 1 54–81.
  • de Boor, C. (1978). A Practical Guide to Splines. Applied Mathematical Sciences 27. Springer, New York.
  • DeVore, R. A. and Lorentz, G. G. (1993). Constructive Approximation. Springer, Berlin.
  • Donoho, D. L. and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc. 90 1200–1224.
  • Donoho, D. L. and Johnstone, I. M. (1998). Minimax estimation via wavelet shrinkage. Ann. Statist. 26 879–921.
  • Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407–499.
  • Elad, M., Milanfar, P. and Rubinstein, R. (2007). Analysis versus synthesis in signal priors. Inverse Problems 23 947–968.
  • Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman & Hall, London.
  • Green, P. J. and Yandell, B. W. (1985). Semi-parametric generalized linear models. In GLIM85: Proceedings of the International Conference on Generalized Linear Models, September 1985 (R. Gilchrist, ed.). Lecture Notes in Statistics 32 44–55. Springer, New York.
  • Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman & Hall, London.
  • Hastie, T., Tibshirani, R. and Friedman, J. (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York.
  • Johnstone, I. (2011). Gaussian estimation: Sequence and wavelet models. Under contract to Cambridge Univ. Press. Online version at
  • Kim, S.-J., Koh, K., Boyd, S. and Gorinevsky, D. (2009). $l_1$ trend filtering. SIAM Rev. 51 339–360.
  • Mallat, S. (2008). A Wavelet Tour of Signal Processing, 3rd ed. Elsevier/Academic Press, Amsterdam.
  • Mammen, E. and van de Geer, S. (1997). Locally adaptive regression splines. Ann. Statist. 25 387–413.
  • Nussbaum, M. (1985). Spline smoothing in regression models and asymptotic efficiency in $L_2$. Ann. Statist. 13 984–997.
  • Osborne, M. R., Presnell, B. and Turlach, B. A. (1998). Knot selection for regression splines via the Lasso. In Dimension Reduction, Computational Complexity, and Information (S. Weisberg, ed.). Computing Science and Statistics 30 44–49. Interface Foundation of North America, Inc., Fairfax Station, VA.
  • Rosset, S. and Zhu, J. (2007). Piecewise linear regularized solution paths. Ann. Statist. 35 1012–1030.
  • Rudin, L. I., Osher, S. and Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Phys. D 60 259–268.
  • Tibshirani, R. J. (2014). Supplement to “Adaptive piecewise polynomial estimation via trend filtering.” DOI:10.1214/13-AOS1189SUPP.
  • Tibshirani, R. J. and Arnold, T. (2013). Efficient implementations of the generalized lasso dual path algorithm. Unpublished manuscript.
  • Tibshirani, R. J. and Taylor, J. (2011). The solution path of the generalized lasso. Ann. Statist. 39 1335–1371.
  • Tibshirani, R. J. and Taylor, J. (2012). Degrees of freedom in lasso problems. Ann. Statist. 40 1198–1232.
  • Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005). Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 91–108.
  • Wahba, G. (1990). Spline Models for Observational Data. SIAM, Philadelphia, PA.
  • Wang, X., Du, P. and Shen, J. (2013). Smoothing splines with varying smoothing parameter. Biometrika 100 955–970.

Supplemental materials

  • Supplementary material: Supplement to “Adaptive piecewise polynomial estimation via trend filtering”. We provide proofs for the results in Sections 3 and 4. We also present the underlying theoretical framework needed to establish the convergence rates in Section 5. Finally, we discuss an extension of trend filtering to the case of arbitrary input points.