The Annals of Statistics

Characterizing $L_{2}$Boosting

John Ehrlinger and Hemant Ishwaran



We consider $L_{2}$Boosting, a special case of Friedman’s generic boosting algorithm applied to linear regression under $L_{2}$-loss. We study $L_{2}$Boosting for an arbitrary regularization parameter and derive an exact closed-form expression for the number of steps taken along a fixed coordinate direction. This relationship is used to describe $L_{2}$Boosting’s solution path, to describe new tools for studying its path, and to characterize some of the algorithm’s unique properties, including active set cycling, a property where the algorithm spends lengthy periods of time cycling between the same coordinates when the regularization parameter is arbitrarily small. Our fixed descent analysis also reveals a repressible condition that limits the effectiveness of $L_{2}$Boosting in correlated problems by preventing desirable variables from entering the solution path. As a simple remedy, a data augmentation method similar to that used for the elastic net is used to introduce $L_{2}$-penalization and is shown, in combination with decorrelation, to reverse the repressible condition and circumvent $L_{2}$Boosting’s deficiencies in correlated problems. In itself, this presents a new explanation for why the elastic net is successful in correlated problems and why methods like LAR and lasso can perform poorly in such settings.
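For orientation, the procedure summarized above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `standardize`, `l2boost` and `augment`, the step size `nu` (standing in for the regularization parameter) and the penalty `lam` are assumptions made here. The sketch assumes centered, unit-norm predictors and takes the augmentation from the elastic net construction of Zou and Hastie (2005); the paper's own construction, and its decorrelation step, may differ.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit Euclidean norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc, axis=0)

def l2boost(X, y, nu=0.1, n_steps=200):
    """Componentwise L2Boosting sketch (hypothetical helper, not the paper's code).

    Assumes the columns of X are centered with unit norm and y is centered.
    Returns the coefficient vector and the sequence of selected coordinates.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = np.asarray(y, dtype=float).copy()
    path = []
    for _ in range(n_steps):
        grad_corr = X.T @ resid                 # gradient-correlation of each coordinate
        j = int(np.argmax(np.abs(grad_corr)))   # steepest coordinate direction
        step = nu * grad_corr[j]                # shrunken least-squares step
        beta[j] += step
        resid -= step * X[:, j]
        path.append(j)
    return beta, path

def augment(X, y, lam):
    """Elastic-net-style data augmentation (assumed form, after Zou and Hastie).

    Appends sqrt(lam) * I to X and zeros to y, then rescales so that
    unit-norm columns of X stay unit-norm in the augmented design.
    """
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)]) / np.sqrt(1.0 + lam)
    y_aug = np.concatenate([np.asarray(y, dtype=float), np.zeros(p)])
    return X_aug, y_aug
```

Runs of the same index in the returned `path` correspond to consecutive steps along a fixed coordinate direction, which is the quantity the abstract refers to. Calling `l2boost(*augment(X, y, lam))` applies the same iteration to the penalized problem.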

Article information

Ann. Statist., Volume 40, Number 2 (2012), 1074-1101.

First available in Project Euclid: 18 July 2012


Primary: 62J05: Linear regression
Secondary: 62J99: None of the above, but in this section

Keywords: critical direction, gradient-correlation, regularization, repressibility, solution path


Ehrlinger, John; Ishwaran, Hemant. Characterizing $L_{2}$Boosting. Ann. Statist. 40 (2012), no. 2, 1074--1101. doi:10.1214/12-AOS997.



  • Bühlmann, P. (2006). Boosting for high-dimensional linear models. Ann. Statist. 34 559–583.
  • Bühlmann, P. and Yu, B. (2003). Boosting with the $L_2$ loss: Regression and classification. J. Amer. Statist. Assoc. 98 324–339.
  • Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussion, and a rejoinder by the authors). Ann. Statist. 32 407–499.
  • Ehrlinger, J. (2011). Regularization: Stagewise regression and bagging. Ph.D. thesis, Case Western Reserve Univ., Cleveland, OH.
  • Ehrlinger, J. and Ishwaran, H. (2012). Supplement to “Characterizing $L_2$Boosting.” DOI:10.1214/12-AOS997SUPP.
  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Ann. Statist. 29 1189–1232.
  • Hastie, T. (2007). Comment on “Boosting algorithms: Regularization, prediction and model fitting.” Statist. Sci. 22 513–515.
  • Hastie, T., Taylor, J., Tibshirani, R. and Walther, G. (2007). Forward stagewise regression and the monotone lasso. Electron. J. Stat. 1 1–29.
  • Mallat, S. and Zhang, Z. (1993). Matching pursuits with time–frequency dictionaries. IEEE Trans. Signal Proc. 41 3397–3415.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
  • Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 301–320.

Supplemental materials

  • Supplementary material: Proofs of results from “Characterizing $L_{2}$Boosting”. An online supplementary file contains the detailed proofs for Theorems 1 through 9. These proofs use notation defined in the paper.