Characterizing $L_{2}$Boosting

John Ehrlinger; Hemant Ishwaran

doi:10.1214/12-AOS997

Ann. Statist. 40(2): 1074-1101 (April 2012). DOI: 10.1214/12-AOS997

Abstract

We consider $L_{2}$Boosting, a special case of Friedman’s generic boosting algorithm applied to linear regression under $L_{2}$-loss. We study $L_{2}$Boosting for an arbitrary regularization parameter and derive an exact closed-form expression for the number of steps taken along a fixed coordinate direction. This relationship is used to describe $L_{2}$Boosting’s solution path, to describe new tools for studying its path, and to characterize some of the algorithm’s unique properties, including active set cycling, a property where the algorithm spends lengthy periods of time cycling between the same coordinates when the regularization parameter is arbitrarily small. Our fixed descent analysis also reveals a repressible condition that limits the effectiveness of $L_{2}$Boosting in correlated problems by preventing desirable variables from entering the solution path. As a simple remedy, a data augmentation method similar to that used for the elastic net is used to introduce $L_{2}$-penalization and is shown, in combination with decorrelation, to reverse the repressible condition and circumvent $L_{2}$Boosting’s deficiencies in correlated problems. In itself, this presents a new explanation for why the elastic net is successful in correlated problems and why methods like LAR and lasso can perform poorly in such settings.