Electronic Journal of Statistics

Robust regression through the Huber’s criterion and adaptive lasso penalty

Sophie Lambert-Lacroix and Laurent Zwald



Huber’s criterion is a useful method for robust regression. The adaptive least absolute shrinkage and selection operator (lasso) is a popular technique for simultaneous estimation and variable selection; its adaptive weights allow it to achieve the oracle properties. In this paper we propose combining Huber’s criterion with an adaptive lasso penalty. The resulting regression technique is resistant to heavy-tailed errors and to outliers in the response. Furthermore, we show that the estimator associated with this procedure enjoys the oracle properties. The approach is compared with the LAD-lasso, which combines least absolute deviation with the adaptive lasso penalty. Extensive simulation studies demonstrate satisfactory finite-sample performance of the procedure. A real example is analyzed for illustration.
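
As a rough illustration of the combination described in the abstract, the following Python sketch minimizes a Huber-type loss plus a weighted L1 penalty. It is a simplified sketch and not the authors' procedure: it fixes the scale at 1 instead of estimating a concomitant scale, omits the intercept, and uses a plain proximal-gradient loop chosen here for brevity. The function names, the threshold M = 1.345, the tuning values, and the toy data are all assumptions made for the example.

```python
import random

M = 1.345  # Huber threshold (assumed default; the scale is fixed at 1 here)

def huber(z, m=M):
    """Huber function: quadratic near zero, linear in the tails."""
    return z * z if abs(z) <= m else 2.0 * m * abs(z) - m * m

def huber_deriv(z, m=M):
    """Derivative of huber(): 2z clipped to [-2m, 2m]."""
    return 2.0 * max(-m, min(m, z))

def soft_threshold(v, t):
    """Proximal operator of t * |.| (shrinks v toward zero by t)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_huber_lasso(X, y, lam, weights, n_iter=500):
    """Proximal-gradient minimization of
    sum_i huber(y_i - x_i . beta) + lam * sum_j weights_j * |beta_j|."""
    p = len(X[0])
    # 2 * ||X||_F^2 bounds the Lipschitz constant of the loss gradient.
    step = 1.0 / (2.0 * sum(x * x for row in X for x in row))
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for row, yi in zip(X, y):
            d = huber_deriv(yi - sum(x * b for x, b in zip(row, beta)))
            for j in range(p):
                grad[j] -= d * row[j]
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam * weights[j])
                for j in range(p)]
    return beta

def fit_adaptive_huber_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Adaptive weights 1 / |pilot_j|^gamma from an unpenalized Huber fit."""
    p = len(X[0])
    pilot = fit_huber_lasso(X, y, 0.0, [1.0] * p)
    weights = [1.0 / (abs(b) + eps) ** gamma for b in pilot]
    return fit_huber_lasso(X, y, lam, weights)

# Toy data (assumed): +/-1 design, two active coefficients, a few gross outliers.
random.seed(0)
X = [[1.0 if (i >> j) & 1 else -1.0 for j in range(4)] for i in range(64)]
beta_true = [3.0, 0.0, -2.0, 0.0]
y = [sum(x * b for x, b in zip(row, beta_true))
     + random.gauss(0.0, 0.2)
     + (25.0 if i % 13 == 0 else 0.0)  # outliers in the response
     for i, row in enumerate(X)]

beta_hat = fit_adaptive_huber_lasso(X, y, lam=10.0)
print(beta_hat)
```

With these settings the penalty is expected to set the two inactive coefficients exactly to zero while leaving the active ones close to their true values, despite the outlying responses. In practice the tuning parameter and the scale would need to be chosen from the data rather than fixed by hand.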

Article information

Electron. J. Statist., Volume 5 (2011), 1015-1053.

First available in Project Euclid: 15 September 2011


Keywords: adaptive lasso, concomitant scale, Huber’s criterion, oracle property, robust estimation


Lambert-Lacroix, Sophie; Zwald, Laurent. Robust regression through the Huber’s criterion and adaptive lasso penalty. Electron. J. Statist. 5 (2011), 1015–1053. doi:10.1214/11-EJS635. https://projecteuclid.org/euclid.ejs/1316092867


