Annales de l'Institut Henri Poincaré, Probabilités et Statistiques

Adaptive Dantzig density estimation

K. Bertin, E. Le Pennec, and V. Rivoirard

Abstract

The aim of this paper is to build an estimate of an unknown density as a linear combination of functions of a dictionary. Inspired by Candès and Tao’s approach, we propose a minimization of the 1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint coming from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
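To make the construction described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a histogram dictionary on [0, 1), empirical coefficients computed as sample means of the dictionary functions, and a Bernstein-type componentwise threshold `eta` chosen purely for illustration; the paper's calibrated constraint may differ. The function names `dantzig_coefficients` and `histogram_dictionary` and the constant `gamma` are hypothetical. The 1-norm minimization under the Dantzig-type constraint is written as a linear program and solved with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def dantzig_coefficients(gram, beta_hat, eta):
    """Minimize ||lam||_1 subject to |gram @ lam - beta_hat| <= eta componentwise."""
    m = len(beta_hat)
    # Write lam = u - v with u, v >= 0; at the optimum sum(u) + sum(v) = ||lam||_1.
    cost = np.ones(2 * m)
    a_ub = np.vstack([np.hstack([gram, -gram]), np.hstack([-gram, gram])])
    b_ub = np.concatenate([beta_hat + eta, eta - beta_hat])
    res = linprog(cost, A_ub=a_ub, b_ub=b_ub, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m] - res.x[m:]


def histogram_dictionary(x, m):
    """Evaluate the m atoms sqrt(m) * 1[k/m, (k+1)/m) at the points x in [0, 1)."""
    bins = np.minimum((x * m).astype(int), m - 1)
    phi = np.zeros((len(x), m))
    phi[np.arange(len(x)), bins] = np.sqrt(m)
    return phi


# Illustrative data: 500 draws from a Beta(2, 5) density (an assumption of this sketch).
rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=500)
n, m = len(x), 16

phi = histogram_dictionary(x, m)
beta_hat = phi.mean(axis=0)   # empirical coefficients of the density on each atom
gram = np.eye(m)              # histogram atoms are orthonormal in L2([0, 1))

# Assumed Bernstein-type threshold: a variance term plus a sup-norm term.
gamma = 1.0                   # hypothetical tuning constant
sigma2 = phi.var(axis=0)
eta = (np.sqrt(2 * sigma2 * gamma * np.log(m) / n)
       + 2 * np.sqrt(m) * gamma * np.log(m) / (3 * n))

lam = dantzig_coefficients(gram, beta_hat, eta)
```

The resulting density estimate is the linear combination of the atoms with coefficients `lam`. With an orthonormal dictionary, as here, the Gram matrix is the identity and the program reduces to soft-thresholding each empirical coefficient at its own threshold; a redundant dictionary only changes `gram`.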

Article information

Source
Ann. Inst. H. Poincaré Probab. Statist. Volume 47, Number 1 (2011), 43-74.

Dates
First available in Project Euclid: 4 January 2011

Permanent link to this document
https://projecteuclid.org/euclid.aihp/1294170229

Digital Object Identifier
doi:10.1214/09-AIHP351

Mathematical Reviews number (MathSciNet)
MR2779396

Zentralblatt MATH identifier
1207.62077

Subjects
Primary: 62G07: Density estimation; 62G05: Estimation; 62G20: Asymptotic properties

Keywords
Calibration; Concentration inequalities; Dantzig estimate; Density estimation; Dictionary; Lasso estimate; Oracle inequalities; Sparsity

Citation

Bertin, K.; Le Pennec, E.; Rivoirard, V. Adaptive Dantzig density estimation. Ann. Inst. H. Poincaré Probab. Statist. 47 (2011), no. 1, 43–74. doi:10.1214/09-AIHP351. https://projecteuclid.org/euclid.aihp/1294170229


References

  • [1] S. Arlot and P. Massart. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10 (2009) 245–279.
  • [2] M. S. Asif and J. Romberg. Dantzig selector homotopy with dynamic measurements. In Proceedings of SPIE Computational Imaging VII 7246 (2009) 72460E.
  • [3] P. Bickel, Y. Ritov and A. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37 (2009) 1705–1732.
  • [4] L. Birgé. Model selection for density estimation with L2-loss, 2008. Available at arXiv 0808.1416.
  • [5] L. Birgé and P. Massart. Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138 (2007) 33–73.
  • [6] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation and sparsity via ℓ1 penalized least squares. In Learning Theory 379–391. Lecture Notes in Comput. Sci. 4005. Springer, Berlin, 2006.
  • [7] F. Bunea, A. Tsybakov and M. Wegkamp. Sparse density estimation with ℓ1 penalties. In Learning Theory 530–543. Lecture Notes in Comput. Sci. 4539. Springer, Berlin, 2007.
  • [8] F. Bunea, A. Tsybakov and M. Wegkamp. Sparsity oracle inequalities for the LASSO. Electron. J. Statist. 1 (2007) 169–194.
  • [9] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation for Gaussian regression. Ann. Statist. 35 (2007) 1674–1697.
  • [10] F. Bunea, A. Tsybakov, M. Wegkamp and A. Barbu. Spades and mixture models. Ann. Statist. (2010). To appear. Available at arXiv 0901.2044.
  • [11] F. Bunea. Consistent selection via the Lasso for high dimensional approximating regression models. In Pushing the Limits of Contemporary Statistics: Contributions in Honor of J. K. Ghosh 122–137. Inst. Math. Stat. Collect. 3. IMS, Beachwood, OH, 2008.
  • [12] E. Candès and Y. Plan. Near-ideal model selection by ℓ1 minimization. Ann. Statist. 37 (2009) 2145–2177.
  • [13] E. Candès and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist. 35 (2007) 2313–2351.
  • [14] S. Chen, D. Donoho and M. Saunders. Atomic decomposition by basis pursuit. SIAM Rev. 43 (2001) 129–159.
  • [15] D. Donoho, M. Elad and V. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans. Inform. Theory 52 (2006) 6–18.
  • [16] D. Donoho and I. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika 81 (1994) 425–455.
  • [17] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani. Least angle regression. Ann. Statist. 32 (2004) 407–499.
  • [18] A. Juditsky and S. Lambert-Lacroix. On minimax density estimation on ℝ. Bernoulli 10 (2004) 187–220.
  • [19] K. Knight and W. Fu. Asymptotics for Lasso-type estimators. Ann. Statist. 28 (2000) 1356–1378.
  • [20] K. Lounici. Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Statist. 2 (2008) 90–102.
  • [21] P. Massart. Concentration inequalities and model selection. Lecture Notes in Math. 1896. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003.
  • [22] N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 (2006) 1436–1462.
  • [23] N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 (2009) 246–270.
  • [24] M. Osborne, B. Presnell and B. Turlach. On the Lasso and its dual. J. Comput. Graph. Statist. 9 (2000) 319–337.
  • [25] M. Osborne, B. Presnell and B. Turlach. A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389–404.
  • [26] P. Reynaud-Bouret and V. Rivoirard. Near optimal thresholding estimation of a Poisson intensity on the real line. Electron. J. Statist. 4 (2010) 172–238.
  • [27] P. Reynaud-Bouret, V. Rivoirard and C. Tuleau. Adaptive density estimation: A curse of support? 2009. Available at arXiv 0907.1794.
  • [28] R. Tibshirani. Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 (1996) 267–288.
  • [29] S. van de Geer. High-dimensional generalized linear models and the Lasso. Ann. Statist. 36 (2008) 614–645.
  • [30] B. Yu and P. Zhao. On model selection consistency of Lasso estimators. J. Mach. Learn. Res. 7 (2006) 2541–2567.
  • [31] C. Zhang and J. Huang. The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann. Statist. 36 (2008) 1567–1594.
  • [32] H. Zou. The adaptive Lasso and its oracle properties. J. Amer. Statist. Assoc. 101 (2006) 1418–1429.