The Annals of Statistics

Approximation by log-concave distributions, with applications to regression

Lutz Dümbgen, Richard Samworth, and Dominic Schuhmacher

Full-text: Open access

Abstract

We study the approximation of arbitrary distributions P on d-dimensional space by distributions with log-concave density. Approximation means minimizing a Kullback–Leibler-type functional. We show that such an approximation exists if and only if P has finite first moments and is not supported by any hyperplane. Furthermore, we show that this approximation depends continuously on P with respect to Mallows distance D1(⋅, ⋅). This result implies consistency of the maximum likelihood estimator of a log-concave density under fairly general conditions. It also allows us to prove existence and consistency of estimators in regression models with a response Y = μ(X) + ε, where X and ε are independent, μ(⋅) belongs to a certain class of regression functions, and ε is a random error with log-concave density and mean zero.
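For intuition about the Mallows distance D1 named in the abstract: D1(P, Q) is the smallest expected distance E‖X − Y‖ over all couplings of X ∼ P and Y ∼ Q, and is also known as the 1-Wasserstein or Kantorovich distance. The sketch below is an illustration only and is not taken from the paper. The function name mallows_d1 is hypothetical, and the code assumes two one-dimensional samples of equal size, in which case the optimal coupling simply pairs the order statistics.

```python
def mallows_d1(xs, ys):
    """D1 between the empirical distributions of two equal-size 1-D samples.

    Illustrative helper (hypothetical name); assumes real-valued samples
    of the same, non-zero length.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal coupling pairs the i-th smallest point
    # of one sample with the i-th smallest point of the other.
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


sample = [0.0, 1.0, 3.0]
shifted = [x + 2.0 for x in sample]   # same shape, moved right by 2
print(mallows_d1(sample, shifted))    # prints 2.0
```

A pure location shift by 2 moves every point by the same amount, so the distance equals the size of the shift. This sensitivity to first moments is consistent with the abstract's requirement that P have finite first moments.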

Article information

Source
Ann. Statist., Volume 39, Number 2 (2011), 702–730.

Dates
First available in Project Euclid: 9 March 2011

Permanent link to this document
https://projecteuclid.org/euclid.aos/1299680952

Digital Object Identifier
doi:10.1214/10-AOS853

Mathematical Reviews number (MathSciNet)
MR2816336

Zentralblatt MATH identifier
1216.62023

Subjects
Primary: 62E17: Approximations to distributions (nonasymptotic); 62G05: Estimation; 62G07: Density estimation; 62G08: Nonparametric regression; 62G35: Robustness; 62H12: Estimation

Keywords
Convex support; isotonic regression; linear regression; Mallows distance; projection; weak semicontinuity

Citation

Dümbgen, Lutz; Samworth, Richard; Schuhmacher, Dominic. Approximation by log-concave distributions, with applications to regression. Ann. Statist. 39 (2011), no. 2, 702–730. doi:10.1214/10-AOS853. https://projecteuclid.org/euclid.aos/1299680952


