Electronic Journal of Statistics

Estimation of a distribution from data with small measurement errors

Ann-Kathrin Bott, Luc Devroye, and Michael Kohler

Full-text: Open access

Abstract

In this paper we study the problem of estimating a distribution from data that contain small measurement errors. The only assumption on these errors is that the average absolute measurement error converges to zero with probability one as the sample size tends to infinity. In particular, we do not assume that the measurement errors are independent with expectation zero. Throughout the paper we assume that the distribution to be estimated has a density with respect to the Lebesgue-Borel measure.

We show that the empirical measure based on the data with measurement error leads to a uniformly consistent estimate of the distribution function. Furthermore, we show that in general no estimate is consistent in the total variation sense for all distributions under the above assumptions. However, if the average measurement error converges to zero faster than a properly chosen sequence of bandwidths, the total variation error of the distribution estimate corresponding to a kernel density estimate converges to zero for all distributions. For a general additive error model we show that this result holds even if only the average measurement error converges to zero. The results are applied to estimating the density of residuals in a random design regression model in which the residual error is not independent of the predictor.
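The two positive results above can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the paper: the true distribution is standard normal, every observation carries the same deterministic error $n^{-1/2}$ (so the errors are neither independent nor centered, but their average absolute value tends to zero), the kernel is Gaussian, and the bandwidth is $h_n = n^{-1/5}$, which tends to zero more slowly than the error. All function names are hypothetical. The code reports the sup-norm error of the empirical distribution function and the total variation error of the kernel estimate, computed as half the $L_{1}$ distance between the densities.

```python
import math
import random


def normal_cdf(x):
    """Distribution function of the standard normal (assumed true distribution)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Density of the standard normal; also used as the kernel."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def sup_cdf_error(sample, true_cdf):
    """Sup-norm distance between the empirical CDF of `sample` and `true_cdf`.

    Assumes no ties in the sample, which holds with probability one
    for data drawn from a density.
    """
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = true_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        worst = max(worst, abs((i + 1) / n - f), abs(i / n - f))
    return worst


def kde(sample, h, x):
    """Gaussian kernel density estimate at x with bandwidth h."""
    return sum(normal_pdf((x - s) / h) for s in sample) / (len(sample) * h)


def tv_error(sample, h, true_pdf, lo=-6.0, hi=6.0, steps=240):
    """Half the L1 distance between the kernel estimate and `true_pdf`.

    The integral is approximated by a midpoint rule on [lo, hi].
    """
    dx = (hi - lo) / steps
    l1 = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        l1 += abs(kde(sample, h, x) - true_pdf(x)) * dx
    return 0.5 * l1


def simulate(n, seed=0):
    """Return (sup-norm CDF error, total variation error) for sample size n."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n)]
    shift = n ** -0.5            # assumed error: constant, not centered
    noisy = [x + shift for x in clean]
    h = n ** -0.2                # assumed bandwidth, slower than the error
    return sup_cdf_error(noisy, normal_cdf), tv_error(noisy, h, normal_pdf)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        sup_err, tv_err = simulate(n)
        print(n, round(sup_err, 4), round(tv_err, 4))
```

Both errors should tend to shrink as `n` grows, although any single run is random and need not decrease monotonically. The sketch does not address the negative result, which concerns the impossibility of total variation consistency for all distributions without a rate condition on the errors.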

Article information

Source
Electron. J. Statist., Volume 7 (2013), 2457-2476.

Dates
First available in Project Euclid: 2 October 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1380719362

Digital Object Identifier
doi:10.1214/13-EJS850

Mathematical Reviews number (MathSciNet)
MR3117103

Zentralblatt MATH identifier
1293.62068

Subjects
Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties

Keywords
Density estimation; distribution estimation; total variation error; $L_{1}$ error; measurement errors; nonparametric regression; residuals; universal consistency

Citation

Bott, Ann-Kathrin; Devroye, Luc; Kohler, Michael. Estimation of a distribution from data with small measurement errors. Electron. J. Statist. 7 (2013), 2457--2476. doi:10.1214/13-EJS850. https://projecteuclid.org/euclid.ejs/1380719362


