Electronic Journal of Statistics

Empirical measures for incomplete data with applications

Shojaeddin Chenouri, Majid Mojirsheibani, and Zahra Montazeri

Abstract

Methods are proposed to construct empirical measures when there are missing terms among the components of a random vector. Furthermore, Vapnik-Chervonenkis type exponential bounds are obtained on the uniform deviations of these estimators from the true probabilities. These results can then be used to deal with classical problems such as statistical classification, via empirical risk minimization, when there are missing covariates among the data. Another application involves the uniform estimation of a distribution function.

Article information

Source
Electron. J. Statist., Volume 3 (2009), 1021-1038.

Dates
First available in Project Euclid: 13 October 2009

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1255440399

Digital Object Identifier
doi:10.1214/09-EJS420

Mathematical Reviews number (MathSciNet)
MR2557127

Zentralblatt MATH identifier
1326.62206

Subjects
Primary: 60G50: Sums of independent random variables; random walks
Primary: 62G15: Tolerance and confidence regions
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords
Exponential bounds, Vapnik-Chervonenkis, distribution function, classification, consistency

Citation

Chenouri, Shojaeddin; Mojirsheibani, Majid; Montazeri, Zahra. Empirical measures for incomplete data with applications. Electron. J. Statist. 3 (2009), 1021–1038. doi:10.1214/09-EJS420. https://projecteuclid.org/euclid.ejs/1255440399

