The Annals of Statistics

Large-sample study of the kernel density estimators under multiplicative censoring

Masoud Asgharian, Marco Carone, and Vahid Fakoor

Full-text: Open access

Abstract

The multiplicative censoring model introduced in Vardi [Biometrika 76 (1989) 751–761] is an incomplete data problem whereby two independent samples from the lifetime distribution $G$, $\mathcal{X}_{m}=(X_{1},\ldots,X_{m})$ and $\mathcal{Z}_{n}=(Z_{1},\ldots,Z_{n})$, are observed subject to a form of coarsening. Specifically, sample $\mathcal{X}_{m}$ is fully observed while $\mathcal{Y}_{n}=(Y_{1},\ldots,Y_{n})$ is observed instead of $\mathcal{Z}_{n}$, where $Y_{i}=U_{i}Z_{i}$ and $(U_{1},\ldots,U_{n})$ is an independent sample from the standard uniform distribution. Vardi [Biometrika 76 (1989) 751–761] showed that this model unifies several important statistical problems, such as the deconvolution of an exponential random variable, estimation under a decreasing density constraint and an estimation problem in renewal processes. In this paper, we establish the large-sample properties of kernel density estimators under the multiplicative censoring model. We first construct a strong approximation for the process $\sqrt{k}(\hat{G}-G)$, where $\hat{G}$ is a solution of the nonparametric score equation based on $(\mathcal{X}_{m},\mathcal{Y}_{n})$, and $k=m+n$ is the total sample size. Using this strong approximation and a result on the global modulus of continuity, we establish conditions for the strong uniform consistency of kernel density estimators. We also make use of this strong approximation to study the weak convergence and integrated squared error properties of these estimators. We conclude by extending our results to the setting of length-biased sampling.
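The sampling scheme described in the abstract can be illustrated with a short simulation. The sketch below is not the authors' estimator: it only generates data of the form $(\mathcal{X}_{m},\mathcal{Y}_{n})$ and, for comparison, computes a naive Gaussian kernel density estimate from the fully observed sample alone. The function names, the choice of a unit-rate exponential for $G$, the sample sizes and the bandwidth `h` are all assumptions made for illustration.

```python
import math
import random


def simulate_multiplicative_censoring(m, n, draw_lifetime, rng):
    """Return (x, y, z): x is fully observed, y[i] = u[i] * z[i] replaces z[i].

    draw_lifetime is a hypothetical callable that draws one value from G.
    z is returned only so the simulation can be checked; it is latent in practice.
    """
    x = [draw_lifetime(rng) for _ in range(m)]
    z = [draw_lifetime(rng) for _ in range(n)]
    u = [rng.random() for _ in range(n)]  # standard uniform multipliers
    y = [ui * zi for ui, zi in zip(u, z)]
    return x, y, z


def naive_kde(sample, h):
    """Gaussian kernel density estimate built from one sample with bandwidth h.

    This ignores the censored sample entirely, so it is only a baseline and
    not the estimator studied in the paper.
    """
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def f_hat(t):
        return norm * sum(math.exp(-0.5 * ((t - s) / h) ** 2) for s in sample)

    return f_hat


# Assumed illustration: G is Exponential(1), m = 200, n = 300, h = 0.4.
rng = random.Random(0)
x, y, z = simulate_multiplicative_censoring(
    200, 300, lambda r: r.expovariate(1.0), rng
)
f_hat = naive_kde(x, h=0.4)
```

Each observed $Y_{i}$ is never larger than the latent $Z_{i}$ it replaces, which is why the second sample carries only partial information about $G$. The estimators studied in the paper are instead based on $\hat{G}$, which uses both samples.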

Article information

Source
Ann. Statist., Volume 40, Number 1 (2012), 159-187.

Dates
First available in Project Euclid: 15 March 2012

Permanent link to this document
https://projecteuclid.org/euclid.aos/1331830778

Digital Object Identifier
doi:10.1214/11-AOS954

Mathematical Reviews number (MathSciNet)
MR3013183

Zentralblatt MATH identifier
1246.62094

Subjects
Primary: 62N01: Censored data models
Secondary: 62G07: Density estimation

Keywords
Integrated squared error; kernel density estimation; length-biased sampling; modulus of continuity; multiplicative censoring; strong approximation

Citation

Asgharian, Masoud; Carone, Marco; Fakoor, Vahid. Large-sample study of the kernel density estimators under multiplicative censoring. Ann. Statist. 40 (2012), no. 1, 159–187. doi:10.1214/11-AOS954. https://projecteuclid.org/euclid.aos/1331830778



References

  • [1] Asgharian, M., Carone, M. and Fakoor, V. (2012). Supplement to “Large-sample study of the kernel density estimators under multiplicative censoring.” DOI:10.1214/11-AOS954SUPP.
  • [2] Asgharian, M., M’Lan, C. E. and Wolfson, D. B. (2002). Length-biased sampling with right censoring: An unconditional approach. J. Amer. Statist. Assoc. 97 201–209.
  • [3] Asgharian, M. and Wolfson, D. B. (2005). Asymptotic behavior of the unconditional NPMLE of the length-biased survivor function from right censored prevalent cohort data. Ann. Statist. 33 2109–2131.
  • [4] Asgharian, M., Wolfson, D. B. and Zhang, X. (2006). Checking stationarity of the incidence rate using prevalent cohort survival data. Stat. Med. 25 1751–1767.
  • [5] Bergeron, P.-J., Asgharian, M. and Wolfson, D. B. (2008). Covariate bias induced by length-biased sampling of failure times. J. Amer. Statist. Assoc. 103 737–742.
  • [6] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Inference in Semiparametric Models. Johns Hopkins Univ. Press, Baltimore.
  • [7] Bickel, P. J. and Ritov, J. (1991). Large sample theory of estimation in biased sampling regression models. I. Ann. Statist. 19 797–816.
  • [8] Bickel, P. J. and Ritov, Y. (1994). Efficient estimation using both direct and indirect observations. Theory Probab. Appl. 38 194–213.
  • [9] Blum, J. R. and Susarla, V. (1980). Maximal deviation theory of density and failure rate function estimates based on censored data. In Multivariate Analysis, V (Proc. Fifth Internat. Sympos., Univ. Pittsburgh, Pittsburgh, PA, 1978) 213–222. North-Holland, Amsterdam.
  • [10] Burke, M. D., Csörgő, S. and Horváth, L. (1981). Strong approximations of some biometric estimates under random censorship. Probab. Theory Related Fields 56 87–112.
  • [11] Burke, M. D., Csörgő, S. and Horváth, L. (1988). A correction to and improvement of: “Strong approximations of some biometric estimates under random censorship.” Probab. Theory Related Fields 79 51–57.
  • [12] Carroll, R. J., van Rooij, A. C. M. and Ruymgaart, F. H. (1991). Theoretical aspects of ill-posed problems in statistics. Acta Appl. Math. 24 113–140.
  • [13] Cox, D. R. (1969). Some sampling problems in technology. In New Developments in Survey Sampling (N. L. Johnson and H. Smith, eds.). Wiley, New York.
  • [14] Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data. Chapman and Hall, London.
  • [15] Csörgő, M. and Révész, P. (1981). Strong Approximations in Probability and Statistics. Academic Press, New York.
  • [16] Csörgő, S. and Hall, P. (1984). The Komlós–Major–Tusnády approximations and their applications. Austral. J. Statist. 26 189–218.
  • [17] del Barrio, E., Deheuvels, P. and van de Geer, S. (2007). Lectures on Empirical Processes: Theory and Statistical Applications. Eur. Math. Soc., Zürich.
  • [18] Fisher, R. A. (1934). The effect of methods of ascertainment upon the estimation of frequencies. Annals of Eugenics 6 13–25.
  • [19] Földes, A., Rejtő, L. and Winter, B. B. (1981). Strong consistency properties of nonparametric estimators for randomly censored data. II. Estimation of density and failure rate. Period. Math. Hungar. 12 15–29.
  • [20] Gilbert, P. B. (2000). Large sample theory of maximum likelihood estimates in semiparametric biased sampling models. Ann. Statist. 28 151–194.
  • [21] Gill, R. D., Vardi, Y. and Wellner, J. A. (1988). Large sample theory of empirical distributions in biased sampling models. Ann. Statist. 16 1069–1112.
  • [22] Grenander, U. (1956). On the theory of mortality measurement. II. Skand. Aktuarietidskr. 39 125–153.
  • [23] Groeneboom, P. (1985). Estimating a monotone density. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, CA, 1983) 539–555. Wadsworth, Belmont, CA.
  • [24] Hall, P. (1982). Limit theorems for stochastic measures of the accuracy of density estimators. Stochastic Process. Appl. 13 11–25.
  • [25] Hasminskii, R. Z. and Ibragimov, I. A. (1983). On asymptotic efficiency in the presence of an infinite-dimensional nuisance parameter. In Probability Theory and Mathematical Statistics (Tbilisi, 1982). Lecture Notes in Math. 1021 195–229. Springer, Berlin.
  • [26] Huang, J. and Wellner, J. A. (1995). Estimation of a monotone density or monotone hazard under random censoring. Scand. J. Stat. 22 3–33.
  • [27] Komlós, J., Major, P. and Tusnády, G. (1975). An approximation of partial sums of independent RV’s and the sample DF. I. Z. Wahrsch. Verw. Gebiete 32 111–131.
  • [28] Kvam, P. (2008). Length bias in the measurements of carbon nanotubes. Technometrics 50 462–467.
  • [29] Marron, J. S. and Padgett, W. J. (1987). Asymptotically optimal bandwidth selection for kernel density estimators from randomly right-censored samples. Ann. Statist. 15 1520–1535.
  • [30] Mason, D. M. (2007). Some observations on the KMT dyadic scheme. J. Statist. Plann. Inference 137 895–906.
  • [31] Mason, D. M., Shorack, G. R. and Wellner, J. A. (1983). Strong limit theorems for oscillation moduli of the uniform empirical process. Probab. Theory Related Fields 65 83–97.
  • [32] Mielniczuk, J. (1986). Some asymptotic properties of kernel estimators of a density function in case of censored data. Ann. Statist. 14 766–773.
  • [33] Nadaraja, È. A. (1965). On non-parametric estimates of density functions and regression curves. Theory Probab. Appl. 10 199–203.
  • [34] Neyman, J. (1955). Statistics: Servant of all science. Science 122 401–406.
  • [35] Nussbaum, M. (1996). Asymptotic equivalence of density estimation and Gaussian white noise. Ann. Statist. 24 2399–2430.
  • [36] Parthasarathy, K. R. (2005). Probability Measures on Metric Spaces. AMS Chelsea Publishing, Providence, RI.
  • [37] Révész, P. (1972). On empirical density function. Period. Math. Hungar. 2 85–110.
  • [38] Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 27 832–837.
  • [39] Schuster, E. F. (1969). Estimation of a probability density function and its derivatives. Ann. Math. Statist. 40 1187–1195.
  • [40] Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, New York.
  • [41] Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
  • [42] Silverman, B. W. (1978). Weak and strong uniform consistency of the kernel estimate of a density and its derivatives. Ann. Statist. 6 177–184.
  • [43] Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
  • [44] Steele, J. M. (1978). Invalidity of average squared error criterion in density estimation. Canad. J. Statist. 6 193–200.
  • [45] van der Vaart, A. (1994). Maximum likelihood estimation with partially censored data. Ann. Statist. 22 1896–1916.
  • [46] Van der Vaart, A. W. (2000). Asymptotic Statistics. Cambridge Univ. Press, New York.
  • [47] Van Ryzin, J. (1969). On strong consistency of density estimates. Ann. Math. Statist. 40 1765–1772.
  • [48] Vardi, Y. (1982). Nonparametric estimation in the presence of length bias. Ann. Statist. 10 616–620.
  • [49] Vardi, Y. (1985). Empirical distributions in selection bias models. Ann. Statist. 13 178–205.
  • [50] Vardi, Y. (1989). Multiplicative censoring, renewal processes, deconvolution and decreasing density: Nonparametric estimation. Biometrika 76 751–761.
  • [51] Vardi, Y. and Zhang, C.-H. (1992). Large sample study of empirical distributions in a random-multiplicative censoring model. Ann. Statist. 20 1022–1039.
  • [52] Wicksell, S. D. (1925). The corpuscle problem. Biometrika 17 84–99.
  • [53] Wolfson, C., Wolfson, D. B., Asgharian, M., M’Lan, C. E., Ostbye, T., Rockwood, K., Hogan, D. B. et al. (2001). A reevaluation of the duration of survival after the onset of dementia. New England Journal of Medicine 344 1111.
  • [54] Zeidler, E. (1985). Nonlinear Functional Analysis and Its Applications, Part I: Fixed-Point Theorems. Springer, New York.
  • [55] Zelen, M. and Feinleib, M. (1969). On the theory of screening for chronic diseases. Biometrika 56 601–614.
  • [56] Zhang, B. (1998). A note on the integrated square errors of kernel density estimators under random censorship. Stochastic Process. Appl. 75 225–234.
