Electronic Journal of Statistics

Conditional density estimation with covariate measurement error

Xianzheng Huang and Haiming Zhou

Full-text: Open access

Abstract

We consider estimating the density of a response conditional on an error-prone covariate. Motivated by two existing kernel density estimators developed for data without covariate measurement error, we propose a method that corrects these estimators for measurement error. We derive asymptotic properties of the resulting estimators under different types of measurement error distributions. Moreover, we obtain bandwidths for the new estimators by adjusting bandwidths readily available from existing bandwidth selection methods developed for error-free data. Extensive simulation studies compare the proposed estimators with naive estimators that ignore measurement error, and also provide empirical evidence for the effectiveness of the proposed bandwidth selection methods. A real-life data example illustrates the implementation of these methods under practical scenarios. An R package, lpme, is developed for implementing all considered methods, and we demonstrate its use via an R code example in Appendix B.2.
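
To make the idea of correcting a kernel estimator for measurement error concrete, the following is a minimal Python sketch, not the lpme API and not necessarily the exact estimators studied in the paper. It assumes a Gaussian kernel, Laplace measurement error with known scale b, and a ratio-type (double-kernel) conditional density estimator in which the covariate kernel is replaced by the standard deconvoluting kernel K_U(u) = K(u) - (b/h)^2 K''(u). All function names, the simulated model, and the bandwidths are hypothetical choices made for illustration.

```python
import math
import random


def gauss_kernel(u):
    """Standard normal density, used as the base kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def deconv_kernel_laplace(u, h, b):
    """Deconvoluting kernel for Laplace(0, b) error and Gaussian K.

    K_U(u) = K(u) - (b/h)^2 K''(u) = K(u) * (1 + (b/h)^2 * (1 - u^2)).
    With b = 0 this reduces to the ordinary Gaussian kernel.
    """
    return gauss_kernel(u) * (1.0 + (b / h) ** 2 * (1.0 - u * u))


def cond_density(x, y, w, ys, h1, h2, b=0.0):
    """Ratio-type kernel estimate of f(y | x) from pairs (w_i, y_i).

    w_i = x_i + u_i is the error-prone covariate. b = 0 gives the naive
    estimator that ignores measurement error; b > 0 applies the
    deconvoluting kernel in the covariate direction (assumed setup).
    """
    num = 0.0
    den = 0.0
    for wi, yi in zip(w, ys):
        kx = deconv_kernel_laplace((x - wi) / h1, h1, b)
        num += kx * gauss_kernel((y - yi) / h2) / h2
        den += kx
    return num / den


def simulate(n, b, seed=0):
    """Hypothetical model: X ~ N(0, 1), Y = X + N(0, 0.5^2), W = X + U."""
    rng = random.Random(seed)
    w, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Difference of two iid exponentials with mean b is Laplace(0, b).
        u = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
        w.append(x + u)
        ys.append(x + rng.gauss(0.0, 0.5))
    return w, ys


if __name__ == "__main__":
    w, ys = simulate(500, b=0.3, seed=1)
    naive = cond_density(0.0, 0.0, w, ys, h1=0.5, h2=0.3)
    corrected = cond_density(0.0, 0.0, w, ys, h1=0.5, h2=0.3, b=0.3)
    print("naive:", naive, "corrected:", corrected)
```

Calling `cond_density` with `b=0.0` gives the naive estimate, and passing the error scale gives the corrected one. Because the deconvoluting kernel takes negative values, the corrected denominator can be close to zero or negative in sparse regions, so a practical implementation would need a safeguard there as well as a data-driven bandwidth choice.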

Article information

Source
Electron. J. Statist., Volume 14, Number 1 (2020), 970-1023.

Dates
Received: June 2019
First available in Project Euclid: 20 February 2020

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1582167984

Digital Object Identifier
doi:10.1214/20-EJS1688

Mathematical Reviews number (MathSciNet)
MR4066544

Zentralblatt MATH identifier
07200223

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62G20: Asymptotic properties

Keywords
Bandwidth, bias, cross validation, deconvoluting kernel

Rights
Creative Commons Attribution 4.0 International License.

Citation

Huang, Xianzheng; Zhou, Haiming. Conditional density estimation with covariate measurement error. Electron. J. Statist. 14 (2020), no. 1, 970–1023. doi:10.1214/20-EJS1688. https://projecteuclid.org/euclid.ejs/1582167984

