Electronic Journal of Statistics

Nonparametric modal regression in the presence of measurement error

Haiming Zhou and Xianzheng Huang

Full-text: Open access

Abstract

In the context of regressing a response $Y$ on a predictor $X$, we consider estimating the local modes of the distribution of $Y$ given $X=x$ when $X$ is prone to measurement error. We propose two nonparametric estimation methods, with one based on estimating the joint density of $(X,Y)$ in the presence of measurement error, and the other built upon estimating the conditional density of $Y$ given $X=x$ using error-prone data. We study the asymptotic properties of each proposed mode estimator, and provide implementation details including the mean-shift algorithm for mode seeking and bandwidth selection. Numerical studies are presented to compare the proposed methods with an existing mode estimation method developed for error-free data naively applied to error-prone data.
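To make the mode-seeking step concrete, here is a minimal sketch of a partial mean-shift iteration for conditional modes. It is an illustration under stated assumptions, not the authors' estimator: it implements only the naive, error-free version, in which $X$ is treated as observed exactly and Gaussian kernels are used in both coordinates. The estimators proposed in the paper account for measurement error (the keywords mention a deconvoluting kernel), and that correction is not attempted here. The function names, bandwidth values, and toy data are all hypothetical.

```python
import math


def gaussian(u):
    """Unnormalised Gaussian kernel; constants cancel in the mean-shift ratio."""
    return math.exp(-0.5 * u * u)


def conditional_mode(x, y_start, xs, ys, h_x, h_y, tol=1e-8, max_iter=500):
    """Partial mean-shift: ascend the kernel joint-density estimate in y, x held fixed."""
    y = y_start
    for _ in range(max_iter):
        # Product-kernel weight of each observation at the current point (x, y)
        weights = [gaussian((x - xi) / h_x) * gaussian((y - yi) / h_y)
                   for xi, yi in zip(xs, ys)]
        total = sum(weights)
        if total == 0.0:  # weights underflowed: no data near (x, y)
            break
        y_new = sum(w * yi for w, yi in zip(weights, ys)) / total
        if abs(y_new - y) < tol:
            return y_new
        y = y_new
    return y


def local_modes(x, xs, ys, h_x, h_y, merge_tol=1e-3):
    """Start mean-shift from every observed response and merge coincident limits."""
    modes = []
    for y0 in ys:
        m = conditional_mode(x, y0, xs, ys, h_x, h_y)
        if all(abs(m - other) > merge_tol for other in modes):
            modes.append(m)
    return sorted(modes)


# Hypothetical noiseless toy data: two regression branches, y = x + 2 and y = x - 2,
# observed on a grid of x values in [0, 1].
grid = [i / 100 for i in range(101)]
xs = grid + grid
ys = [x + 2.0 for x in grid] + [x - 2.0 for x in grid]

# Hypothetical bandwidths; in practice these would come from a selection procedure.
modes = local_modes(0.5, xs, ys, h_x=0.1, h_y=0.3)
```

With this toy design, the limits at $x=0.5$ cluster around the two branch values, $-1.5$ and $2.5$. A conditional mean would instead average across the branches, which is the usual motivation for estimating local modes.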

Article information

Source
Electron. J. Statist., Volume 10, Number 2 (2016), 3579-3620.

Dates
Received: May 2016
First available in Project Euclid: 24 November 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1479956457

Digital Object Identifier
doi:10.1214/16-EJS1210

Mathematical Reviews number (MathSciNet)
MR3575565

Zentralblatt MATH identifier
1357.62185

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62G20: Asymptotic properties

Keywords
Bandwidth selection; deconvoluting kernel; Fourier transform; local mode; mean-shift algorithm

Citation

Zhou, Haiming; Huang, Xianzheng. Nonparametric modal regression in the presence of measurement error. Electron. J. Statist. 10 (2016), no. 2, 3579–3620. doi:10.1214/16-EJS1210. https://projecteuclid.org/euclid.ejs/1479956457


References

  • Bamford, S.P., Rojas, A.L., Genovese, C.R., Miller, C.E., Nichol, R., and Wasserman, L. (2008) Revealing components of the galaxy population through nonparametric techniques. Mon. Not. R. Astron. Soc., 391, 607–616.
  • Carroll, R. and Hall, P. (1988) Optimal rates of convergence for deconvoluting a density. J. Am. Statist. Ass., 83, 1184–1186.
  • Carroll, R., Ruppert, D., Stefanski, L.A., and Crainiceanu, C.M. (2006) Measurement error in nonlinear models: A modern perspective. Second edition. Chapman & Hall/CRC, Boca Raton, FL.
  • Chacón, J.E., Duong, T., and Wand, M.P. (2011) Asymptotics for general multivariate kernel density derivative estimators. Stat. Sinica, 21, 807–840.
  • Chen, Y., Genovese, C.R., Tibshirani, R.J., and Wasserman, L. (2015) Asymptotic theory for density ridges. Ann. Statist., 43, 1896–1928.
  • Chen, Y., Genovese, C.R., Tibshirani, R.J., and Wasserman, L. (2016) Nonparametric modal regression. Ann. Statist., 44, 489–514.
  • Cheng, Y. (1995) Mean shift, mode seeking, and clustering. IEEE T. Pattern Anal., 17, 790–799.
  • Comaniciu, D. and Meer, P. (2002) Mean shift: a robust approach toward feature space analysis. IEEE T. Pattern Anal., 24, 603–619.
  • Cook, J.R. and Stefanski, L.A. (1994) Simulation-extrapolation estimation in parametric measurement error models. J. Am. Statist. Ass., 89, 1314–1328.
  • Delaigle, A., Fan, J., and Carroll, R. (2009) A design-adaptive local polynomial estimator for the errors-in-variables problem. J. Am. Statist. Ass., 104, 348–359.
  • Delaigle, A. and Hall, P. (2008) Using SIMEX for smoothing-parameter choice in errors-in-variables problems. J. Am. Statist. Ass., 103, 280–287.
  • Delaigle, A., Hall, P., and Meister, A. (2008) On deconvolution with repeated measurements. Ann. Statist., 36, 665–687.
  • Einmahl, U. and Mason, D.M. (2005) Uniform in bandwidth consistency of kernel-type function estimators. Ann. Statist., 33, 1380–1403.
  • Einbeck, J. and Tutz, G. (2006) Modelling beyond regression functions: an application of multimodal regression to speed-flow data. J. R. Statist. Soc. C, 55, 461–475.
  • Fan, J. (1991) Asymptotic normality for deconvolution kernel density estimators. Sankhya A, 53, 97–110.
  • Fan, J. (1991) Global behavior of deconvolution kernel estimates. Stat. Sinica, 1, 541–551.
  • Fan, J. (1991) On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist., 19, 1257–1272.
  • Fan, J. and Gijbels, I. (1996) Local polynomial modelling and its applications. Chapman and Hall/CRC, Boca Raton.
  • Fan, J. and Truong, Y.K. (1993) Nonparametric regression with errors in variables. Ann. Statist., 21, 1900–1925.
  • Fan, J., Yao, Q., and Tong, H. (1996) Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems. Biometrika, 83, 189–206.
  • Fan, J. and Yim, T.H. (2004) A cross validation method for estimating conditional densities. Biometrika, 91, 819–834.
  • Genovese, C.R., Perone-Pacifico, M., Verdinelli, I., and Wasserman, L. (2014) Nonparametric ridge estimation. Ann. Statist., 42, 1511–1545.
  • Giné, E. and Guillou, A. (2002) Rates of strong uniform consistency for multivariate kernel density estimators. Ann. Inst. H. Poincaré Probab. Statist., 38, 907–921.
  • Hall, P., Racine, J., and Li, Q. (2004) Cross-validation and the estimation of conditional probability densities. J. Am. Statist. Ass., 99, 1015–1026.
  • Hastie, T. and Stuetzle, W. (1989) Principal curves. J. Am. Statist. Ass., 84, 502–516.
  • He, X. and Liang, H. (2000) Quantile regression estimates for a class of linear and partially linear errors-in-variables models. Stat. Sinica, 10, 129–140.
  • Huang, M., Li, R., and Wang, S. (2013) Nonparametric mixture of regression models. J. Am. Statist. Ass., 108, 929–941.
  • Hyndman, R.J., Bashtannyk, D.M., and Grunwald, G.K. (1996) Estimating and visualizing conditional densities. J. Comput. Graph. Stat., 5, 315–336.
  • Koenker, R. (2005) Quantile regression. Cambridge University Press.
  • Liang, H. and Wang, N. (2005) Partially linear single-index measurement error models. Stat. Sinica, 15, 99–116.
  • Ma, Y. and Yin, G. (2011) Censored quantile regression with covariate measurement errors. Stat. Sinica, 21, 949–971.
  • Nakamura, T. (1990) Corrected score functions for errors-in-variables models: methodology and applications to generalized linear models. Biometrika, 77, 127–137.
  • Novick, S.J. and Stefanski, L.A. (2002) Corrected score estimation via complex variable simulation extrapolation. J. Am. Statist. Ass., 97, 472–481.
  • Ozertem, U. and Erdogmus, D. (2011) Locally defined principal curves and surfaces. J. Mach. Learn. Res., 12, 1249–1286.
  • Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall.
  • Stefanski, L.A. and Carroll, R.J. (1990) Deconvoluting kernel density estimators. Statistics, 21, 169–184.
  • van der Vaart, A.W. and Wellner, J.A. (1996) Weak convergence and empirical processes: with applications to statistics. Springer, New York.
  • Wand, M.P. and Jones, M.C. (1993) Comparison of smoothing parameterizations in bivariate kernel density estimation. J. Am. Statist. Ass., 88, 520–528.
  • Wang, H.J., Stefanski, L.A., and Zhu, Z. (2012) Corrected-loss estimation for quantile regression with covariate measurement errors. Biometrika, 99, 405–421.
  • Wei, Y. and Carroll, R.J. (2009) Quantile regression with measurement error. J. Am. Statist. Ass., 104, 1129–1143.
  • Yao, W. and Li, L. (2014) A new regression model: modal linear regression. Scand. J. Stat., 41, 656–671.
  • Yao, W., Lindsay, B., and Li, R. (2012) Local modal regression. J. Nonparametr. Stat., 24, 647–663.