The Annals of Statistics

Nonparametric modal regression

Yen-Chi Chen, Christopher R. Genovese, Ryan J. Tibshirani, and Larry Wasserman


Abstract

Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method, and propose techniques for constructing confidence sets and prediction sets. The latter is used to select the smoothing bandwidth of the underlying KDE. The idea behind modal regression is connected to many others, such as mixture regression and density ridge estimation, and we discuss these ties as well.
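The method described in the abstract, estimating the local modes of $Y$ given $X=x$ from a kernel density estimate, can be realized with a "partial" mean-shift iteration: the $x$-coordinate of the query point is held fixed while the $y$-coordinate ascends the conditional density. The sketch below is illustrative, not the authors' implementation; the function name, the fixed hand-picked bandwidth `h` (the paper instead selects it via prediction sets), and the simple mode-deduplication rule are all assumptions made for the example.

```python
import numpy as np

def modal_regression(X, Y, x_query, h, n_starts=20, n_iter=200, tol=1e-8):
    """Estimate the conditional modes of Y | X = x_query via partial mean-shift
    on a Gaussian KDE of the joint distribution of (X, Y).

    Illustrative sketch only: bandwidth h is shared by both coordinates, and
    starting points are drawn from the observed Y values.
    """
    rng = np.random.default_rng(0)
    # x-kernel weights are fixed throughout the iteration, since x is held at x_query
    wx = np.exp(-0.5 * ((x_query - X) / h) ** 2)
    starts = rng.choice(Y, size=min(n_starts, len(Y)), replace=False)
    modes = []
    for y in starts:
        for _ in range(n_iter):
            # mean-shift update in the y-coordinate only
            wy = np.exp(-0.5 * ((y - Y) / h) ** 2)
            w = wx * wy
            y_new = np.sum(w * Y) / np.sum(w)
            if abs(y_new - y) < tol:
                y = y_new
                break
            y = y_new
        modes.append(y)
    # merge starts that converged to (numerically) the same mode
    modes = np.sort(np.array(modes))
    merged = [modes[0]]
    for m in modes[1:]:
        if m - merged[-1] > h / 2:
            merged.append(m)
    return np.array(merged)
```

On data where the conditional distribution of $Y$ given $X=x$ is bimodal, the iteration started from several points recovers both conditional modes, structure that a conditional-mean regression would average away into a single curve.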

Article information

Source
Ann. Statist. Volume 44, Number 2 (2016), 489–514.

Dates
Received: December 2014
Revised: August 2015
First available in Project Euclid: 17 March 2016

Permanent link to this document
http://projecteuclid.org/euclid.aos/1458245725

Digital Object Identifier
doi:10.1214/15-AOS1373

Mathematical Reviews number (MathSciNet)
MR3476607

Zentralblatt MATH identifier
1338.62113

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62G20: Asymptotic properties; 62G05: Estimation

Keywords
Nonparametric regression; modes; mixture model; confidence set; prediction set; bootstrap

Citation

Chen, Yen-Chi; Genovese, Christopher R.; Tibshirani, Ryan J.; Wasserman, Larry. Nonparametric modal regression. Ann. Statist. 44 (2016), no. 2, 489--514. doi:10.1214/15-AOS1373. http://projecteuclid.org/euclid.aos/1458245725.

References

  • Arias-Castro, E., Mason, D. and Pelletier, B. (2013). On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm. Unpublished Manuscript.
  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer, New York.
  • Carreira-Perpiñán, M. Á. (2007). Gaussian mean-shift is an EM algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 29 767–776.
  • Chaganty, A. T. and Liang, P. (2013). Spectral experts for estimating mixtures of linear regressions. In Proceedings of the 30th International Conference on Machine Learning (ICML-13) 1040–1048. ACM, New York.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2014a). Enhanced mode clustering. Available at arXiv:1406.1780.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2014b). Generalized mode and ridge estimation. Available at arXiv:1406.1803.
  • Chen, Y.-C., Genovese, C. R. and Wasserman, L. (2015). Asymptotic theory for density ridges. Ann. Statist. 43 1896–1928.
  • Chen, Y.-C., Genovese, C. R., Tibshirani, R. J. and Wasserman, L. (2015). Supplement to “Nonparametric modal regression.” DOI:10.1214/15-AOS1373SUPP.
  • Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 17 790–799.
  • Chernozhukov, V., Chetverikov, D. and Kato, K. (2014a). Anti-concentration and honest, adaptive confidence bands. Ann. Statist. 42 1787–1818.
  • Chernozhukov, V., Chetverikov, D. and Kato, K. (2014b). Gaussian approximation of suprema of empirical processes. Ann. Statist. 42 1564–1597.
  • Comaniciu, D. and Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24 603–619.
  • Eberly, D. (1996). Ridges in Image and Data Analysis. Springer, Berlin.
  • Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Statist. 7 1–26.
  • Einbeck, J. and Tutz, G. (2006). Modelling beyond regression functions: An application of multimodal regression to speed-flow data. J. Roy. Statist. Soc. Ser. C 55 461–475.
  • Einmahl, U. and Mason, D. M. (2005). Uniform in bandwidth consistency of kernel-type function estimators. Ann. Statist. 33 1380–1403.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2014). Nonparametric ridge estimation. Ann. Statist. 42 1511–1545.
  • Giné, E. and Guillou, A. (2002). Rates of strong uniform consistency for multivariate kernel density estimators. Ann. Inst. Henri Poincaré Probab. Stat. 38 907–921.
  • Huang, M., Li, R. and Wang, S. (2013). Nonparametric mixture of regression models. J. Amer. Statist. Assoc. 108 929–941.
  • Huang, M. and Yao, W. (2012). Mixture of regression models with varying mixing proportions: A semiparametric approach. J. Amer. Statist. Assoc. 107 711–724.
  • Hunter, D. R. and Young, D. S. (2012). Semiparametric mixtures of regressions. J. Nonparametr. Stat. 24 19–38.
  • Hyndman, R. J., Bashtannyk, D. M. and Grunwald, G. K. (1996). Estimating and visualizing conditional densities. J. Comput. Graph. Statist. 5 315–336.
  • Jacobs, R. A., Jordan, M. I., Nowlan, S. J. and Hinton, G. E. (1991). Adaptive mixtures of local experts. Neural Comput. 3 79–87.
  • Jiang, W. and Tanner, M. A. (1999). Hierarchical mixtures-of-experts for exponential family regression models: Approximation and maximum likelihood estimation. Ann. Statist. 27 987–1011.
  • Khalili, A. and Chen, J. (2007). Variable selection in finite mixture of regression models. J. Amer. Statist. Assoc. 102 1025–1038.
  • Lee, M.-j. (1989). Mode regression. J. Econometrics 42 337–349.
  • Li, J., Ray, S. and Lindsay, B. G. (2007). A nonparametric statistical approach to clustering via mode identification. J. Mach. Learn. Res. 8 1687–1723.
  • Rojas, A. (2005). Nonparametric mixture regression. Ph.D. thesis, Carnegie Mellon Univ., Pittsburgh, PA.
  • Romano, J. P. (1988). On weak convergence and optimality of kernel density estimates of the mode. Ann. Statist. 16 629–647.
  • Sager, T. W. and Thisted, R. A. (1982). Maximum likelihood estimation of isotonic modal regression. Ann. Statist. 10 690–707.
  • Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, New York.
  • Viele, K. and Tong, B. (2002). Modeling with mixtures of linear regressions. Stat. Comput. 12 315–330.
  • Yao, W. (2013). A note on EM algorithm for mixture models. Statist. Probab. Lett. 83 519–526.
  • Yao, W. and Li, L. (2014). A new regression model: Modal linear regression. Scand. J. Stat. 41 656–671.
  • Yao, W. and Lindsay, B. G. (2009). Bayesian mixture labeling by highest posterior density. J. Amer. Statist. Assoc. 104 758–767.
  • Yao, W., Lindsay, B. G. and Li, R. (2012). Local modal regression. J. Nonparametr. Stat. 24 647–663.

Supplemental materials

  • Supplementary Proofs: Nonparametric modal regression. This document contains all proofs to the theorems and lemmas in this paper.