Open Access
Prediction by quantization of a conditional distribution
Jean-Michel Loubes, Bruno Pelletier
Electron. J. Statist. 11(1): 2679-2706 (2017). DOI: 10.1214/17-EJS1296

Abstract

Given a pair of random vectors $(X,Y)$, we consider the problem of approximating $Y$ by $\mathbf{c}(X)=\{\mathbf{c}_{1}(X),\dots ,\mathbf{c}_{M}(X)\}$, where $\mathbf{c}$ is a measurable set-valued function. We give meaning to the approximation by using the principles of vector quantization, which leads to the definition of a multifunction regression problem. The formulated problem amounts to quantizing the conditional distributions of $Y$ given $X$. We propose a nonparametric estimate of the solutions of the multifunction regression problem by combining the method of $M$-means clustering with the nonparametric smoothing technique of $k$-nearest neighbors. We provide an asymptotic analysis of the estimate and derive a convergence rate for its excess risk. The proposed methodology is illustrated on simulated examples and on a speed-flow traffic data set from road traffic forecasting.
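
To make the combination of $M$-means clustering and $k$-nearest neighbors concrete, the following is a minimal sketch of one plausible reading of the abstract, not the authors' exact estimator: for a query point $x$, keep the responses of the $k$ sample points nearest to $x$, then run plain Lloyd iterations on those responses to obtain $M$ centers. The function names (`knn_indices`, `m_means`, `predict_codebook`), the random initialization, the fixed iteration count, and the simulated bimodal data are all assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, x, k):
    """Indices of the k rows of X closest to x in Euclidean distance."""
    dists = np.linalg.norm(X - x, axis=1)
    return np.argsort(dists)[:k]

def m_means(Y, M, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns M centers fitted to the rows of Y."""
    rng = np.random.default_rng(seed)
    # Hypothetical initialization: M distinct sample points chosen at random.
    centers = Y[rng.choice(len(Y), size=M, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each response to its nearest center.
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its cell; leave empty cells as-is.
        for j in range(M):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return centers

def predict_codebook(X, Y, x, k, M):
    """Quantize the empirical distribution of Y among the k neighbors of x."""
    idx = knn_indices(X, x, k)
    return m_means(Y[idx], M)

# Simulated example (an assumption, not data from the paper): given X,
# the response Y is bimodal with modes at -(1 + X) and +(1 + X).
rng = np.random.default_rng(1)
n = 2000
X = rng.uniform(0.0, 1.0, size=(n, 1))
signs = rng.choice([-1.0, 1.0], size=n)
Y = (signs * (1.0 + X[:, 0]) + 0.05 * rng.standard_normal(n)).reshape(-1, 1)

# At x = 0.5 the two conditional modes sit near -1.5 and +1.5 by construction.
codebook = predict_codebook(X, Y, x=np.array([0.5]), k=100, M=2)
print(np.sort(codebook.ravel()))
```

In this toy setting a single regression function would average the two modes to a value near zero, which is far from every plausible response; a set of $M=2$ centers can track both modes, which is the kind of situation a set-valued predictor is meant to handle.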

Citation

Jean-Michel Loubes, Bruno Pelletier. "Prediction by quantization of a conditional distribution." Electron. J. Statist. 11 (1): 2679-2706, 2017. https://doi.org/10.1214/17-EJS1296

Information

Received: 1 February 2017; Published: 2017
First available in Project Euclid: 27 June 2017

zbMATH: 1366.62098
MathSciNet: MR3679906
Digital Object Identifier: 10.1214/17-EJS1296

Subjects:
Primary: 62G20
Secondary: 62G08

Keywords: $k$-means, clustering, multifunction, nonparametric statistics, regression analysis, set-valued function, vector quantization