## The Annals of Statistics

### Vector quantile regression: An optimal transport approach

#### Abstract

We propose a notion of conditional vector quantile function and a vector quantile regression. A conditional vector quantile function (CVQF) of a random vector $Y$, taking values in $\mathbb{R}^{d}$, given covariates $Z=z$, taking values in $\mathbb{R}^{k}$, is a map $u\longmapsto Q_{Y|Z}(u,z)$ that is monotone, in the sense of being the gradient of a convex function, and such that, when a vector $U$ follows a reference non-atomic distribution $F_{U}$ (for instance, the uniform distribution on the unit cube in $\mathbb{R}^{d}$), the random vector $Q_{Y|Z}(U,z)$ has the distribution of $Y$ conditional on $Z=z$. Moreover, we have a strong representation, $Y=Q_{Y|Z}(U,Z)$ almost surely, for some version of $U$. The vector quantile regression (VQR) is a linear model for the CVQF of $Y$ given $Z$. Under correct specification, the notion produces a strong representation, $Y=\beta(U)^{\top}f(Z)$, where $f(Z)$ denotes a known set of transformations of $Z$, $u\longmapsto\beta(u)^{\top}f(Z)$ is a monotone map (the gradient of a convex function), and the quantile regression coefficients $u\longmapsto\beta(u)$ have interpretations analogous to those of standard scalar quantile regression. As $f(Z)$ becomes a richer class of transformations of $Z$, the model becomes nonparametric, as in series modelling. A key property of VQR is that it embeds the classical Monge–Kantorovich optimal transportation problem at its core as a special case. In the classical case, where $Y$ is scalar, VQR reduces to a version of the classical QR, and CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.
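The Monge–Kantorovich special case mentioned in the abstract can be illustrated with a small discrete sketch. It is not the authors' VQR estimator: it drops the covariates $Z$ entirely, replaces $F_{U}$ and the law of $Y$ by equally weighted point clouds, and searches all one-to-one matchings by brute force. The function name `empirical_vector_quantile` and the sample values are hypothetical, chosen only for illustration. Maximizing $\sum_i \langle u_i, y_{\sigma(i)}\rangle$ over permutations $\sigma$ is equivalent to minimizing the quadratic transport cost $\sum_i |u_i - y_{\sigma(i)}|^2$, since the two differ by a constant and a factor of $-2$.

```python
from itertools import permutations


def empirical_vector_quantile(us, ys):
    """Match reference points us[i] to sample points ys[j] one-to-one so that
    sum_i <us[i], ys[perm[i]]> is maximal (a discrete Monge-Kantorovich problem).

    Brute force over all permutations: only sensible for a handful of points.
    Returns the list q with q[i] = sample point assigned to us[i].
    """
    n = len(us)
    if len(ys) != n:
        raise ValueError("need equally many reference and sample points")

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    best = max(
        permutations(range(n)),
        key=lambda perm: sum(dot(us[i], ys[perm[i]]) for i in range(n)),
    )
    return [ys[j] for j in best]


# Scalar case (d = 1): reference points on an increasing grid in (0, 1).
us_1d = [((i + 0.5) / 5,) for i in range(5)]
ys_1d = [(2.3,), (-1.0,), (0.4,), (5.1,), (1.7,)]
q_1d = empirical_vector_quantile(us_1d, ys_1d)

# Bivariate case (d = 2): four reference points in the unit square.
us_2d = [(0.2, 0.2), (0.2, 0.8), (0.8, 0.2), (0.8, 0.8)]
ys_2d = [(3.0, 1.0), (0.0, 0.0), (1.0, 4.0), (4.0, 5.0)]
q_2d = empirical_vector_quantile(us_2d, ys_2d)
```

In the scalar example the optimal matching sends the increasing grid of reference points to the sample sorted in increasing order, which is the usual empirical quantile function; this mirrors the abstract's statement that the scalar case reduces to the classical notion. For realistic sample sizes the brute-force search would have to be replaced by a linear-programming or assignment solver.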

#### Article information

Source
Ann. Statist. Volume 44, Number 3 (2016), 1165-1192.

Dates
Revised: September 2015
First available in Project Euclid: 11 April 2016

Permanent link to this document
http://projecteuclid.org/euclid.aos/1460381690

Digital Object Identifier
doi:10.1214/15-AOS1401

Mathematical Reviews number (MathSciNet)
MR3485957

Zentralblatt MATH identifier
06590312

#### Citation

Carlier, Guillaume; Chernozhukov, Victor; Galichon, Alfred. Vector quantile regression: An optimal transport approach. Ann. Statist. 44 (2016), no. 3, 1165–1192. doi:10.1214/15-AOS1401. http://projecteuclid.org/euclid.aos/1460381690.

#### References

• [1] Belloni, A. and Winkler, R. L. (2011). On multivariate quantiles under partial orders. Ann. Statist. 39 1125–1179.
• [2] Brenier, Y. (1991). Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math. 44 375–417.
• [3] Carlier, G., Chernozhukov, V. and Galichon, A. (2015). Supplement to “Vector quantile regression: An optimal transport approach.” DOI:10.1214/15-AOS1401SUPP.
• [4] Chaudhuri, P. (1996). On a geometric notion of quantiles for multivariate data. J. Amer. Statist. Assoc. 91 862–872.
• [5] Chernozhukov, V., Fernández-Val, I. and Galichon, A. (2010). Quantile and probability curves without crossing. Econometrica 78 1093–1125.
• [6] Cunha, F., Heckman, J. and Schennach, S. (2010). Estimating the technology of cognitive and noncognitive skill formation. Econometrica 78 883–931.
• [7] Doksum, K. (1974). Empirical probability plots and statistical inference for nonlinear models in the two-sample case. Ann. Statist. 2 267–277.
• [8] Ducpetiaux, E. (1855). Budgets économiques des Classes Ouvrières en Belgique, Bruxelles.
• [9] Dudley, R. M. and Philipp, W. (1983). Invariance principles for sums of Banach space valued random elements and empirical processes. Z. Wahrsch. Verw. Gebiete 62 509–552.
• [10] Ekeland, I., Galichon, A. and Henry, M. (2012). Comonotonic measures of multivariate risks. Math. Finance 22 109–132.
• [11] Ekeland, I. and Témam, R. (1999). Convex Analysis and Variational Problems, English ed. Classics in Applied Mathematics 28. SIAM, Philadelphia, PA.
• [12] Engel, E. (1857). Die Produktions- und Konsumptionsverhältnisse des Königreichs Sachsen. Zeitschrift des Statistischen Bureaus des Königlich Sächsischen Ministeriums des Innern 8 1–54.
• [13] Hallin, M., Paindaveine, D. and Šiman, M. (2010). Multivariate quantiles and multiple-output regression quantiles: From $L_{1}$ optimization to halfspace depth. Ann. Statist. 38 635–669.
• [14] Koenker, R. (2005). Quantile Regression. Econometric Society Monographs 38. Cambridge Univ. Press, Cambridge.
• [15] Koenker, R. and Bassett, G. Jr. (1978). Regression quantiles. Econometrica 46 33–50.
• [16] Koenker, R. and Bassett, G. Jr. (1982). Robust tests for heteroscedasticity based on regression quantiles. Econometrica 50 43–61.
• [17] Koltchinskii, V. I. (1997). $M$-estimation, convexity and quantiles. Ann. Statist. 25 435–477.
• [18] Kong, L. and Mizera, I. (2012). Quantile tomography: Using quantiles with multivariate data. Statist. Sinica 22 1589–1610.
• [19] Lehmann, E. L. (1974). Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, San Francisco, CA.
• [20] Le Play, F. (1855). Les Ouvriers Européens. Etudes sur les travaux, la vie domestique et la condition morale des populations ouvrières de l’Europe. Paris.
• [21] Matzkin, R. L. (2003). Nonparametric estimation of nonadditive random functions. Econometrica 71 1339–1375.
• [22] McCann, R. J. (1995). Existence and uniqueness of monotone measure-preserving maps. Duke Math. J. 80 309–323.
• [23] Serfling, R. (2004). Nonparametric multivariate descriptive measures based on spatial quantiles. J. Statist. Plann. Inference 123 259–278.
• [24] Villani, C. (2003). Topics in Optimal Transportation. Graduate Studies in Mathematics 58. Amer. Math. Soc., Providence, RI.
• [25] Villani, C. (2009). Optimal Transport: Old and New. Grundlehren der Mathematischen Wissenschaften 338. Springer, Berlin.
• [26] Wei, Y. (2008). An approach to multivariate covariate-dependent quantile contours with application to bivariate conditional growth charts. J. Amer. Statist. Assoc. 103 397–409.
• [27] Yu, K. and Jones, M. C. (1998). Local linear quantile regression. J. Amer. Statist. Assoc. 93 228–237.

#### Supplemental materials

• Supplement to “Vector quantile regression”. In the online supplement [3], we provide additional results for Sections 2 and 3, including a proof of duality for CVQF and Linear VQR, and the measurability claims for Theorem 2.1.