Open Access
June 2019
Distributed inference for quantile regression processes
Stanislav Volgushev, Shih-Kang Chao, Guang Cheng
Ann. Statist. 47(3): 1634-1662 (June 2019). DOI: 10.1214/18-AOS1730

Abstract

The increased availability of massive data sets provides a unique opportunity to discover subtle patterns in their distributions, but also imposes overwhelming computational challenges. To fully utilize the information contained in big data, we propose a two-step procedure: (i) estimate conditional quantile functions at different levels in a parallel computing environment; (ii) construct a conditional quantile regression process through projection based on these estimated quantile curves. Our general quantile regression framework covers both linear models with fixed or growing dimension and series approximation models. We prove that the proposed procedure does not sacrifice any statistical inferential accuracy provided that the number of distributed computing units and quantile levels are chosen properly. In particular, a sharp upper bound for the former and a sharp lower bound for the latter are derived to capture the minimal computational cost from a statistical perspective. As an important application, the statistical inference on conditional distribution functions is considered. Moreover, we propose computationally efficient approaches to conducting inference in the distributed estimation setting described above. Those approaches directly utilize the availability of estimators from subsamples and can be carried out at almost no additional computational cost. Simulations confirm our statistical inferential theory.
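
The two-step procedure in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear model, averages the subsample estimators at each quantile level (a standard divide-and-conquer choice, assumed here), and uses a polynomial basis in the quantile level as a simple stand-in for the B-spline projection named in the keywords. The function names `fit_quantile` and `distributed_qr_process`, the simulated data, and all tuning values are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def fit_quantile(X, y, tau):
    """Linear quantile regression at level tau, solved as a linear program."""
    n, p = X.shape
    # Variables: beta (free), then positive and negative residual parts.
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])  # X beta + u_plus - u_minus = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]


def distributed_qr_process(X, y, n_blocks, taus, degree=3):
    """Return a function tau -> estimated coefficient vector."""
    blocks = np.array_split(np.arange(len(y)), n_blocks)
    # Step (i): fit every quantile level on every subsample, then average
    # over subsamples. The outer loop is what would run on separate workers;
    # it is sequential here to keep the sketch short.
    beta_bar = np.mean(
        [[fit_quantile(X[idx], y[idx], t) for t in taus] for idx in blocks],
        axis=0,
    )  # shape (len(taus), p)
    # Step (ii): least-squares projection onto a basis in tau
    # (assumption: polynomial basis instead of B-splines).
    B = np.vander(taus, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(B, beta_bar, rcond=None)

    def process(tau):
        basis = np.vander(np.atleast_1d(tau), degree + 1, increasing=True)
        return (basis @ coef)[0]

    return process


# Simulated example (hypothetical): y = 1 + 2 x + standard normal noise.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.standard_normal(n)

process = distributed_qr_process(X, y, n_blocks=4, taus=np.linspace(0.1, 0.9, 9))
print(process(0.5))  # estimated (intercept, slope) at the median
```

In this sketch `n_blocks` plays the role of the number of distributed computing units and `len(taus)` the number of quantile levels, the two quantities for which the abstract reports sharp bounds. The values used here are arbitrary and are not derived from those bounds.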

Citation

Stanislav Volgushev, Shih-Kang Chao, Guang Cheng. "Distributed inference for quantile regression processes." Ann. Statist. 47(3): 1634-1662, June 2019. https://doi.org/10.1214/18-AOS1730

Information

Received: 1 February 2017; Revised: 1 March 2018; Published: June 2019
First available in Project Euclid: 13 February 2019

zbMATH: 07053521
MathSciNet: MR3911125
Digital Object Identifier: 10.1214/18-AOS1730

Subjects:
Primary: 62F12, 62G15, 62G20

Keywords: B-spline estimation, conditional distribution function, distributed computing, divide-and-conquer, quantile regression process

Rights: Copyright © 2019 Institute of Mathematical Statistics
