This paper proposes a novel method to recover the sparse structure of the conditional distribution, which plays a crucial role in subsequent statistical analyses such as prediction, forecasting and conditional distribution estimation. Unlike most existing methods, which often require explicit model assumptions or suffer from a heavy computational burden, the proposed method exploits some desirable properties of a reproducing kernel Hilbert space (RKHS). It can be efficiently implemented by optimizing its dual form and is particularly attractive for large-scale datasets. The asymptotic consistencies of the proposed method are established under mild conditions. Its effectiveness is also supported by a variety of simulated examples and a real-life supermarket dataset from Northern China.
Xin He’s research is supported in part by NSFC-11901375, Shanghai Pujiang Program 2019PJC051 and the Fundamental Research Funds for the Central Universities. Junhui Wang’s research is supported in part by HK RGC Grants GRF-11303918, GRF-11300919 and GRF-11304520.
The authors thank the editor, the associate editor and the two anonymous referees for their constructive suggestions, which significantly improved this paper.
"Learning sparse conditional distribution: An efficient kernel-based approach." Electron. J. Statist. 15 (1) 1610-1635, 2021. https://doi.org/10.1214/21-EJS1824