Abstract
We consider hierarchical mixtures-of-experts (HME) models in which exponential family regression models with generalized linear mean functions of the form $\psi(\alpha + \mathbf{x}^T \boldsymbol{\beta})$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(\mathbf{x}))$, where $h(\cdot)\in W_{2; K_0}^{\infty}$ (a Sobolev class over $[0, 1]^s$). It is shown that the HME probability density functions can approximate the true density at a rate of $O(m^{-2/s})$ in Hellinger distance and at a rate of $O(m^{-4/s})$ in Kullback–Leibler divergence, where $m$ is the number of experts and $s$ is the dimension of the predictor $\mathbf{x}$. We also provide conditions under which the mean-square error of the estimated mean response obtained from the maximum likelihood method converges to zero as the sample size and the number of experts both increase.
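For concreteness, a minimal sketch of the one-layer mixture-of-experts density underlying the hierarchical construction, written in the abstract's notation; the gating weights $g_j$, gating parameters $\mathbf{v}_j$, and the exponential family density $\pi(y; \mu)$ with mean $\mu$ are notation introduced here for illustration, with the hierarchical version obtained by nesting such gates:
$$
f_m(y \mid \mathbf{x}) \;=\; \sum_{j=1}^{m} g_j(\mathbf{x})\,\pi\!\bigl(y;\ \psi(\alpha_j + \mathbf{x}^T \boldsymbol{\beta}_j)\bigr),
\qquad
g_j(\mathbf{x}) \;=\; \frac{\exp(v_{j0} + \mathbf{x}^T \mathbf{v}_j)}{\sum_{k=1}^{m} \exp(v_{k0} + \mathbf{x}^T \mathbf{v}_k)}.
$$
The approximation rates stated above concern how well densities of this form can approximate the true density $\pi\bigl(y; \psi(h(\mathbf{x}))\bigr)$ as $m$ grows.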
Citation
Wenxin Jiang and Martin A. Tanner. "Hierarchical mixtures-of-experts for exponential family regression models: approximation and maximum likelihood estimation." Ann. Statist. 27(3): 987–1011, June 1999. https://doi.org/10.1214/aos/1018031265