Abstract
We describe a method for quantifying the lack of fit in a proposed family of distributions. The method involves estimating the posterior distribution of the Kullback-Leibler information between the true distribution generating the data and the proposed family. We include an implementation for discrete data involving Dirichlet Processes, for continuous data involving Dirichlet Process Mixtures, and for regression data involving a common "perturbation" distribution also estimated by a Dirichlet Process Mixture. We examine the effectiveness of the method through simulation. We also show that, for independent, identically distributed discrete data, the posterior distribution from a Dirichlet Process provides a consistent estimate of the KL information. Because the entire posterior distribution is computed, one can readily acquire interval estimates of the distance without resorting to asymptotics.
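As a rough illustration of the discrete case only, the sketch below draws from the posterior of the KL information between the data-generating distribution and a proposed family. It is not the paper's implementation and rests on several assumptions made here for illustration: the support is the finite set {0, ..., m}, so the Dirichlet Process posterior reduces to a finite Dirichlet distribution; the base measure is uniform with total mass alpha; and the proposed family is Binomial(m, theta), for which the KL-minimizing theta is obtained by matching the mean. All function names are hypothetical.

```python
import math
import numpy as np

def binomial_log_pmf(m, theta):
    """Log pmf of Binomial(m, theta) on the support 0..m."""
    k = np.arange(m + 1)
    log_comb = np.array([math.log(math.comb(m, int(j))) for j in k])
    return log_comb + k * np.log(theta) + (m - k) * np.log1p(-theta)

def kl_to_binomial_family(p, m):
    """min over theta of KL(p || Binomial(m, theta)).

    The binomial is an exponential family in k, so the minimizing
    theta matches the mean of p: theta = E_p[k] / m.
    """
    k = np.arange(m + 1)
    theta = np.clip(p @ k / m, 1e-12, 1 - 1e-12)
    log_q = binomial_log_pmf(m, theta)
    mask = p > 0  # terms with p = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * (np.log(p[mask]) - log_q[mask])))

def posterior_kl_draws(counts, alpha=1.0, n_draws=2000, seed=0):
    """Posterior draws of the KL information given cell counts on 0..m.

    Assumption: a Dirichlet Process prior with uniform base measure and
    total mass alpha, restricted to the finite support, which gives a
    Dirichlet(alpha * base + counts) posterior on the cell probabilities.
    """
    counts = np.asarray(counts, dtype=float)
    m = counts.size - 1
    base = np.full(m + 1, 1.0 / (m + 1))
    rng = np.random.default_rng(seed)
    draws = rng.dirichlet(alpha * base + counts, size=n_draws)
    return np.array([kl_to_binomial_family(p, m) for p in draws])

if __name__ == "__main__":
    # Counts roughly consistent with Binomial(4, 0.5), then bimodal counts.
    for label, counts in [("binomial-like", [6, 25, 38, 25, 6]),
                          ("bimodal", [50, 0, 0, 0, 50])]:
        kl = posterior_kl_draws(counts)
        lo, hi = np.quantile(kl, [0.025, 0.975])
        print(f"{label}: 95% posterior interval for KL = ({lo:.3f}, {hi:.3f})")
```

Because each posterior draw yields one KL value, an interval estimate is simply a pair of quantiles of the draws, which is the sense in which no asymptotic approximation is needed.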
Citation
Kert Viele. "Nonparametric estimation of Kullback-Leibler information illustrated by evaluating goodness of fit." Bayesian Anal. 2 (2) 239 - 280, June 2007. https://doi.org/10.1214/07-BA210