"Discrimination information," or Kullback-Leibler loss, is an appropriate measure of distance in problems of discrimination. We examine it in the context of nonparametric kernel density estimation and show that its asymptotic properties are profoundly influenced by tail properties of the kernel and of the unknown density. We suggest ways of choosing the kernel so as to reduce loss, and describe the extent to which likelihood cross-validation asymptotically minimises loss. Likelihood cross-validation generally leads to selection of a window width of the correct order of magnitude, but not necessarily to a window with the correct first-order properties. However, if the kernel is chosen appropriately, then likelihood cross-validation does result in asymptotic minimisation of Kullback-Leibler loss.
"On Kullback-Leibler Loss and Density Estimation." Ann. Statist. 15 (4) 1491 - 1519, December, 1987. https://doi.org/10.1214/aos/1176350606