A data-driven method of choosing the bandwidth, $h$, of a kernel density estimator is heuristically motivated by considering modifications of the Kullback-Leibler or pseudo-likelihood cross-validation function. It is shown that this means of choosing $h$ is asymptotically equivalent to taking the $h$ that minimizes compelling error criteria such as the average squared error and the integrated squared error. Thus, for a given kernel function, the bandwidth can be chosen optimally without making precise smoothness assumptions on the underlying density.
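As a rough illustration of the starting point mentioned above, the following sketch implements the plain (unmodified) pseudo-likelihood cross-validation criterion: for each candidate $h$, it sums the log of the leave-one-out kernel estimate at every observation and keeps the $h$ that maximizes the sum. It is not the modified criterion studied in the paper. The Gaussian kernel, the grid of candidate bandwidths, the simulated sample, and the function names `loo_log_likelihood` and `select_bandwidth` are all assumptions made for the example.

```python
import numpy as np


def loo_log_likelihood(x, h):
    """Leave-one-out log pseudo-likelihood of a Gaussian-kernel density estimate.

    Assumption: a Gaussian kernel; the abstract does not fix a kernel.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Pairwise scaled differences (x_i - x_j) / h.
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # Leave observation i out of its own density estimate.
    np.fill_diagonal(k, 0.0)
    f_loo = k.sum(axis=1) / ((n - 1) * h)
    # Floor the estimate so that isolated points do not produce log(0).
    return np.sum(np.log(np.maximum(f_loo, np.finfo(float).tiny)))


def select_bandwidth(x, grid):
    """Return the grid value maximizing the criterion, plus all scores."""
    scores = np.array([loo_log_likelihood(x, h) for h in grid])
    return grid[np.argmax(scores)], scores


# Hypothetical data: 200 draws from a standard normal density.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
grid = np.geomspace(0.05, 2.0, 40)  # candidate bandwidths (an assumption)
h_cv, scores = select_bandwidth(sample, grid)
print(f"selected bandwidth: {h_cv:.3f}")
```

A grid search is used here only for transparency; a one-dimensional optimizer would serve equally well. The floor inside the logarithm is a numerical guard for this sketch and is not part of the criterion itself.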
"An Asymptotically Efficient Solution to the Bandwidth Problem of Kernel Density Estimation." Ann. Statist. 13 (3) 1011 - 1023, September, 1985. https://doi.org/10.1214/aos/1176349653