Bickel and Ritov proposed an optimal estimator for the integral of the square of the $k$th derivative of a density when the unknown density belongs to a Lipschitz class of a given order $\beta$. Here optimality means that the estimator is asymptotically efficient, that is, it attains both the best constant and the best rate of risk convergence, whenever $\beta > 2k + 1/4$, and that it is rate optimal otherwise. The proposed estimator depends crucially on the value of $\beta$, which is, of course, unknown in practice. Bickel and Ritov conjectured that cross-validation yields an adaptive estimator with the same optimal statistical properties as the estimator based on prior knowledge of $\beta$.
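For concreteness, the quantity being estimated and the two smoothness regimes can be written out as follows. The symbols $\theta_k$, $f$ and $n$ (sample size) are notation assumed here for illustration and do not appear in the abstract. The rate in the second regime is the one usually attributed to Bickel and Ritov and is given as background, not as a result of this paper.

$$\theta_k(f) = \int \bigl(f^{(k)}(x)\bigr)^2 \, dx$$

$$\text{rate of the estimation error} \asymp \begin{cases} n^{-1/2}, & \beta > 2k + 1/4 \quad \text{(efficient regime)}, \\ n^{-4(\beta - k)/(4\beta + 1)}, & k < \beta \leq 2k + 1/4 \quad \text{(rate-optimal regime)}. \end{cases}$$

At the boundary $\beta = 2k + 1/4$ the exponent $4(\beta - k)/(4\beta + 1)$ equals $(4k + 1)/(8k + 2) = 1/2$, so the two regimes meet continuously.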
We show that, for probability densities supported on a finite interval, adaptation is not necessary for the construction of an asymptotically efficient estimator when $\beta > 2k + 1/4$. On the other hand, when $k < \beta \leq 2k + 1/4$, it is not possible to construct an adaptive estimator that has the same rate of convergence as the optimal nonadaptive estimator.
"On Bickel and Ritov's conjecture about adaptive estimation of the integral of the square of density derivative." Ann. Statist. 24 (2) 682 - 686, April 1996. https://doi.org/10.1214/aos/1032894459