A test is said to control for type I error if it is unlikely to reject the data-generating process. However, if it is possible to select a stochastic process at random such that, for all possible future realizations of the data, the selected process is unlikely to be rejected, then the test is said to be manipulable. A manipulable test therefore has essentially no capacity to reject a strategic expert.
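One way to make these two definitions precise is sketched below. The notation is an assumption introduced here for illustration and does not appear in the abstract: $\Omega$ is the set of data sequences, $\Delta(\Omega)$ is the set of stochastic processes (probability measures on $\Omega$), $T(P, s) \in \{0, 1\}$ equals 1 when the test rejects the forecast $P$ on the realization $s$, and $\varepsilon > 0$ is a small tolerance.

\[
\text{Type I error control:}\quad P\bigl(\{ s \in \Omega : T(P, s) = 1 \}\bigr) \le \varepsilon \quad \text{for all } P \in \Delta(\Omega),
\]
\[
\text{Manipulability:}\quad \exists\, \zeta \in \Delta(\Delta(\Omega)) \ \text{such that}\ \zeta\bigl(\{ P : T(P, s) = 1 \}\bigr) \le \varepsilon \quad \text{for all } s \in \Omega.
\]

In the first condition the randomness comes from the data, with the forecast fixed at the true process. In the second it comes from the expert's randomization $\zeta$, with the data fixed at an arbitrary realization.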
Many tests proposed in the existing literature, including calibration tests, control for type I error but are manipulable. We construct a test that controls for type I error and is nonmanipulable.
"A nonmanipulable test." Ann. Statist. 37 (2) 1013-1039, April 2009. https://doi.org/10.1214/08-AOS597