The reproducibility crisis has led to an increasing number of replication studies. Sample sizes for replication studies are often calculated using conditional power based on the effect estimate from the original study. However, this approach is ill-suited because it treats the original effect estimate as if it were the true effect and thus ignores the uncertainty of the original result. Bayesian methods are used in clinical trials to incorporate prior information into power calculations. We propose to adapt this methodology to the replication framework and promote the use of predictive instead of conditional power in the design of replication studies. Moreover, we describe how extensions of the methodology to sequential clinical trials can be tailored to replication studies. Conditional and predictive power calculated at an interim analysis are compared, and we argue that predictive power is a useful tool for deciding whether to stop a replication study prematurely. A recent project on the replicability of the social sciences is used to illustrate the properties of the different methods.
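The contrast between conditional and predictive power at the design stage can be illustrated with a minimal sketch. It is not taken from the paper: it assumes normally distributed effect estimates, a one-sided significance level alpha, a flat prior on the effect before the original study, and a relative sample size c (replication relative to original). The function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def conditional_power(z_o: float, c: float, alpha: float = 0.025) -> float:
    """Power of the replication, assuming the original estimate is the true effect.

    z_o   : z-value of the original study (assumed input)
    c     : relative sample size, replication / original (assumed input)
    alpha : one-sided significance level of the replication (assumed input)
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    # Under this assumption the replication z-value is N(sqrt(c) * z_o, 1).
    return _STD_NORMAL.cdf(sqrt(c) * z_o - z_alpha)


def predictive_power(z_o: float, c: float, alpha: float = 0.025) -> float:
    """Power of the replication, averaged over the uncertainty of the original estimate.

    Assumes a flat prior on the effect, so the predictive distribution of the
    replication z-value is N(sqrt(c) * z_o, 1 + c).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf((sqrt(c) * z_o - z_alpha) / sqrt(1 + c))


if __name__ == "__main__":
    z_original = 2.5  # illustrative value, not from the paper
    for rel_size in (0.5, 1.0, 2.0, 5.0):
        cp = conditional_power(z_original, rel_size)
        pp = predictive_power(z_original, rel_size)
        print(f"c = {rel_size:>3}: conditional = {cp:.3f}, predictive = {pp:.3f}")
```

Under these assumptions, an original z-value of 2.5 and an equally sized replication (c = 1) give a conditional power of roughly 0.71 and a predictive power of roughly 0.65. Whenever conditional power exceeds 50%, predictive power is the smaller of the two, because it accounts for the uncertainty of the original estimate.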
This work was funded by the Swiss National Science Foundation (project 189295).
We thank Samuel Pawel, Małgorzata Roos and Lawrence L. Kupper for helpful comments and suggestions on this manuscript. We would also like to thank the referees, whose comments helped to improve and clarify the manuscript.
"Power Calculations for Replication Studies." Statist. Sci. 37 (3) 369-379, August 2022. https://doi.org/10.1214/21-STS828