Statistical methods for replicability assessment
Kenneth Hung, William Fithian
Ann. Appl. Stat. 14(3): 1063-1087 (September 2020). DOI: 10.1214/20-AOAS1336


Large-scale replication studies like the Reproducibility Project: Psychology (RP:P) provide invaluable systematic data on scientific replicability, but most analyses and interpretations of the data fail to agree on a definition of “replicability” and fail to disentangle the inexorable consequences of known selection bias from competing explanations. We discuss three concrete definitions of replicability based on (1) whether published findings about the signs of effects are mostly correct, (2) how effective replication studies are in reproducing whatever true effect size was present in the original experiment, and (3) whether true effect sizes tend to diminish in replication. We apply techniques from multiple testing and postselection inference to develop new methods that answer these questions while explicitly accounting for selection bias. Our analyses suggest that the RP:P dataset is largely consistent with publication bias due to selection of significant effects. The methods in this paper make no distributional assumptions about the true effect sizes.
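To make the role of selection bias concrete, the following is a minimal simulation sketch and not the authors' method. It assumes a single fixed true effect, Gaussian estimation noise, and a publication rule that keeps only originals significant at the two-sided 5% level. The function name `simulate_selection` and all parameter values are hypothetical choices for illustration. Under these assumptions the published original estimates are inflated on average, so replication estimates look smaller even though the true effect is identical in both experiments.

```python
import random
import statistics


def simulate_selection(true_effect=0.2, se=0.1, n_studies=20000,
                       z_crit=1.96, seed=0):
    """Simulate original/replication pairs with publication on significance.

    All parameters are illustrative assumptions, not values from the paper.
    Returns (mean published original estimate,
             mean replication estimate,
             number of published studies).
    """
    rng = random.Random(seed)
    originals, replications = [], []
    for _ in range(n_studies):
        # Original study: unbiased Gaussian estimate of the true effect.
        original = rng.gauss(true_effect, se)
        # Selection step: only significant originals are "published".
        if abs(original) / se > z_crit:
            # Replication: same true effect, independent noise, no selection.
            replication = rng.gauss(true_effect, se)
            originals.append(original)
            replications.append(replication)
    return (statistics.mean(originals),
            statistics.mean(replications),
            len(originals))


mean_orig, mean_rep, n_published = simulate_selection()
print(f"published studies:        {n_published}")
print(f"mean original estimate:   {mean_orig:.3f}")
print(f"mean replication estimate:{mean_rep:.3f}")
```

In this sketch the gap between the two means arises purely from conditioning on significance, which is why an analysis of apparent effect-size decline needs to account for selection before attributing the decline to anything else.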



Kenneth Hung, William Fithian. "Statistical methods for replicability assessment." Ann. Appl. Stat. 14(3): 1063-1087, September 2020.


Received: 1 April 2019; Revised: 1 February 2020; Published: September 2020
First available in Project Euclid: 18 September 2020

MathSciNet: MR4152124
Digital Object Identifier: 10.1214/20-AOAS1336

Keywords: meta-analysis, multiple testing, postselection inference, publication bias, replicability

Rights: Copyright © 2020 Institute of Mathematical Statistics


