Standard tests of the “no-treatment-effect” hypothesis for a comparative experiment include permutation tests, the Wilcoxon rank sum test, two-sample $t$ tests, and Fisher-type randomization tests. Practitioners are aware that these procedures test different no-effect hypotheses and are based on different modeling assumptions. However, this awareness is not always, or even usually, accompanied by a clear understanding or appreciation of these differences. Borrowing from the rich literatures on causality and finite-population sampling theory, this paper develops a modeling framework that affords answers to several important questions, including: Exactly what hypothesis is being tested? What model assumptions are being made? Are there other, perhaps better, approaches to testing a no-effect hypothesis? The framework lends itself to clear descriptions of three main inference approaches: process-based, randomization-based, and selection-based. It also promotes careful consideration of model assumptions and targets of inference, and highlights the importance of randomization. Along the way, Fisher-type randomization tests are compared with permutation tests and with a less well-known Neyman-type randomization test. A simulation study compares the operating characteristics of the Neyman-type randomization test with those of the other, more familiar tests.
"A Closer Look at Testing the “No-Treatment-Effect” Hypothesis in a Comparative Experiment." Statist. Sci. 30 (3) 352 - 371, August 2015. https://doi.org/10.1214/15-STS513