The Annals of Statistics

Rerandomization to improve covariate balance in experiments

Kari Lock Morgan and Donald B. Rubin

Full-text: Open access

Abstract

Randomized experiments are the “gold standard” for estimating causal effects, yet often in practice, chance imbalances exist in covariate distributions between treatment groups. If covariate data are available before units are exposed to treatments, these chance imbalances can be mitigated by first checking covariate balance before the physical experiment takes place. Provided a precise definition of imbalance has been specified in advance, unbalanced randomizations can be discarded, followed by a rerandomization, and this process can continue until a randomization yielding balance according to the definition is achieved. By improving covariate balance, rerandomization provides more precise and trustworthy estimates of treatment effects.
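The procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes two equal-purpose groups under complete randomization, uses the Mahalanobis distance between group covariate means as the balance measure (the keywords name this distance, but the abstract does not fix a criterion), and treats the acceptance threshold as a value the experimenter chooses in advance. The function names `mahalanobis_imbalance` and `rerandomize`, and the parameters `threshold` and `max_draws`, are hypothetical.

```python
import numpy as np


def mahalanobis_imbalance(X, w):
    """Mahalanobis distance between treated and control covariate means.

    X is an (n, k) covariate matrix; w is a 0/1 assignment vector of length n.
    """
    n_t = int(w.sum())
    n_c = len(w) - n_t
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Covariance of the difference in means under complete randomization.
    cov_diff = cov * (1.0 / n_t + 1.0 / n_c)
    return float(diff @ np.linalg.solve(cov_diff, diff))


def rerandomize(X, n_treated, threshold, rng, max_draws=10_000):
    """Draw assignments until the imbalance is at or below the threshold.

    The threshold is assumed to be fixed before any assignment is drawn.
    Returns the accepted assignment, its imbalance, and the number of draws.
    """
    base = np.zeros(X.shape[0], dtype=int)
    base[:n_treated] = 1
    for draw in range(1, max_draws + 1):
        w = rng.permutation(base)          # one complete randomization
        m = mahalanobis_imbalance(X, w)    # check balance before the experiment
        if m <= threshold:                 # acceptable: keep this assignment
            return w, m, draw
        # otherwise discard the assignment and rerandomize
    raise RuntimeError("no acceptable randomization found within max_draws")


# Illustrative usage on simulated covariates (not data from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w, m, draws = rerandomize(X, n_treated=25, threshold=1.0, rng=rng)
print(f"accepted after {draws} draw(s); imbalance = {m:.3f}")
```

A smaller threshold gives tighter balance but requires more draws on average, so the cutoff is a design choice made before the experiment rather than something tuned afterward.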

Article information

Source
Ann. Statist. Volume 40, Number 2 (2012), 1263-1282.

Dates
First available in Project Euclid: 18 July 2012

Permanent link to this document
https://projecteuclid.org/euclid.aos/1342625468

Digital Object Identifier
doi:10.1214/12-AOS1008

Mathematical Reviews number (MathSciNet)
MR2985950

Zentralblatt MATH identifier
1274.62509

Subjects
Primary: 62K99: None of the above, but in this section

Keywords
Randomization; treatment allocation; experimental design; clinical trial; causal effect; Mahalanobis distance; Hotelling’s $T^{2}$

Citation

Morgan, Kari Lock; Rubin, Donald B. Rerandomization to improve covariate balance in experiments. Ann. Statist. 40 (2012), no. 2, 1263–1282. doi:10.1214/12-AOS1008. https://projecteuclid.org/euclid.aos/1342625468.


