The Annals of Statistics
- Ann. Statist.
- Volume 40, Number 2 (2012), 1263-1282.
Rerandomization to improve covariate balance in experiments
Kari Lock Morgan and Donald B. Rubin
Abstract
Randomized experiments are the “gold standard” for estimating causal effects, yet often in practice, chance imbalances exist in covariate distributions between treatment groups. If covariate data are available before units are exposed to treatments, these chance imbalances can be mitigated by first checking covariate balance before the physical experiment takes place. Provided a precise definition of imbalance has been specified in advance, unbalanced randomizations can be discarded, followed by a rerandomization, and this process can continue until a randomization yielding balance according to the definition is achieved. By improving covariate balance, rerandomization provides more precise and trustworthy estimates of treatment effects.
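The procedure described in the abstract can be sketched as a simple accept/reject loop. The following is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single covariate and an absolute difference in group means as the balance criterion, whereas the article's keywords point to Mahalanobis distance for the multivariate case. The function names (`mean_difference`, `rerandomize`), the `threshold` and `max_draws` parameters, and the `ages` data are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def mean_difference(covariate, assignment):
    """Absolute difference in covariate means, treated (1) vs. control (0)."""
    treated = [x for x, w in zip(covariate, assignment) if w == 1]
    control = [x for x, w in zip(covariate, assignment) if w == 0]
    return abs(mean(treated) - mean(control))


def rerandomize(covariate, n_treated, threshold, rng, max_draws=10_000):
    """Draw complete randomizations until the balance criterion is met.

    The criterion (mean difference <= threshold) is an assumption of this
    sketch; it must be fixed before any randomization is drawn.
    Returns the accepted assignment and the number of draws it took.
    """
    n = len(covariate)
    base = [1] * n_treated + [0] * (n - n_treated)
    for draw in range(1, max_draws + 1):
        assignment = base[:]
        rng.shuffle(assignment)  # one complete randomization
        if mean_difference(covariate, assignment) <= threshold:
            return assignment, draw  # balanced: keep this randomization
        # otherwise discard the unbalanced randomization and draw again
    raise RuntimeError("no acceptable randomization found within max_draws")


# Hypothetical pre-treatment covariate (ages of ten units).
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 47, 38, 31, 44, 26]
assignment, draws = rerandomize(ages, n_treated=5, threshold=2.0, rng=rng)
print(assignment, draws, mean_difference(ages, assignment))
```

The threshold is passed in as a fixed argument to mirror the abstract's requirement that the definition of imbalance be specified in advance, before any randomization is inspected.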
Article information
Source
Ann. Statist. Volume 40, Number 2 (2012), 1263-1282.
Dates
First available in Project Euclid: 18 July 2012
Permanent link to this document
http://projecteuclid.org/euclid.aos/1342625468
Digital Object Identifier
doi:10.1214/12-AOS1008
Mathematical Reviews number (MathSciNet)
MR2985950
Zentralblatt MATH identifier
1274.62509
Subjects
Primary: 62K99: None of the above, but in this section
Keywords
Randomization; treatment allocation; experimental design; clinical trial; causal effect; Mahalanobis distance; Hotelling’s $T^{2}$
Citation
Morgan, Kari Lock; Rubin, Donald B. Rerandomization to improve covariate balance in experiments. Ann. Statist. 40 (2012), no. 2, 1263-1282. doi:10.1214/12-AOS1008. http://projecteuclid.org/euclid.aos/1342625468.
