Statistical Science

The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation

Kosuke Imai, Gary King, and Clayton Nall


A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals—such as households, communities, firms, medical practices, schools or classrooms—even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary; its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level noncompliance, which is common in applications but not allowed for in most existing methods. We show that from the perspective of bias, efficiency, power, robustness or research costs, and in large or small samples, pairing should be used in cluster-randomized experiments whenever feasible; failing to do so is equivalent to discarding a considerable fraction of one’s data. We develop these techniques in the context of a randomized evaluation we are conducting of the Mexican Universal Health Insurance Program.
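As a rough illustration of the design described in the abstract, the sketch below pairs clusters on a single pre-treatment covariate, randomizes treatment within each pair, and then averages the within-pair differences in cluster means. The function names, the single-covariate pairing rule, and the weighting of pairs by their combined size are assumptions made for this example; the abstract does not spell out the estimator, so this should not be read as the authors' exact method.

```python
import random


def pair_and_randomize(covariate, seed=0):
    """Pair clusters with adjacent covariate values, then randomize within pairs.

    covariate: dict mapping cluster id -> one pre-treatment covariate
               (a hypothetical, simplified matching rule).
    Returns a list of (treated_id, control_id) tuples.
    """
    rng = random.Random(seed)
    ordered = sorted(covariate, key=covariate.get)
    if len(ordered) % 2:
        raise ValueError("need an even number of clusters to form pairs")
    pairs = []
    for i in range(0, len(ordered), 2):
        a, b = ordered[i], ordered[i + 1]
        # Within-pair coin flip: each cluster is treated with probability 1/2.
        pairs.append((a, b) if rng.random() < 0.5 else (b, a))
    return pairs


def weighted_pair_difference(pairs, outcomes):
    """Size-weighted average of within-pair differences in cluster means.

    outcomes: dict mapping cluster id -> list of individual-level outcomes.
    Weighting each pair by its total number of individuals is an
    illustrative assumption.
    """
    total, total_weight = 0.0, 0.0
    for treated, control in pairs:
        y_t, y_c = outcomes[treated], outcomes[control]
        weight = len(y_t) + len(y_c)
        diff = sum(y_t) / len(y_t) - sum(y_c) / len(y_c)
        total += weight * diff
        total_weight += weight
    return total / total_weight
```

With real data, pairing would normally use several covariates at once; the one-dimensional sort here only keeps the example short.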

Article information

Statist. Sci., Volume 24, Number 1 (2009), 29-53.

First available in Project Euclid: 8 October 2009

Keywords: Causal inference; community intervention trials; field experiments; group-randomized trials; place-randomized trials; health policy; matched-pair design; noncompliance; power


Imai, Kosuke; King, Gary; Nall, Clayton. The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation. Statist. Sci. 24 (2009), no. 1, 29–53. doi:10.1214/08-STS274.

