Statistical Science

The General Structure of Evidence Factors in Observational Studies

Paul R. Rosenbaum


The general structure of evidence factors is examined in terms of the knit product of two permutation groups. An observational or nonrandomized study of treatment effects has two evidence factors if it permits two (nearly) independent tests of the null hypothesis of no treatment effect and two (nearly) independent sensitivity analyses for those tests. Either of the two tests may be biased by nonrandom treatment assignment, but certain biases that would invalidate one test would have no impact on the other, so if the two tests concur, then some aspects of biased treatment assignment have been partially addressed. Expressed in terms of the knit product of two permutation groups, the structure of evidence factors is simpler and less cluttered, but at the same time more general and easier to apply in a new context. The issues are exemplified by an observational study of cigarette smoking as a cause of periodontal disease.
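The knit product (also called the Zappa–Szép product) named in the abstract can be illustrated on a small case. The sketch below is not taken from the article. It only checks the standard group-theoretic definition, that every element g of a group G factors in exactly one way as g = hk with h in H and k in K, on an assumed example: G is the symmetric group on four points, H is the subgroup fixing the last point, and K is the cyclic group generated by a 4-cycle. All helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' on points 0..n-1."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of permutation p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generate(generators, n):
    """Close a set of generators under composition (finite subgroup of S_n)."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_knit_product(H, K, G):
    """True if every g in G equals h*k for exactly one pair (h, k)."""
    counts = {}
    for h in H:
        for k in K:
            g = compose(h, k)
            counts[g] = counts.get(g, 0) + 1
    return set(counts) == set(G) and all(c == 1 for c in counts.values())

def is_normal(N, G):
    """True if g*n*g^{-1} stays in N for all g in G and n in N."""
    return all(compose(compose(g, n), inverse(g)) in N for g in G for n in N)

# Assumed illustrative example, not the groups used in the article.
G = set(permutations(range(4)))                    # all 24 permutations
H = generate([(1, 0, 2, 3), (1, 2, 0, 3)], 4)      # permutations fixing point 3
K = generate([(1, 2, 3, 0)], 4)                    # cyclic group of a 4-cycle

print(len(H), len(K), is_knit_product(H, K, G))
print(is_normal(H, G), is_normal(K, G))
```

In this example neither subgroup is normal in G, which is what separates a general knit product from a semidirect product, where one of the two factors must be normal.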

Article information

Statist. Sci., Volume 32, Number 4 (2017), 514-530.

First available in Project Euclid: 28 November 2017

Keywords: evidence factor; knit product; permutation group; permutation inference; randomization inference; semidirect product; sensitivity analysis; wreath product; Zappa–Szép product


Rosenbaum, Paul R. The General Structure of Evidence Factors in Observational Studies. Statist. Sci. 32 (2017), no. 4, 514–530. doi:10.1214/17-STS621.

