## The Annals of Applied Statistics

### Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels

Paul R. Rosenbaum

#### Abstract

Sensitivity bounds for randomization inferences exist in several important cases, such as matched pairs with any type of outcome or binary outcomes with any type of stratification, but computationally feasible bounds for any outcome in any stratification are not currently available. For instance, with 20 strata, some large, others small, there is no currently available, computationally feasible sensitivity bound testing the null hypothesis of no treatment effect in the presence of a bias from nonrandom treatment assignment of a specific magnitude. The current paper solves the general problem; it uses an inequality formed by taking a one-step Taylor approximation from a near extreme solution, known as the separable approximation, where the concavity of the underlying function ensures that the Taylor approximation is, at worst, conservative. In practice, the separable approximation and the one-step movement away from it provide computationally feasible lower and upper bounds, thereby providing both a usable, perhaps slightly conservative statement, together with a check that the conservative statement is not unduly conservative. In every example that I have tried, the upper and lower bounds barely differ, although with some effort one can construct examples in which the separable approximation gives a $P$-value of 0.0499 and the Taylor approximation gives 0.0501. The new inequality holds in finite samples, so it strengthens certain existing asymptotic results, additionally simplifying the proof of those results. The method is discussed in the context of an observational study of the effects of smoking on homocysteine levels, a possible risk factor for several diseases including cardiovascular disease, thrombosis and Alzheimer’s disease. This study contains two evidence factors, the comparison of smokers and nonsmokers and the comparison of smokers to one another in terms of recent nicotine exposure. 
A new $\mathtt{R}$ package, $\mathtt{senstrat}$, implements the procedure and illustrates it with the example from the current paper.
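
For orientation, the matched-pairs case that the abstract describes as already solved can be sketched in a few lines. The sketch below is not the paper's stratified method and does not use the $\mathtt{senstrat}$ package. It is only the classical sign-test bound, assuming matched pairs with no tied pairs and a bias of magnitude at most $\Gamma \ge 1$. Under the null hypothesis of no treatment effect, the chance that the treated unit has the higher response in a pair is then at most $\Gamma/(1+\Gamma)$, so a binomial tail gives an upper bound on the one-sided $P$-value. The function name and arguments are hypothetical, chosen for illustration.

```python
from math import comb


def sign_test_upper_p(n_pairs: int, n_positive: int, gamma: float) -> float:
    """Upper bound on the one-sided sign-test P-value for matched pairs.

    Illustrative sketch only (hypothetical helper, not part of senstrat).
    Assumes no tied pairs and a bias of magnitude at most gamma >= 1.
    """
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    if not 0 <= n_positive <= n_pairs:
        raise ValueError("n_positive must lie between 0 and n_pairs")
    # Largest possible chance that a pair favors the treated unit under H0.
    p = gamma / (1.0 + gamma)
    # Binomial upper tail: P(at least n_positive favorable pairs).
    return sum(
        comb(n_pairs, k) * p**k * (1.0 - p) ** (n_pairs - k)
        for k in range(n_positive, n_pairs + 1)
    )


if __name__ == "__main__":
    # gamma = 1 reproduces the usual randomization sign test.
    for g in (1.0, 1.5, 2.0):
        print(g, sign_test_upper_p(10, 10, g))
```

Setting `gamma = 1` recovers the ordinary randomization test, and the bound grows as `gamma` increases. The difficulty the paper addresses is that no comparably simple bound is available for arbitrary outcomes in strata of unequal size.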

#### Article information

Source
Ann. Appl. Stat., Volume 12, Number 4 (2018), 2312–2334.

Dates
Revised: April 2018
First available in Project Euclid: 13 November 2018

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1542078046

Digital Object Identifier
doi:10.1214/18-AOAS1153

Mathematical Reviews number (MathSciNet)
MR3875702

#### Citation

Rosenbaum, Paul R. Sensitivity analysis for stratified comparisons in an observational study of the effect of smoking on homocysteine levels. Ann. Appl. Stat. 12 (2018), no. 4, 2312–2334. doi:10.1214/18-AOAS1153. https://projecteuclid.org/euclid.aoas/1542078046
