The Annals of Applied Statistics

Confident inference for SNP effects on treatment efficacy

Ying Ding, Ying Grace Li, Yushi Liu, Stephen J. Ruberg, and Jason C. Hsu

Full-text: Open access


Our research aims to find SNPs that are predictive of treatment efficacy, in order to decide which subgroup (with enhanced treatment efficacy) to target in drug development. Testing SNPs for lack of association with treatment outcome is inherently challenging, because any linkage disequilibrium between a noncausal SNP and a causal SNP, however small, makes the zero-null (no association) hypothesis technically false. Control of the Type I error rate in testing such null hypotheses is therefore difficult to interpret. We propose a completely different formulation to address this problem. For each SNP, we provide simultaneous confidence intervals directed toward detecting possible dominant, recessive, or additive effects. Across the SNPs, we control the expected number of SNPs with at least one false confidence interval coverage. Since our confidence intervals are constructed based on pivotal statistics, the false coverage control is guaranteed to be exact and unaffected by the true values of test quantities (whether zero or nonzero). Our method is applicable to the therapeutic areas of diabetes and Alzheimer’s disease, and perhaps more, as a step toward confidently targeting a patient subgroup in a tailored drug development process.
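The two levels of error control described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the abstract: the contrast coefficients are unweighted and chosen only for illustration, genotype-specific treatment effects are treated as independent estimates with known standard errors, and a Bonferroni-adjusted normal quantile stands in for an exact pivotal critical value, which makes the intervals conservative rather than exact. The function names `per_snp_alpha` and `simultaneous_intervals` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative contrast coefficients over genotype groups carrying 0, 1, or 2
# copies of the minor allele. These are assumptions for this sketch only.
CONTRASTS = {
    "dominant": (-1.0, 0.5, 0.5),    # carriers (1 or 2 copies) vs. non-carriers
    "recessive": (-0.5, -0.5, 1.0),  # 2 copies vs. 0 or 1 copy
    "additive": (-1.0, 0.0, 1.0),    # 2 copies vs. 0 copies
}


def per_snp_alpha(n_snps, expected_false_snps):
    """Per-SNP error rate for a target expected number of miscovered SNPs.

    If each SNP's interval family misses with probability alpha, linearity of
    expectation gives an expected n_snps * alpha miscovered SNPs, whatever the
    dependence between SNPs.
    """
    if n_snps <= 0 or not 0 < expected_false_snps < n_snps:
        raise ValueError("need 0 < expected_false_snps < n_snps")
    return expected_false_snps / n_snps


def simultaneous_intervals(effects, std_errors, alpha):
    """Bonferroni-adjusted normal intervals for the three contrasts of one SNP.

    effects, std_errors: estimated treatment effect and its standard error in
    each genotype group (0, 1, 2 minor-allele copies), assumed independent.
    """
    # Split alpha across the contrasts (two-sided): a conservative stand-in
    # for an exact joint critical value.
    z = NormalDist().inv_cdf(1 - alpha / (2 * len(CONTRASTS)))
    intervals = {}
    for name, coefs in CONTRASTS.items():
        estimate = sum(c * d for c, d in zip(coefs, effects))
        se = sqrt(sum((c * s) ** 2 for c, s in zip(coefs, std_errors)))
        intervals[name] = (estimate - z * se, estimate + z * se)
    return intervals


# Hypothetical numbers: 1000 SNPs, at most one miscovered SNP expected.
alpha = per_snp_alpha(n_snps=1000, expected_false_snps=1)
cis = simultaneous_intervals(
    effects=(0.1, 0.4, 0.9), std_errors=(0.05, 0.08, 0.20), alpha=alpha
)
```

Because the intervals are reported for every SNP regardless of whether the true contrast is zero, the guarantee concerns coverage rather than rejection of a zero-null hypothesis, which is the reformulation the abstract argues for.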

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1727-1748.

Received: May 2016
Revised: August 2017
First available in Project Euclid: 11 September 2018

Keywords: Multiple testing, simultaneous confidence intervals, SNP, tailored drug development, treatment efficacy


Ding, Ying; Li, Ying Grace; Liu, Yushi; Ruberg, Stephen J.; Hsu, Jason C. Confident inference for SNP effects on treatment efficacy. Ann. Appl. Stat. 12 (2018), no. 3, 1727–1748. doi:10.1214/17-AOAS1128.


