Statistical Science

Power prior distributions for regression models

Ming-Hui Chen and Joseph G. Ibrahim

Full-text: Open access

Abstract

We propose a general class of prior distributions for arbitrary regression models. We discuss parametric and semiparametric models. The prior specification for the regression coefficients focuses on observable quantities in that the elicitation is based on the availability of historical data $D_0$ and a scalar quantity $a_0$ quantifying the uncertainty in $D_0$. Then $D_0$ and $a_0$ are used to specify a prior for the regression coefficients in a semiautomatic fashion. The most natural specification of $D_0$ arises when the raw data from a similar previous study are available. The availability of historical data is quite common in clinical trials, carcinogenicity studies, and environmental studies, where large databases are available from similar previous studies. Although the methodology we present here is quite general, we will focus only on using historical data from similar previous studies to construct the prior distributions. The prior distributions are based on the idea of raising the likelihood function of the historical data to the power $a_0$, where $0 \le a_0 \le 1$. We call such prior distributions power prior distributions. We examine the power prior for four commonly used classes of regression models. These include generalized linear models, generalized linear mixed models, semiparametric proportional hazards models, and cure rate models for survival data. For these classes of models, we discuss the construction of the power prior, prior elicitation issues, propriety conditions, model selection, and several other properties. For each class of models, we present real data sets to demonstrate the proposed methodology.
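
As a minimal illustration of the construction described in the abstract, the power prior for regression coefficients $\beta$ can be written as $\pi(\beta \mid D_0, a_0) \propto L(\beta \mid D_0)^{a_0} \, \pi_0(\beta)$, where $\pi_0$ denotes an initial prior. The sketch below is not one of the paper's examples: it assumes a Gaussian linear model with a common, known error variance for both data sets and a flat initial prior, in which case raising the historical likelihood to the power $a_0$ simply scales the historical sufficient statistics by $a_0$. The function name and the simulated data are hypothetical.

```python
import numpy as np

def power_prior_posterior_mean(X0, y0, X, y, a0):
    """Posterior mean of beta under a power prior.

    Illustrative assumptions (not taken from the paper): Gaussian linear
    model, common known error variance, flat initial prior on beta.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    # Raising the historical likelihood to the power a0 multiplies its
    # contributions X0'X0 and X0'y0 by a0.
    precision = a0 * (X0.T @ X0) + X.T @ X
    rhs = a0 * (X0.T @ y0) + X.T @ y
    return np.linalg.solve(precision, rhs)

# Simulated historical (D_0) and current data; purely hypothetical values.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0])
X0 = np.column_stack([np.ones(50), rng.normal(size=50)])
y0 = X0 @ beta_true + rng.normal(size=50)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = X @ beta_true + rng.normal(size=20)

for a0 in (0.0, 0.5, 1.0):
    print(a0, power_prior_posterior_mean(X0, y0, X, y, a0))
```

Under these assumptions, $a_0 = 0$ discards the historical data and reproduces least squares on the current data alone, while $a_0 = 1$ is equivalent to pooling the two data sets; intermediate values discount $D_0$.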

Article information

Source
Statist. Sci., Volume 15, Number 1 (2000), 46-60.

Dates
First available in Project Euclid: 24 December 2001

Permanent link to this document
https://projecteuclid.org/euclid.ss/1009212673

Digital Object Identifier
doi:10.1214/ss/1009212673

Mathematical Reviews number (MathSciNet)
MR1842236

Zentralblatt MATH identifier
0971.62036

Keywords
Cure rate model; generalized linear model; Gibbs sampling; historical data; prior elicitation; model selection; proportional hazards model; random effects model

Citation

Ibrahim, Joseph G.; Chen, Ming-Hui. Power prior distributions for regression models. Statist. Sci. 15 (2000), no. 1, 46--60. doi:10.1214/ss/1009212673. https://projecteuclid.org/euclid.ss/1009212673


