Statistical Science
- Statist. Sci.
- Volume 24, Number 2 (2009), 195-210.
Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources
Abstract
In designed experiments and surveys, known laws or design features provide checks on the most relevant aspects of a model and identify the target parameters. In contrast, in most observational studies in the health and social sciences, the primary study data do not identify and may not even bound target parameters. Discrepancies between target and analogous identified parameters (biases) are then of paramount concern, which forces a major shift in modeling strategies. Conventional approaches are based on conditional testing of equality constraints, which correspond to implausible point-mass priors. When these constraints are not identified by available data, however, no such testing is possible. In response, implausible constraints can be relaxed into penalty functions derived from plausible prior distributions. The resulting models can be fit within familiar full or partial likelihood frameworks.
The absence of identification renders all analyses part of a sensitivity analysis. In this view, results from single models are merely examples of what might be plausibly inferred. Nonetheless, just one plausible inference may suffice to demonstrate inherent limitations of the data. Points are illustrated with misclassified data from a study of sudden infant death syndrome. Extensions to confounding, selection bias and more complex data structures are outlined.
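The relaxation idea can be sketched on a toy misclassification problem. What follows is a minimal illustration under stated assumptions, not the paper's analysis: the counts (30 classified as exposed out of 100), the Beta priors, and the function names `log_beta_kernel` and `fit_penalized` are all hypothetical. True prevalence p, sensitivity Se and specificity Sp enter the likelihood only through q = p*Se + (1 - p)*(1 - Sp), so the data identify q but not p. The conventional analysis imposes the constraint Se = Sp = 1; the sketch instead adds log-prior penalties on Se and Sp to the binomial log-likelihood and maximizes the sum.

```python
import math


def log_beta_kernel(u, a, b):
    """Log of a Beta(a, b) density at u, up to an additive constant."""
    return (a - 1) * math.log(u) + (b - 1) * math.log(1 - u)


def fit_penalized(x, n, se_prior, sp_prior, steps=100):
    """Maximize binomial log-likelihood + log-prior penalties over (p, se, sp).

    x of n subjects are *classified* as exposed. se_prior and sp_prior are
    (a, b) pairs for hypothetical Beta priors on sensitivity and specificity.
    The search is a plain grid over se, sp in (0.5, 1); p is profiled out
    analytically for each grid point.
    """
    q_obs = x / n
    best = None
    for i in range(1, steps):
        se = 0.5 + i / (2 * steps)
        for j in range(1, steps):
            sp = 0.5 + j / (2 * steps)
            # With se + sp > 1, q is increasing in p and the log-likelihood
            # is concave in q, so the best p sets q = x/n, clipped to [0, 1].
            p = min(1.0, max(0.0, (q_obs - (1 - sp)) / (se + sp - 1)))
            q = p * se + (1 - p) * (1 - sp)
            loglik = x * math.log(q) + (n - x) * math.log(1 - q)
            penalty = (log_beta_kernel(se, *se_prior)
                       + log_beta_kernel(sp, *sp_prior))
            score = loglik + penalty
            if best is None or score > best[0]:
                best = (score, p, se, sp, loglik)
    _, p, se, sp, loglik = best
    return {"p": p, "se": se, "sp": sp, "loglik": loglik}


# Hypothetical data: 30 of 100 subjects classified as exposed.
x, n = 30, 100

# Conventional analysis: the constraint se = sp = 1 gives p = x / n.
naive_p = x / n

# Two relaxed analyses with different (made-up) priors.
fit_a = fit_penalized(x, n, se_prior=(17, 5), sp_prior=(19, 3))
fit_b = fit_penalized(x, n, se_prior=(10, 2), sp_prior=(10, 2))

for label, p in [("constrained", naive_p), ("prior A", fit_a["p"]), ("prior B", fit_b["p"])]:
    print(f"{label}: estimated true prevalence = {p:.3f}")
```

In this toy setup both relaxed fits reach the same maximized log-likelihood as the constrained fit, yet they give different prevalence estimates, roughly (q - (1 - Sp)) / (Se + Sp - 1) evaluated at each prior's mode. The data cannot distinguish between them, which is why each single fit is best read as one entry in a sensitivity analysis.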
Article information
Source
Statist. Sci. Volume 24, Number 2 (2009), 195-210.
Dates
First available in Project Euclid: 14 January 2010
Permanent link to this document
http://projecteuclid.org/euclid.ss/1263478381
Digital Object Identifier
doi:10.1214/09-STS291
Mathematical Reviews number (MathSciNet)
MR2655849
Zentralblatt MATH identifier
1328.62051
Keywords
Bias; biostatistics; causality; epidemiology; measurement error; misclassification; observational studies; odds ratio; relative risk; risk analysis; risk assessment; selection bias; validation
Citation
Greenland, Sander. Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources. Statist. Sci. 24 (2009), no. 2, 195-210. doi:10.1214/09-STS291. http://projecteuclid.org/euclid.ss/1263478381.

