Bayesian Analysis
- Bayesian Anal.
- Volume 8, Number 3 (2013), 623-646.
Bayesian Nonparametric ROC Regression Modeling
Vanda Inácio de Carvalho, Alejandro Jara, Timothy E. Hanson, and Miguel de Carvalho
Abstract
The receiver operating characteristic (ROC) curve is the most widely used measure for evaluating the discriminatory performance of a continuous biomarker. Incorporating covariates in the analysis can potentially enhance information gathered from the biomarker, as its discriminatory ability may depend on these. In this paper we propose a dependent Bayesian nonparametric model for conditional ROC estimation. Our model is based on dependent Dirichlet processes, where the covariate-dependent ROC curves are indirectly modeled using probability models for related probability distributions in the diseased and healthy groups. Our approach allows for the entire distribution in each group to change as a function of the covariates, provides exact posterior inference up to a Monte Carlo error, and can easily accommodate multiple continuous and categorical predictors. Simulation results suggest that, regarding the mean squared error, our approach performs better than its competitors for small sample sizes and nonlinear scenarios. The proposed model is applied to data concerning diagnosis of diabetes.
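To make concrete how a covariate-dependent ROC curve is induced by the biomarker distributions in the healthy and diseased groups, the following minimal sketch may help. It uses the standard definition ROC(p | x) = 1 - F_D(F_H^{-1}(1 - p | x) | x), assuming larger biomarker values indicate disease. The two conditional distributions below are hypothetical normal linear models chosen purely for illustration; they are not the dependent Dirichlet process model proposed in the paper, and the function names and coefficients are assumptions.

```python
from statistics import NormalDist


def conditional_roc(p, x, healthy, diseased):
    """Induced ROC(p | x) = 1 - F_D(F_H^{-1}(1 - p | x) | x).

    p        : false positive rate, strictly between 0 and 1
    x        : covariate value
    healthy  : function x -> distribution with .inv_cdf (healthy group)
    diseased : function x -> distribution with .cdf (diseased group)
    """
    # Threshold that yields false positive rate p in the healthy group.
    threshold = healthy(x).inv_cdf(1.0 - p)
    # True positive rate at that threshold in the diseased group.
    return 1.0 - diseased(x).cdf(threshold)


def conditional_auc(x, healthy, diseased, grid_size=1000):
    """Approximate the conditional AUC by a midpoint rule over p in (0, 1)."""
    step = 1.0 / grid_size
    total = sum(
        conditional_roc((i + 0.5) * step, x, healthy, diseased)
        for i in range(grid_size)
    )
    return total * step


# Hypothetical conditional distributions (illustrative assumption only):
# both group means shift linearly with the covariate x.
def healthy_dist(x):
    return NormalDist(mu=1.0 + 0.5 * x, sigma=1.0)


def diseased_dist(x):
    return NormalDist(mu=2.0 + 1.5 * x, sigma=1.0)


if __name__ == "__main__":
    for x in (0.0, 1.0):
        auc = conditional_auc(x, healthy_dist, diseased_dist)
        print(f"x = {x}: conditional AUC is approximately {auc:.3f}")
```

In this toy setting the separation between the group means grows with x, so the conditional AUC increases with the covariate. A nonparametric approach such as the one in the paper would replace the two fixed normal families with flexible, covariate-dependent distributions.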
Article information
Source
Bayesian Anal., Volume 8, Number 3 (2013), 623-646.
Dates
First available in Project Euclid: 9 September 2013
Permanent link to this document
https://projecteuclid.org/euclid.ba/1378729922
Digital Object Identifier
doi:10.1214/13-BA825
Mathematical Reviews number (MathSciNet)
MR3102228
Zentralblatt MATH identifier
1329.62154
Keywords
Conditional area under the curve; related probability distributions; dependent Dirichlet process; Markov chain Monte Carlo
Citation
Inácio de Carvalho, Vanda; Jara, Alejandro; Hanson, Timothy E.; de Carvalho, Miguel. Bayesian Nonparametric ROC Regression Modeling. Bayesian Anal. 8 (2013), no. 3, 623-646. doi:10.1214/13-BA825. https://projecteuclid.org/euclid.ba/1378729922
