The receiver operating characteristic (ROC) curve is the most widely used measure for evaluating the discriminatory performance of a continuous biomarker. Incorporating covariates in the analysis can enhance the information gathered from the biomarker, as its discriminatory ability may depend on them. In this paper we propose a dependent Bayesian nonparametric model for conditional ROC estimation. Our model is based on dependent Dirichlet processes, where the covariate-dependent ROC curves are modeled indirectly through probability models for the related distributions in the diseased and healthy groups. Our approach allows the entire distribution in each group to change as a function of the covariates, provides exact posterior inference up to Monte Carlo error, and can easily accommodate multiple continuous and categorical predictors. Simulation results suggest that, in terms of mean squared error, our approach performs better than its competitors for small sample sizes and nonlinear scenarios. The proposed model is applied to data concerning the diagnosis of diabetes.
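To make the notion of a covariate-dependent ROC curve concrete, the sketch below evaluates ROC(p | x) = 1 - F_D(F_H^{-1}(1 - p | x) | x), where F_D and F_H are the conditional distribution functions of the biomarker in the diseased and healthy groups. This is only an illustration, not the dependent Dirichlet process model proposed in the paper: it assumes that each group follows a normal distribution whose mean depends linearly on a single covariate, and that larger biomarker values indicate disease. The functions `healthy` and `diseased` and all their coefficients are hypothetical.

```python
from statistics import NormalDist


def healthy(x):
    """Hypothetical (mean, sd) of the biomarker in the healthy group at covariate x."""
    return 1.0 + 0.5 * x, 1.0


def diseased(x):
    """Hypothetical (mean, sd) of the biomarker in the diseased group at covariate x."""
    return 2.0 + 1.5 * x, 1.2


def conditional_roc(p, x, healthy_fn, diseased_fn):
    """ROC(p | x) = 1 - F_D(F_H^{-1}(1 - p | x) | x), for 0 < p < 1."""
    f_h = NormalDist(*healthy_fn(x))
    f_d = NormalDist(*diseased_fn(x))
    # Threshold that gives false positive rate p in the healthy group.
    threshold = f_h.inv_cdf(1.0 - p)
    # True positive rate at that threshold in the diseased group.
    return 1.0 - f_d.cdf(threshold)


def conditional_auc(x, healthy_fn, diseased_fn, grid=1000):
    """Area under ROC(. | x), approximated with the midpoint rule."""
    total = 0.0
    for i in range(grid):
        p = (i + 0.5) / grid
        total += conditional_roc(p, x, healthy_fn, diseased_fn)
    return total / grid


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0):
        auc = conditional_auc(x, healthy, diseased)
        print(f"x = {x:.1f}  AUC(x) = {auc:.3f}")
```

In this toy setting the gap between the group means widens as x grows, so the conditional AUC increases with x. A flexible model such as the one proposed in the paper would replace the two normal distributions with covariate-dependent distributions estimated from data.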
Bayesian Anal. 8(3): 623-646 (September 2013). DOI: 10.1214/13-BA825