The Annals of Applied Statistics

Bayesian hierarchical rule modeling for predicting medical conditions

Tyler H. McCormick, Cynthia Rudin, and David Madigan
Source: Ann. Appl. Stat. Volume 6, Number 2 (2012), 652-668.

Abstract

We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient’s possible future medical conditions given the patient’s current and past history of reported conditions. The core of our technique is a Bayesian hierarchical model for selecting predictive association rules (such as “condition 1 and condition 2 → condition 3”) from a large set of candidate rules. Because this method “borrows strength” using the conditions of many similar patients, it is able to provide predictions specialized to any given patient, even when little information about the patient’s history of conditions is available.
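The idea of "borrowing strength" across patients can be illustrated with a minimal sketch. This is not the HARM model itself, whose specification is in the full text and is not reproduced on this page. As a stand-in, the sketch uses a simple Beta-Binomial shrinkage estimate: each patient's confidence in a rule is pulled toward the population-wide confidence, more strongly when the patient has little history. The function names, the `strength` parameter, and the counts are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a Beta-Binomial shrinkage estimate of rule
# confidence, NOT the HARM model from the paper. All names and numbers
# below are hypothetical.

def pooled_prior(counts_by_patient, strength=2.0):
    """Return Beta(alpha, beta) parameters centred on the pooled confidence.

    counts_by_patient maps a patient id to (trials, successes), where
    trials is how often the rule's left-hand side was observed and
    successes is how often the right-hand side followed.
    """
    trials = sum(n for n, _ in counts_by_patient.values())
    successes = sum(s for _, s in counts_by_patient.values())
    rate = successes / trials if trials else 0.5
    return strength * rate, strength * (1.0 - rate)


def shrunk_confidence(trials, successes, alpha, beta):
    """Posterior mean of the rule's success probability for one patient."""
    return (successes + alpha) / (trials + alpha + beta)


# Hypothetical counts for one rule, e.g. "condition 1 and condition 2 -> condition 3".
counts = {
    "patient_a": (10, 8),  # long history
    "patient_b": (1, 0),   # a single observation
    "patient_c": (0, 0),   # no relevant history at all
    "patient_d": (9, 7),
}

alpha, beta = pooled_prior(counts)
estimates = {
    pid: shrunk_confidence(n, s, alpha, beta) for pid, (n, s) in counts.items()
}

for pid, value in sorted(estimates.items()):
    print(f"{pid}: {value:.3f}")
```

In this sketch, a patient with no relevant history receives the pooled estimate, while a patient with a long history is dominated by their own counts. A full hierarchical model would also estimate the prior parameters from the data rather than fixing `strength` by hand.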

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1339419611
Digital Object Identifier: doi:10.1214/11-AOAS522
Zentralblatt MATH identifier: 06062734
Mathematical Reviews number (MathSciNet): MR2976486

2013 © Institute of Mathematical Statistics
