While the study of a single network is well established, technological advances now allow for the collection of multiple networks with relative ease. Increasingly, anywhere from several to thousands of networks can be created from brain imaging, gene coexpression data, or microbiome measurements. These networks, in turn, are increasingly viewed as potentially powerful features for modeling. However, because networks are non-Euclidean in nature, it is not obvious how best to incorporate them into standard modeling tasks. In this paper, we propose a Bayesian modeling framework that provides a unified approach to binary classification, anomaly detection, and survival analysis with network inputs. We encode the networks in the kernel of a Gaussian process prior via their pairwise differences, and we discuss several choices of provably positive definite kernels that can be plugged into our models. Although our methods are widely applicable, we are motivated here, in particular, by microbiome research (where network analysis is emerging as the standard approach for capturing the interconnectedness of microbial taxa across both time and space) and its potential for reducing preterm delivery and improving personalization of prenatal care.
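To make the idea of encoding networks in a Gaussian process kernel through their pairwise differences concrete, the following is a minimal sketch and not the authors' implementation. It assumes that every network is given as an adjacency matrix on a common node set, that the pairwise difference is measured by the Frobenius distance, and that the kernel is a squared-exponential function of that distance. The function names, the `lengthscale` and `variance` parameters, and the random example graphs are hypothetical choices made only for illustration. Because the Frobenius distance is the Euclidean distance between vectorized adjacency matrices, this particular kernel is positive semidefinite.

```python
import numpy as np


def frobenius_distances(networks):
    """Pairwise Frobenius distances between equally sized adjacency matrices."""
    # Vectorize each adjacency matrix so the Frobenius distance becomes Euclidean.
    X = np.stack([np.asarray(A, dtype=float).ravel() for A in networks])
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.maximum(d2, 0.0))


def squared_exponential_kernel(D, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel evaluated on a matrix of pairwise distances."""
    return variance * np.exp(-0.5 * (D / lengthscale) ** 2)


def random_graph(n, p, rng):
    """Hypothetical example input: symmetric 0/1 adjacency matrix, no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T


rng = np.random.default_rng(0)
networks = [random_graph(10, 0.3, rng) for _ in range(5)]

D = frobenius_distances(networks)                   # 5 x 5 distance matrix
K = squared_exponential_kernel(D, lengthscale=2.0)  # 5 x 5 Gram matrix
min_eig = np.linalg.eigvalsh(K).min()               # nonnegative up to round-off
```

The Gram matrix `K` could then serve as the covariance of a Gaussian process prior over the latent function in a classification or survival model. Other network distances may be substituted, although positive definiteness would need to be verified for each choice.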
NJ and EK were supported in part by ARO award W911NF1810237.
NJ was also partially supported by NIH/NICHD grant 1DP2HD091799-01.
LL would like to acknowledge the generous support of NSF grants DMS CAREER 1654579 and DMS 2113642.
We are grateful to the Editor, the Associate Editor, and three reviewers for their valuable comments, which have led to substantial improvements in our paper. We would like to thank Evan Johnson for a very useful discussion on the microbiome data analysis.
Nathaniel Josephs, Lizhen Lin, Steven Rosenberg, Eric D. Kolaczyk. "Bayesian classification, anomaly detection, and survival analysis using network inputs with application to the microbiome." Ann. Appl. Stat. 17(1): 199-224, March 2023. https://doi.org/10.1214/22-AOAS1623