Phylogenetic comparative methods correct for shared evolutionary history among a set of nonindependent organisms by modeling sample traits as arising from a diffusion process along the branches of a possibly unknown history. To incorporate such uncertainty, we present a scalable Bayesian inference framework under a general Gaussian trait evolution model that exploits Hamiltonian Monte Carlo (HMC). HMC enables efficient sampling of the constrained model parameters and takes advantage of the tree structure for fast likelihood and gradient computations, yielding algorithmic complexity linear in the number of observations. This approach encompasses a wide family of stochastic processes, including the general Ornstein–Uhlenbeck (OU) process, with possible missing data and measurement errors. We implement inference tools for a biologically relevant subset of these models in the BEAST phylogenetic software package and develop model comparison through marginal likelihood estimation. We apply our approach to study morphological evolution in the superfamily Musteloidea (including weasels and allies) as well as the heritability of HIV virulence. This second problem furnishes a new measure of evolutionary heritability, whose utility we demonstrate through a targeted simulation study.
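To illustrate how tree structure can give a likelihood cost linear in the number of observations, here is a minimal sketch for the simplest special case only: a univariate Brownian motion with a known root value, computed by a single post-order pass over the tree. It is not the general OU or HMC machinery described above, nor the BEAST implementation; the names `Node`, `prune` and `bm_log_likelihood` are hypothetical, and all branches below the root are assumed to have strictly positive length.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    length: float                      # length of the branch above this node
    value: Optional[float] = None      # observed trait (tips only)
    children: List["Node"] = field(default_factory=list)


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def prune(node, sigma2):
    """Return (m, v, logf) such that the likelihood of the data below `node`,
    as a function of the value x at the top of its branch, equals
    exp(logf) * Normal(m; x, v). Each node is visited exactly once."""
    if not node.children:
        m, v, logf = node.value, 0.0, 0.0
    else:
        m, v, logf = prune(node.children[0], sigma2)
        for child in node.children[1:]:
            m2, v2, logf2 = prune(child, sigma2)
            # Product of two Gaussians in x: a constant times a new Gaussian.
            logf += logf2 + log_normal_pdf(m, m2, v + v2)
            m = (m * v2 + m2 * v) / (v + v2)
            v = v * v2 / (v + v2)
    # Moving up the branch adds the Brownian variance accumulated along it.
    return m, v + sigma2 * node.length, logf


def bm_log_likelihood(root, mu, sigma2):
    """Log-likelihood of the tip values given root value mu and rate sigma2."""
    m, v, logf = prune(root, sigma2)
    return logf + log_normal_pdf(m, mu, v)


# Toy tree ((A:1, B:1):1, C:2) with trait values at the three tips.
tree = Node(0.0, children=[
    Node(1.0, children=[Node(1.0, value=0.3), Node(1.0, value=-0.7)]),
    Node(2.0, value=1.1),
])
print(bm_log_likelihood(tree, mu=0.0, sigma2=1.0))
```

The recursion is kept for readability; on very large trees an explicit stack would avoid Python's recursion limit. The same quantity could be obtained by building the full tip covariance matrix and evaluating a multivariate normal density, at a cost that grows much faster with the number of tips.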
PB conducted this research as a postdoctoral fellow funded by the Fonds Wetenschappelijk Onderzoek (FWO, Belgium). The research leading to these results has received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 725422-ReservoirDOCS). The Artic Network receives funding from the Wellcome Trust through project 206298/Z/17/Z. PL acknowledges support by the Research Foundation—Flanders (“Fonds voor Wetenschappelijk Onderzoek—Vlaanderen,” G066215N, G0D5117N and G0B9317N).
GB acknowledges support from the Interne Fondsen KU Leuven / Internal Funds KU Leuven under grant agreement C14/18/094 and the Research Foundation—Flanders (“Fonds voor Wetenschappelijk Onderzoek—Vlaanderen,” G0E1420N).
LSTH was supported by startup funds from Dalhousie University, the Canada Research Chairs program, the NSERC Discovery Grant RGPIN-2018-05447 and the NSERC Discovery Launch Supplement DGECR-2018-00181.
MAS acknowledges support from National Institutes of Health grants U19 AI135995 and U01 AI151812.
We are grateful to the INRAE MIGALE bioinformatics facility (MIGALE, INRAE, 2020, DOI:10.15454/1.5572390655343293E12) for providing computing resources.
PB thanks Pierre Gloaguen for an enlightening discussion about Fisher’s identity.
The authors thank Jan Schnitzler for sharing the alignment data to reproduce the Musteloidea analyses as well as Jeffrey S. Morris and two anonymous reviewers for their useful comments that helped improve this manuscript.
"Efficient Bayesian inference of general Gaussian models on large phylogenetic trees." Ann. Appl. Stat. 15 (2) 971 - 997, June 2021. https://doi.org/10.1214/20-AOAS1419