Efficient Bayesian inference of general Gaussian models on large phylogenetic trees
Paul Bastide, Lam Si Tung Ho, Guy Baele, Philippe Lemey, Marc A. Suchard
Ann. Appl. Stat. 15(2): 971-997 (June 2021). DOI: 10.1214/20-AOAS1419

Abstract

Phylogenetic comparative methods correct for shared evolutionary history among a set of nonindependent organisms by modeling sample traits as arising from a diffusion process along the branches of a possibly unknown history. To incorporate such uncertainty, we present a scalable Bayesian inference framework under a general Gaussian trait evolution model that exploits Hamiltonian Monte Carlo (HMC). HMC enables efficient sampling of the constrained model parameters and takes advantage of the tree structure for fast likelihood and gradient computations, yielding algorithmic complexity linear in the number of observations. This approach encompasses a wide family of stochastic processes, including the general Ornstein–Uhlenbeck (OU) process, with possible missing data and measurement errors. We implement inference tools for a biologically relevant subset of these models in the BEAST phylogenetic software package and develop model comparison through marginal likelihood estimation. We apply our approach to study morphological evolution in the superfamily Musteloidea (including weasels and allies) as well as the heritability of HIV virulence. The second application motivates a new measure of evolutionary heritability, whose utility we demonstrate through a targeted simulation study.
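
To illustrate the idea of a likelihood computed in time linear in the number of observations, here is a minimal sketch of a post-order (tip-to-root) recursion on a tree. It is not the algorithm of the paper or the BEAST implementation. It assumes the simplest possible case: a single trait evolving under Brownian motion with variance rate `sigma2`, a known root value, strictly positive branch lengths below the root, and no missing data or measurement error. The nested-tuple tree encoding and the function names are hypothetical choices made for this example.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def _message(node, sigma2):
    """Return (mean, var, log_c) summarising the tips below `node`.

    Seen from the parent end of the node's branch, the likelihood of the
    tips below is exp(log_c) * Normal(parent_value; mean, var).
    A node is (branch_length, payload): the payload is a float trait
    value for a tip, or a list of child nodes for an internal node.
    """
    branch_length, payload = node
    if isinstance(payload, list):
        mean, var, log_c = None, None, 0.0
        for child in payload:
            m, v, lc = _message(child, sigma2)
            log_c += lc
            if mean is None:
                mean, var = m, v
            else:
                # Product of two Gaussian factors in the same variable:
                # a constant times a single Gaussian.
                log_c += normal_logpdf(mean, m, var + v)
                new_var = var * v / (var + v)
                mean = new_var * (mean / var + m / v)
                var = new_var
    else:
        # An observed tip pins the value exactly (zero variance).
        mean, var, log_c = float(payload), 0.0, 0.0
    # Brownian motion along the branch adds variance sigma2 * length.
    return mean, var + sigma2 * branch_length, log_c


def bm_log_likelihood(tree, root_value, sigma2):
    """Log-likelihood of the tip values given a fixed root value."""
    mean, var, log_c = _message(tree, sigma2)
    return log_c + normal_logpdf(root_value, mean, var)


# Example tree ((A:0.5, B:0.5):1.0, C:1.5) with a root branch of length 0.
tree = (0.0, [(1.0, [(0.5, 1.2), (0.5, 0.8)]), (1.5, -0.3)])
print(bm_log_likelihood(tree, root_value=0.1, sigma2=2.0))
```

Each node is visited once and does a constant amount of work per child, so the cost grows linearly with the number of tips, whereas evaluating the same multivariate normal density from its dense covariance matrix would generally cost cubic time.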

Funding Statement

PB conducted this research as a postdoctoral fellow funded by the Fonds Wetenschappelijk Onderzoek (FWO, Belgium). The research leading to these results has received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 725422-ReservoirDOCS). The Artic Network receives funding from the Wellcome Trust through project 206298/Z/17/Z. PL acknowledges support by the Research Foundation—Flanders (“Fonds voor Wetenschappelijk Onderzoek—Vlaanderen,” G066215N, G0D5117N and G0B9317N).
GB acknowledges support from the Interne Fondsen KU Leuven / Internal Funds KU Leuven under grant agreement C14/18/094 and the Research Foundation—Flanders (“Fonds voor Wetenschappelijk Onderzoek—Vlaanderen,” G0E1420N).
LSTH was supported by startup funds from Dalhousie University, the Canada Research Chairs program, the NSERC Discovery Grant RGPIN-2018-05447 and the NSERC Discovery Launch Supplement DGECR-2018-00181.
MAS acknowledges support from National Institutes of Health Grant U19 AI135995 and U01 AI151812.

Acknowledgments

We are grateful to the INRAE MIGALE bioinformatics facility (MIGALE, INRAE, 2020, DOI:10.15454/1.5572390655343293E12) for providing computing resources.

PB thanks Pierre Gloaguen for an enlightening discussion about Fisher’s identity.

The authors thank Jan Schnitzler for sharing the alignment data to reproduce the Musteloidea analyses as well as Jeffrey S. Morris and two anonymous reviewers for their useful comments that helped improve this manuscript.

Citation

Paul Bastide, Lam Si Tung Ho, Guy Baele, Philippe Lemey, Marc A. Suchard. "Efficient Bayesian inference of general Gaussian models on large phylogenetic trees." Ann. Appl. Stat. 15(2): 971-997, June 2021. https://doi.org/10.1214/20-AOAS1419

Information

Received: 1 March 2020; Revised: 1 September 2020; Published: June 2021
First available in Project Euclid: 12 July 2021

Digital Object Identifier: 10.1214/20-AOAS1419

Rights: Copyright © 2021 Institute of Mathematical Statistics

JOURNAL ARTICLE
27 PAGES
