Abstract
We consider species tree estimation under a standard stochastic model of gene tree evolution that incorporates incomplete lineage sorting (as modeled by a coalescent process) and gene duplication and loss (as modeled by a branching process). Through a probabilistic analysis of the model, we derive sample complexity bounds for widely used quartet-based inference methods that highlight the effect of the duplication and loss rates in both subcritical and supercritical regimes.
Funding Statement
SR was supported by NSF Grants DMS-1614242, CCF-1740707 (TRIPODS), DMS-1902892, DMS-1916378 and DMS-2023239 (TRIPODS Phase II), as well as a Simons Fellowship and a Vilas Associates Award.
BL was supported by NSF Grants DMS-1614242, CCF-1740707 (TRIPODS), DMS-1902892 and a Vilas Associates Award (to SR).
MH was supported by NSF Grants DMS-1902892 and DMS-2023239 (TRIPODS Phase II), as well as a Vilas Associates Award (to SR).
Citation
Max Hill, Brandon Legried, Sebastien Roch. "Species tree estimation under joint modeling of coalescence and duplication: Sample complexity of quartet methods." Ann. Appl. Probab. 32 (6) 4681–4705, December 2022. https://doi.org/10.1214/22-AAP1799
Information