The Annals of Applied Probability

The shape of the one-dimensional phylogenetic likelihood function

Vu Dinh and Frederick A. Matsen IV


Abstract

By fixing all parameters in a phylogenetic likelihood model except for one branch length, one obtains a one-dimensional likelihood function. In this work, we introduce a mathematical framework to characterize the shapes of such one-dimensional phylogenetic likelihood functions. This framework is based on analyses of algebraic structures on the space of all frequency patterns with respect to a polynomial representation of the likelihood functions. Using this framework, we provide conditions under which the one-dimensional phylogenetic likelihood functions are guaranteed to have at most one stationary point, and this point is the maximum likelihood branch length. These conditions are satisfied by common simple models including all binary models, the Jukes–Cantor model and the Felsenstein 1981 model.
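As a minimal illustration of a one-dimensional likelihood function, the sketch below takes the simplest possible case: two aligned sequences joined by a single branch under the Jukes–Cantor model with uniform base frequencies. The two-taxon setting, the function names, and the grid scan are assumptions made for this example and are not the framework developed in the paper, which treats one free branch on a general tree.

```python
import math

def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of branch length t (expected substitutions per site)
    for two aligned sequences under Jukes-Cantor (assumed two-taxon setting)."""
    x = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * x      # probability a site is unchanged along the branch
    p_diff = 0.25 - 0.25 * x      # probability of a change to one specific other base
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))

def jc69_mle(n_sites, n_diff):
    """Closed-form maximizer; valid only when 0 < n_diff / n_sites < 3/4."""
    p = n_diff / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def stationary_points_on_grid(n_sites, n_diff, t_min=1e-3, t_max=5.0, steps=5000):
    """Return grid points where the finite-difference slope changes sign.
    Hypothetical helper: a numerical check, not a proof of uniqueness."""
    step = (t_max - t_min) / steps
    ts = [t_min + i * step for i in range(steps + 1)]
    vals = [jc69_log_likelihood(t, n_sites, n_diff) for t in ts]
    found = []
    for i in range(1, steps):
        left = vals[i] - vals[i - 1]
        right = vals[i + 1] - vals[i]
        if left * right < 0:
            found.append(ts[i])
    return found
```

With 100 sites and 20 differences, for instance, the grid scan reports a single turning point, and it lies next to the value returned by `jc69_mle(100, 20)`.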

We then prove that for the simplest model that does not satisfy our conditions, namely, the Kimura 2-parameter model, the one-dimensional likelihood functions may have multiple stationary points. As a proof of concept, we construct a nondegenerate example in which the phylogenetic likelihood function has two local maxima and a local minimum. To construct such examples, we derive a general method of constructing a tree and sequence data with a specified frequency pattern at the root. We then extend the result to prove that the space of all rescaled and translated one-dimensional phylogenetic likelihood functions under the Kimura 2-parameter model is dense in the space of all nonnegative continuous functions on $[0,\infty)$ with finite limits. These results indicate that one-dimensional likelihood functions under advanced evolutionary models can be more complex than is typically assumed by phylogenetic inference algorithms; however, these complexities can be effectively captured by the Kimura 2-parameter model.
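For orientation, the sketch below computes the standard Kimura 2-parameter transition probabilities along a branch. The parameter names `alpha` (transition rate) and `beta` (rate of each of the two transversions), the base ordering, and the function names are assumptions of this example. It does not reproduce the paper's multimodal construction; it only shows that the model involves two distinct exponential decay terms whenever `alpha != beta`, whereas Jukes–Cantor involves one.

```python
import math

BASES = "AGCT"            # assumed ordering; A<->G and C<->T are transitions
PURINES = {"A", "G"}

def k2p_probabilities(t, alpha, beta):
    """Return (p_same, p_transition, p_transversion) after time t under K2P.
    p_transversion is the probability of one specific transversion."""
    e1 = math.exp(-4.0 * beta * t)              # first decay term
    e2 = math.exp(-2.0 * (alpha + beta) * t)    # second decay term
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_transition = 0.25 + 0.25 * e1 - 0.5 * e2
    p_transversion = 0.25 - 0.25 * e1
    return p_same, p_transition, p_transversion

def k2p_matrix(t, alpha, beta):
    """Build the 4x4 transition matrix as a dict keyed by (from_base, to_base)."""
    p_same, p_ts, p_tv = k2p_probabilities(t, alpha, beta)
    matrix = {}
    for a in BASES:
        for b in BASES:
            if a == b:
                p = p_same
            elif (a in PURINES) == (b in PURINES):
                p = p_ts      # purine<->purine or pyrimidine<->pyrimidine
            else:
                p = p_tv      # purine<->pyrimidine
            matrix[(a, b)] = p
    return matrix
```

Setting `alpha == beta` collapses the two decay terms into one and recovers the Jukes–Cantor probabilities, which is a convenient sanity check on the formulas.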

Article information

Source
Ann. Appl. Probab. Volume 27, Number 3 (2017), 1646–1677.

Dates
Received: July 2015
Revised: July 2016
First available in Project Euclid: 19 July 2017

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1500451237

Digital Object Identifier
doi:10.1214/16-AAP1240

Subjects
Primary: 05C05: Trees; 92B10: Taxonomy, cladistics, statistics
Secondary: 05C25: Graphs and abstract algebra (groups, rings, fields, etc.) [See also 20F65]; 92D15: Problems related to evolution

Keywords
Evolutionary model; molecular evolution; phylogenetics; likelihood model; characteristic polynomial; algebraic representation; multimodality; universal model

Citation

Dinh, Vu; Matsen IV, Frederick A. The shape of the one-dimensional phylogenetic likelihood function. Ann. Appl. Probab. 27 (2017), no. 3, 1646–1677. doi:10.1214/16-AAP1240. https://projecteuclid.org/euclid.aoap/1500451237


