The Annals of Applied Probability

Learning nonsingular phylogenies and hidden Markov models

Elchanan Mossel and Sébastien Roch



In this paper we study the problem of learning phylogenies and hidden Markov models. We call a Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise, a well-known learning problem conjectured to be computationally hard. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
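
As an illustration of the nonsingularity condition in the abstract, the following minimal Python sketch checks whether every transition matrix in a collection has a determinant bounded away from 0 and 1. The single threshold `beta`, the use of the absolute value of the determinant, and all function names are assumptions made for this example; the paper's exact quantitative condition may differ.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def determinant(m: Matrix) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0  # no usable pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_row_stochastic(m: Matrix, tol: float = 1e-9) -> bool:
    """Rows are nonnegative and sum to 1 (up to a small tolerance)."""
    return all(
        all(x >= -tol for x in row) and abs(sum(row) - 1.0) <= tol
        for row in m
    )


def is_nonsingular_model(matrices: Sequence[Matrix], beta: float) -> bool:
    """Assumed condition: beta <= |det P| <= 1 - beta for every matrix P."""
    return all(
        is_row_stochastic(p) and beta <= abs(determinant(p)) <= 1.0 - beta
        for p in matrices
    )


if __name__ == "__main__":
    mixing = [[0.9, 0.1], [0.1, 0.9]]    # determinant 0.8
    uniform = [[0.5, 0.5], [0.5, 0.5]]   # determinant 0: forgets the state
    identity = [[1.0, 0.0], [0.0, 1.0]]  # determinant 1: no mutation at all
    print(is_nonsingular_model([mixing], beta=0.1))
    print(is_nonsingular_model([mixing, uniform], beta=0.1))
    print(is_nonsingular_model([mixing, identity], beta=0.1))
```

Under this assumed threshold, the uniform matrix fails because its determinant is 0, and the identity matrix fails because its determinant is exactly 1.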

Article information

Ann. Appl. Probab., Volume 16, Number 2 (2006), 583-614.

First available in Project Euclid: 29 June 2006

Primary: 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)
60J20: Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) [See also 90B30, 91D10, 91D35, 91E40]
68T05: Learning and adaptive systems [See also 68Q32, 91E40]
92B10: Taxonomy, cladistics, statistics

Keywords: Hidden Markov models; evolutionary trees; phylogenetic reconstruction; PAC learning


Mossel, Elchanan; Roch, Sébastien. Learning nonsingular phylogenies and hidden Markov models. Ann. Appl. Probab. 16 (2006), no. 2, 583--614. doi:10.1214/105051606000000024.
