Annals of Statistics

Identifiability of parameters in latent structure models with many observed variables

Elizabeth S. Allman, Catherine Matias, and John A. Rhodes


While hidden class models of various types arise in many statistical applications, it is often difficult to establish the identifiability of their parameters. Focusing on models in which there is some structure of independence of some of the observed variables conditioned on hidden ones, we demonstrate a general approach for establishing identifiability utilizing algebraic arguments. A theorem of J. Kruskal for a simple latent-class model with finite state space lies at the core of our results, though we apply it to a diverse set of models. These include mixtures of both finite and nonparametric product distributions, hidden Markov models and random graph mixture models, and lead to a number of new results and improvements to old ones.

In the parametric setting, this approach indicates that for such models, the classical definition of identifiability is typically too strong. Instead generic identifiability holds, which implies that the set of nonidentifiable parameters has measure zero, so that parameter inference is still meaningful. In particular, this sheds light on the properties of finite mixtures of Bernoulli products, which have been used for decades despite being known to have nonidentifiable parameters. In the nonparametric setting, we again obtain identifiability only when certain restrictions are placed on the distributions that are mixed, but we explicitly describe the conditions.
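As an illustration of the Kruskal-type condition the abstract refers to, the following is a minimal sketch and not the authors' code. It assumes the condition in its commonly stated form for a latent-class model with r hidden classes and three observed variables: if M1, M2, M3 are the r-row matrices of class-conditional probabilities and all class proportions are positive, then the parameters are identifiable up to relabeling of the classes whenever the Kruskal ranks satisfy I1 + I2 + I3 ≥ 2r + 2. The function names and the example matrices are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a pivot in this column at or below row rk.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def kruskal_row_rank(rows):
    """Largest k such that every set of k rows is linearly independent."""
    k = 0
    for size in range(1, len(rows) + 1):
        if all(rank(list(s)) == size for s in combinations(rows, size)):
            k = size
        else:
            break
    return k


def kruskal_condition(m1, m2, m3):
    """Sufficient (not necessary) condition: I1 + I2 + I3 >= 2r + 2."""
    r = len(m1)  # number of hidden classes = number of rows (assumed convention)
    total = kruskal_row_rank(m1) + kruskal_row_rank(m2) + kruskal_row_rank(m3)
    return total >= 2 * r + 2


# Hypothetical example: r = 2 classes, three binary observed variables.
# Row i of each matrix is the distribution of that variable given class i.
m1 = [["0.9", "0.1"], ["0.2", "0.8"]]
m2 = [["0.7", "0.3"], ["0.4", "0.6"]]
m3 = [["0.6", "0.4"], ["0.1", "0.9"]]
# Same conditional distribution in both classes: Kruskal rank drops to 1.
m3_degenerate = [["0.5", "0.5"], ["0.5", "0.5"]]

print(kruskal_condition(m1, m2, m3))
print(kruskal_condition(m1, m2, m3_degenerate))
```

In this sketch the first call satisfies the inequality (2 + 2 + 2 ≥ 6), while the second does not (2 + 2 + 1 < 6). Failing the inequality does not by itself show nonidentifiability, since the condition is only sufficient. Exact rational arithmetic is used so that the rank tests are not affected by floating-point rounding.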

Article information

Ann. Statist., Volume 37, Number 6A (2009), 3099-3132.

First available in Project Euclid: 17 August 2009

Primary: 62E10: Characterization and structure theory
Secondary: 62F99: None of the above, but in this section; 62G99: None of the above, but in this section

Keywords: Identifiability; finite mixture; latent structure; conditional independence; multivariate Bernoulli mixture; nonparametric mixture; contingency table; algebraic statistics


Allman, Elizabeth S.; Matias, Catherine; Rhodes, John A. Identifiability of parameters in latent structure models with many observed variables. Ann. Statist. 37 (2009), no. 6A, 3099–3132. doi:10.1214/09-AOS689.


  • [1] Abo, H., Ottaviani, G. and Peterson, C. (2009). Induction for secant varieties of Segre varieties. Trans. Amer. Math. Soc. 361 767–792. Available at arXiv:math.AG/0607191.
  • [2] Allman, E. S. and Rhodes, J. A. (2006). The identifiability of tree topology for phylogenetic models, including covarion and mixture models. J. Comput. Biol. 13 1101–1113.
  • [3] Allman, E. S. and Rhodes, J. A. (2009). The identifiability of covarion models in phylogenetics. IEEE/ACM Trans. Comput. Biol. Bioinformatics 6 76–88.
  • [4] Benaglia, T., Chauveau, D. and Hunter, D. R. (2009). An EM-like algorithm for semi and nonparametric estimation in multivariate mixtures. J. Comput. Graph. Statist. To appear.
  • [5] Cappé, O., Moulines, E. and Rydén, T. (2005). Inference in Hidden Markov Models. Springer, New York.
  • [6] Carreira-Perpiñán, M. and Renals, S. (2000). Practical identifiability of finite mixtures of multivariate Bernoulli distributions. Neural Comput. 12 141–152.
  • [7] Catalisano, M. V., Geramita, A. V. and Gimigliano, A. (2002). Ranks of tensors, secant varieties of Segre varieties and fat points. Linear Algebra Appl. 355 263–285.
  • [8] Catalisano, M. V., Geramita, A. V. and Gimigliano, A. (2005). Higher secant varieties of the Segre varieties ℙ¹×⋯×ℙ¹. J. Pure Appl. Algebra 201 367–380.
  • [9] Chambaz, A. and Matias, C. (2009). Number of hidden states and memory: A joint order estimation problem for Markov chains with Markov regime. ESAIM Probab. Stat. 13 38–50.
  • [10] Cox, D., Little, J. and O’Shea, D. (1997). Ideals, Varieties, and Algorithms, 2nd ed. Springer, New York.
  • [11] Cruz-Medina, I. R., Hettmansperger, T. P. and Thomas, H. (2004). Semiparametric mixture models and repeated measures: The multinomial cut point model. J. Roy. Statist. Soc. Ser. C 53 463–474.
  • [12] Daudin, J.-J., Picard, F. and Robin, S. (2008). A mixture model for random graphs. Statist. Comput. 18 173–183.
  • [13] Drton, M. (2006). Algebraic techniques for Gaussian models. In Prague Stochastics (M. Huskova and M. Janzura, eds.) 81–90. Matfyz Press, Prague.
  • [14] Elmore, R., Hall, P. and Neeman, A. (2005). An application of classical invariant theory to identifiability in nonparametric mixtures. Ann. Inst. Fourier (Grenoble) 55 1–28.
  • [15] Elmore, R. T., Hettmansperger, T. P. and Thomas, H. (2004). Estimating component cumulative distribution functions in finite mixture models. Comm. Statist. Theory Methods 33 2075–2086.
  • [16] Ephraim, Y. and Merhav, N. (2002). Hidden Markov processes. IEEE Trans. Inform. Theory 48 1518–1569.
  • [17] Finesso, L. (1991). Consistent estimation of the order for Markov and hidden Markov chains. Ph.D. thesis, Univ. Maryland.
  • [18] Frank, O. and Harary, F. (1982). Cluster inference by using transitivity indices in empirical graphs. J. Amer. Statist. Assoc. 77 835–840.
  • [19] Garcia, L. D., Stillman, M. and Sturmfels, B. (2005). Algebraic geometry of Bayesian networks. J. Symbolic Comput. 39 331–355.
  • [20] Gilbert, E. J. (1959). On the identifiability problem for functions of finite Markov chains. Ann. Math. Statist. 30 688–697.
  • [21] Glick, N. (1973). Sample-based multinomial classification. Biometrics 29 241–256.
  • [22] Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61 215–231.
  • [23] Gyllenberg, M., Koski, T., Reilink, E. and Verlaan, M. (1994). Nonuniqueness in probabilistic numerical identification of bacteria. J. Appl. Probab. 31 542–548.
  • [24] Hall, P., Neeman, A., Pakyari, R. and Elmore, R. (2005). Nonparametric inference in multivariate mixtures. Biometrika 92 667–678.
  • [25] Hall, P. and Zhou, X.-H. (2003). Nonparametric estimation of component distributions in a multivariate mixture. Ann. Statist. 31 201–224.
  • [26] Hettmansperger, T. P. and Thomas, H. (2000). Almost nonparametric inference for repeated measures in mixture models. J. R. Stat. Soc. Ser. B Stat. Methodol. 62 811–825.
  • [27] Koopmans, T. C. and Reiersøl, O. (1950). The identification of structural characteristics. Ann. Math. Statist. 21 165–181.
  • [28] Koopmans, T. C., ed. (1950). Statistical Inference in Dynamic Economic Models. Wiley, New York.
  • [29] Kruskal, J. B. (1976). More factors than subjects, tests and treatments: An indeterminacy theorem for canonical decomposition and individual differences scaling. Psychometrika 41 281–293.
  • [30] Kruskal, J. B. (1977). Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra Appl. 18 95–138.
  • [31] Lauritzen, S. L. (1996). Graphical Models. Oxford Statistical Science Series 17. Clarendon Press, New York.
  • [32] Leroux, B. G. (1992). Maximum-likelihood estimation for hidden Markov models. Stochastic Process. Appl. 40 127–143.
  • [33] Lindsay, B. G. (1995). Mixture Models: Theory, Geometry and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics 5. IMS, Hayward, CA.
  • [34] McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley, New York.
  • [35] Nigam, K., McCallum, A. K., Thrun, S. and Mitchell, T. M. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning 39 103–134.
  • [36] Nowicki, K. and Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc. 96 1077–1087.
  • [37] Pachter, L. and Sturmfels, B. (2005). Algebraic Statistics for Computational Biology. Cambridge Univ. Press, Cambridge.
  • [38] Paz, A. (1971). Introduction to Probabilistic Automata. Academic Press, New York.
  • [39] Petrie, T. (1969). Probabilistic functions of finite state Markov chains. Ann. Math. Statist. 40 97–115.
  • [40] Rothenberg, T. J. (1971). Identification in parametric models. Econometrica 39 577–591.
  • [41] Tallberg, C. (2005). A Bayesian approach to modeling stochastic blockstructures with covariates. J. Math. Sociol. 29 1–23.
  • [42] Teicher, H. (1967). Identifiability of mixtures of product measures. Ann. Math. Statist. 38 1300–1302.
  • [43] Yakowitz, S. J. and Spragins, J. D. (1968). On the identifiability of finite mixtures. Ann. Math. Statist. 39 209–214.
  • [44] Zanghi, H., Ambroise, C. and Miele, V. (2008). Fast online graph clustering via Erdős–Rényi mixture. Pattern Recognition 41 3592–3599.