The Annals of Applied Statistics

A mixture of experts model for rank data with applications in election studies

Isobel Claire Gormley and Thomas Brendan Murphy

Full-text: Open access


A voting bloc is defined to be a group of voters who have similar voting preferences. The cleavage of the Irish electorate into voting blocs is of interest. Irish elections employ a “single transferable vote” electoral system; under this system voters rank some or all of the electoral candidates in order of preference. These rank votes provide a rich source of preference information from which inferences about the composition of the electorate may be drawn. Additionally, the influence of social factors or covariates on the electorate composition is of interest.

A mixture of experts model is a mixture model in which the model parameters are functions of covariates. A mixture of experts model for rank data is developed to provide a model-based method to cluster Irish voters into voting blocs, to examine the influence of social factors on this clustering and to examine the characteristic preferences of the voting blocs. The Benter model for rank data is employed as the family of component densities within the mixture of experts model; generalized linear model theory is employed to model the influence of covariates on the mixing proportions. Model fitting is achieved via a hybrid of the EM and MM algorithms. An example of the methodology is illustrated by examining an Irish presidential election. The existence of voting blocs in the electorate is established and it is determined that age and government satisfaction levels are important factors in influencing voting in this election.
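The structure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. It assumes a multinomial logistic link for the mixing proportions (the abstract says only that generalized linear model theory is used), and it assumes the usual Benter form, in which the support parameter of the candidate chosen at each preference level is raised to a level-specific dampening parameter and normalized over the candidates not yet ranked. All function and parameter names (`mixing_proportions`, `benter_probability`, `gammas`, `supports`, `dampening`) are hypothetical.

```python
import math


def mixing_proportions(gammas, covariates):
    """Bloc membership probabilities for one voter (assumed multinomial logit).

    gammas: one coefficient vector per voting bloc; an all-zero vector
            makes that bloc the baseline category.
    covariates: the voter's covariate vector, with a leading 1.0 for the intercept.
    """
    scores = [sum(g * w for g, w in zip(gamma, covariates)) for gamma in gammas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def benter_probability(ranking, support, dampening):
    """Probability of a full or partial ranking under a Benter-type model.

    ranking: candidate indices in order of preference (most preferred first).
    support: one positive support parameter per candidate.
    dampening: dampening[t] is applied at preference level t; setting every
               value to 1 gives the Plackett-Luce model.
    """
    remaining = set(range(len(support)))
    prob = 1.0
    for level, chosen in enumerate(ranking):
        alpha = dampening[level]
        denom = sum(support[j] ** alpha for j in remaining)
        prob *= support[chosen] ** alpha / denom
        remaining.remove(chosen)  # a candidate can be ranked only once
    return prob


def mixture_probability(ranking, covariates, gammas, supports, dampening):
    """Mixture of experts density: covariate-dependent weights times bloc densities."""
    pis = mixing_proportions(gammas, covariates)
    return sum(
        pi * benter_probability(ranking, support, dampening)
        for pi, support in zip(pis, supports)
    )


def responsibilities(ranking, covariates, gammas, supports, dampening):
    """E-step quantity: posterior probability that the voter belongs to each bloc."""
    pis = mixing_proportions(gammas, covariates)
    joint = [
        pi * benter_probability(ranking, support, dampening)
        for pi, support in zip(pis, supports)
    ]
    total = sum(joint)
    return [j / total for j in joint]
```

For example, with two blocs, three candidates and a covariate vector of an intercept plus age, `mixture_probability([0, 2], [1.0, 40.0], gammas, supports, dampening)` would give the probability that a 40-year-old voter ranks candidate 0 first and candidate 2 second, leaving the third candidate unranked. The M step, which the abstract says is handled by a hybrid of the EM and MM algorithms, is not sketched here.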

Article information

Ann. Appl. Stat., Volume 2, Number 4 (2008), 1452-1477.

First available in Project Euclid: 8 January 2009

Keywords: rank data; mixture models; generalized linear models; EM algorithm; MM algorithm


Gormley, Isobel Claire; Murphy, Thomas Brendan. A mixture of experts model for rank data with applications in election studies. Ann. Appl. Stat. 2 (2008), no. 4, 1452–1477. doi:10.1214/08-AOAS178.



