Bayesian Analysis

A grade of membership model for rank data

Isobel Claire Gormley and Thomas Brendan Murphy

Full-text: Open access

Abstract

A grade of membership (GoM) model is an individual-level mixture model which allows individuals to have partial membership of the groups that characterize a population. A GoM model for rank data is developed to model the particular case when the response data are ranked in nature. A Metropolis-within-Gibbs sampler provides the framework for model fitting, but the intricate nature of the rank data models makes the selection of suitable proposal distributions difficult. 'Surrogate' proposal distributions are constructed using ideas from optimization transfer algorithms. Model fitting issues such as label switching and model selection are also addressed.

The GoM model for rank data is illustrated through an analysis of Irish election data where voters rank some or all of the candidates in order of preference. Interest lies in highlighting distinct groups of voters with similar preferences (i.e. 'voting blocs') within the electorate, taking into account the rank nature of the response data, and in examining individuals' voting bloc memberships. The GoM model for rank data is fitted to data from an opinion poll conducted during the Irish presidential election campaign in 1997.
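
As a rough illustration of the kind of rank data model named in the keywords, the following minimal sketch computes the probability of a full or partial ranking under a plain Plackett-Luce model, where each successive choice is made in proportion to the support of the candidates not yet ranked. It is not the authors' GoM model or their sampler; the function name plackett_luce_prob and the support values are hypothetical and chosen only for the example.

```python
from itertools import permutations


def plackett_luce_prob(ranking, support):
    """Probability of a (possibly partial) ranking under a Plackett-Luce model.

    ranking: candidate indices in order of preference, most preferred first.
    support: positive support parameters, one per candidate (assumed values).
    """
    remaining = sum(support)  # total support of candidates not yet chosen
    prob = 1.0
    for candidate in ranking:
        # Choose the next preference in proportion to its share of what is left.
        prob *= support[candidate] / remaining
        remaining -= support[candidate]
    return prob


# Hypothetical support parameters for three candidates.
support = [0.5, 0.3, 0.2]

full = plackett_luce_prob([0, 1, 2], support)   # voter ranks all candidates
partial = plackett_luce_prob([0], support)      # voter ranks only a first choice
total = sum(plackett_luce_prob(list(r), support)
            for r in permutations(range(3)))    # all full rankings together
```

A partial ranking simply stops the product early, which is how a voter who ranks only some of the candidates can be handled in this sketch.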

Article information

Source
Bayesian Anal., Volume 4, Number 2 (2009), 265-295.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340370278

Digital Object Identifier
doi:10.1214/09-BA410

Mathematical Reviews number (MathSciNet)
MR2507364

Zentralblatt MATH identifier
1330.62024

Keywords
Grade of membership models; Plackett-Luce model; surrogate proposal distributions; rank data; voting blocs

Citation

Gormley, Isobel Claire; Murphy, Thomas Brendan. A grade of membership model for rank data. Bayesian Anal. 4 (2009), no. 2, 265–295. doi:10.1214/09-BA410. https://projecteuclid.org/euclid.ba/1340370278

