The Annals of Applied Statistics

Bayesian nonparametric Plackett–Luce models for the analysis of preferences for college degree programmes

François Caron, Yee Whye Teh, and Thomas Brendan Murphy

Full-text: Open access

Abstract

In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett–Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior distribution given data, and derive a simple and effective Gibbs sampler for posterior simulation. We then develop a Dirichlet process mixture extension of our model and apply it to investigate the clustering of preferences for college degree programmes amongst Irish secondary school graduates. We establish the existence of clusters of applicants who have similar preferences for degree programmes, and we determine that subject matter and geographical location of the third-level institution characterise these clusters.
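
For readers unfamiliar with the Plackett–Luce choice model named above, the following is a minimal sketch of its standard finite-item form for a partial (top-m) ranking: each ranked item is chosen in turn with probability proportional to its weight among the items not yet chosen. The function name `plackett_luce_log_prob`, the toy programme labels, and the weights are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the nonparametric extension or the Gibbs sampler.

```python
import math


def plackett_luce_log_prob(weights, ranking):
    """Log-probability of a top-m partial ranking under a finite Plackett-Luce model.

    weights: dict mapping each item to a positive weight (hypothetical values).
    ranking: list of distinct items, most preferred first; may omit items.
    """
    remaining = sum(weights.values())  # total weight of items not yet chosen
    log_p = 0.0
    for item in ranking:
        w = weights[item]
        # Choose `item` among the remaining items with probability w / remaining.
        log_p += math.log(w) - math.log(remaining)
        remaining -= w
    return log_p


# Toy example with three hypothetical degree programmes.
toy_weights = {"arts": 2.0, "science": 1.0, "law": 1.0}
top_two = ["arts", "science"]
print(math.exp(plackett_luce_log_prob(toy_weights, top_two)))
```

In this toy case the top-two ranking has probability (2/4) × (1/2), since "arts" is picked from all three items and "science" from the remaining two.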

Article information

Source
Ann. Appl. Stat. Volume 8, Number 2 (2014), 1145–1181.

Dates
First available in Project Euclid: 1 July 2014

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1404229529

Digital Object Identifier
doi:10.1214/14-AOAS717

Mathematical Reviews number (MathSciNet)
MR3262549

Zentralblatt MATH identifier
06333791

Keywords
Ranking data; permutations; gamma process; Dirichlet process; mixture models

Citation

Caron, François; Teh, Yee Whye; Murphy, Thomas Brendan. Bayesian nonparametric Plackett–Luce models for the analysis of preferences for college degree programmes. Ann. Appl. Stat. 8 (2014), no. 2, 1145–1181. doi:10.1214/14-AOAS717. https://projecteuclid.org/euclid.aoas/1404229529


