Representing distributions over permutations can be a daunting task because the number of permutations of n objects scales factorially in n. One recent approach to reducing storage complexity exploits probabilistic independence, but as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling rankings. We identify a novel class of independence structures, called riffled independence, encompassing a more expressive family of distributions while retaining many of the properties necessary for performing efficient inference and reducing sample complexity. In riffled independence, one draws two permutations independently, then combines them into a single permutation using the riffle shuffle common in card games. In the context of ranking, riffled independence corresponds to ranking disjoint sets of objects independently, then interleaving those rankings. In this paper, we provide a formal introduction to riffled independence and propose an automated method for discovering sets of items which are riffle independent from a training set of rankings. We show that our clustering-like algorithms can be used to discover meaningful latent coalitions from real preference ranking datasets and to learn the structure of hierarchically decomposable models based on riffled independence.
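The generative process described in the abstract — rank two disjoint item sets independently, then interleave — can be illustrated with a short sketch. This is not the paper's implementation: the function name is hypothetical, and for simplicity the interleaving is drawn uniformly at random, whereas the model permits general interleaving distributions.

```python
import random

def riffle_independent_sample(rank_a, rank_b):
    """Combine two independently drawn rankings of disjoint item sets
    into one full ranking, preserving the relative order within each set.

    Illustrative sketch: the interleaving is uniform here, though riffled
    independence allows an arbitrary distribution over interleavings.
    """
    n = len(rank_a) + len(rank_b)
    # Pick which positions of the combined ranking are occupied by set A.
    positions_a = set(random.sample(range(n), len(rank_a)))
    it_a, it_b = iter(rank_a), iter(rank_b)
    # Fill each position from the appropriate set, in order.
    return [next(it_a) if i in positions_a else next(it_b) for i in range(n)]

# Example: rank fruits and vegetables independently, then interleave.
fruits = ["apple", "banana", "cherry"]
veggies = ["kale", "leek"]
random.shuffle(fruits)   # independent ranking of the first item set
random.shuffle(veggies)  # independent ranking of the second item set
full_ranking = riffle_independent_sample(fruits, veggies)
```

Note that the relative order within each set survives the interleaving, which is exactly the structural constraint that distinguishes riffled independence from full independence over positions.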
Bach, F. R. and Jordan, M. I. (2001). Thin Junction Trees. In Advances in Neural Information Processing Systems 14 569–576. MIT Press.
Bayer, D. and Diaconis, P. (1992). Trailing the Dovetail Shuffle to its Lair. The Annals of Applied Probability 2 294–313.
Chechetka, A. and Guestrin, C. (2008). Efficient Principled Learning of Thin Junction Trees. In Advances in Neural Information Processing Systems 20 (J. Platt, D. Koller, Y. Singer and S. Roweis, eds.) 273–280. MIT Press, Cambridge, MA.
Chen, H., Branavan, S. R. K., Barzilay, R. and Karger, D. R. (2009). Global models of document structure using latent permutations. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. NAACL ’09 371–379. Association for Computational Linguistics, Stroudsburg, PA, USA.
Clausen, M. and Baum, U. (1993). Fast Fourier Transforms for Symmetric Groups: Theory and Implementation. Mathematics of Computation 61 833–847.
Diaconis, P. (1988). Group Representations in Probability and Statistics. Institute of Mathematical Statistics Lecture Notes–Monograph Series 11. Hayward, CA.
Diaconis, P. (1989). A Generalization of Spectral Analysis with Application to Ranked Data. The Annals of Statistics 17 949–979.
Farias, V., Jagabathula, S. and Shah, D. (2009). A Data-Driven Approach to Modeling Choice. In Advances in Neural Information Processing Systems 22 (Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams and A. Culotta, eds.) 504–512.
Fligner, M. and Verducci, J. (1986). Distance Based Ranking Models. Journal of the Royal Statistical Society, Series B 48 359–369.
Fligner, M. and Verducci, J. (1988). Multistage Ranking Models. Journal of the American Statistical Association 83 892–901.
Gallo, G., Longo, G., Pallottino, S. and Nguyen, S. (1993). Directed hypergraphs and applications. Discrete Applied Mathematics 42 177–201.
Gormley, I. C. and Murphy, T. B. (2007). A latent space model for rank data. In Statistical Network Analysis: Models, Issues, and New Directions (ICML 2006 Workshop). Lecture Notes in Computer Science 4503 90–102. Springer.
Guiver, J. and Snelson, E. (2009). Bayesian inference for Plackett–Luce ranking models. In Proceedings of the 26th Annual International Conference on Machine Learning. ICML ’09 377–384. ACM, New York, NY, USA.
Höffgen, K.-U. (1993). Learning and robust learning of product distributions. In Proceedings of the Sixth Annual Conference on Computational Learning Theory. COLT ’93 77–83. ACM, New York, NY, USA.
Huang, J. and Guestrin, C. (2009a). Riffled Independence for Ranked Data. In Advances in Neural Information Processing Systems 22 (Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams and A. Culotta, eds.) 799–807.
Huang, J., Guestrin, C. and Guibas, L. (2009b). Fourier Theoretic Probabilistic Inference over Permutations. Journal of Machine Learning Research 10 997–1070.
Huang, J. and Guestrin, C. (2010). Learning Hierarchical Riffle Independent Groupings from Rankings. In Proceedings of the 27th Annual International Conference on Machine Learning. ICML ’10 455–462.
Huang, J., Guestrin, C., Jiang, X. and Guibas, L. J. (2009). Exploiting Probabilistic Independence for Permutations. Journal of Machine Learning Research – Proceedings Track 5 248–255.
Huang, J. and Guestrin, C. (2012). Uncovering the Riffled Independence Structure of Ranked Data: Supplementary Material. DOI: 10.1214/12-EJS670SUPP.
Jagabathula, S. and Shah, D. (2009). Inferring rankings under constrained sensing. In Advances in Neural Information Processing Systems 21 (D. Koller, D. Schuurmans, Y. Bengio and L. Bottou, eds.) 753–760.
Kamishima, T. (2003). Nantonac collaborative filtering: recommendation based on order responses. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’03 583–588. ACM, New York, NY, USA.
Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
Kondor, R. (2008). Group Theoretical Methods in Machine Learning. PhD thesis, Columbia University.
Mallows, C. (1957). Non-null ranking models. Biometrika 44 114–130.
Marden, J. I. (1995). Analyzing and Modeling Rank Data. Chapman & Hall.
Maslen, D. (1998). The efficient computation of Fourier transforms on the symmetric group. Mathematics of Computation 67 1121–1147.
Meila, M., Phadnis, K., Patterson, A. and Bilmes, J. (2007). Consensus ranking under the exponential model. Technical Report TR-515, University of Washington.
Motwani, R. and Raghavan, P. (1996). Randomized algorithms. ACM Computing Surveys 28.
Plackett, R. (1975). The analysis of permutations. Applied Statistics 24 193–202.
Reid, D. (1979). An algorithm for tracking multiple targets. IEEE Transactions on Automatic Control 24 843–854.
Rockmore, D. N. (2000). The FFT: An Algorithm the Whole Family Can Use. Computing in Science and Engineering 2 60–64.
Shahaf, D., Chechetka, A. and Guestrin, C. (2009). Learning Thin Junction Trees via Graph Cuts. Journal of Machine Learning Research – Proceedings Track 5 113–120.
Shin, J., Lee, N., Thrun, S. and Guibas, L. (2005). Lazy inference on object identities in wireless sensor networks. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks. IPSN ’05. IEEE Press, Piscataway, NJ, USA.
Sun, M., Lebanon, G. and Collins-Thompson, K. (2010). Visualizing differences in web search algorithms using the expected weighted Hoeffding distance. In Proceedings of the 19th International Conference on World Wide Web. WWW ’10 931–940. ACM, New York, NY, USA.
Terras, A. (1999). Fourier Analysis on Finite Groups and Applications. London Mathematical Society Student Texts. Cambridge University Press.
Thurstone, L. (1927). A law of comparative judgment. Psychological Review 34 273–286.