Electronic Journal of Statistics

On kernel methods for covariates that are rankings

Horia Mania, Aaditya Ramdas, Martin J. Wainwright, Michael I. Jordan, and Benjamin Recht

Full-text: Open access


Permutation-valued features arise in a variety of applications, either in a direct way when preferences are elicited over a collection of items, or in an indirect way when numerical ratings are converted to a ranking. To date, there has been relatively limited study of regression, classification, and testing problems based on permutation-valued features, as opposed to permutation-valued responses. This paper studies the use of reproducing kernel Hilbert space methods for learning from permutation-valued features. These methods embed the rankings into an implicitly defined function space, and allow for efficient estimation of regression and test functions in this richer space. We characterize both the feature spaces and spectral properties associated with two kernels for rankings, the Kendall and Mallows kernels. Using tools from representation theory, we explain the limited expressive power of the Kendall kernel by characterizing its degenerate spectrum, and in sharp contrast, we prove that the Mallows kernel is universal and characteristic. We also introduce families of polynomial kernels that interpolate between the Kendall (degree one) and Mallows (infinite degree) kernels. We show the practical effectiveness of our methods via applications to Eurobarometer survey data as well as a MovieLens ratings dataset.
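To make the contrast between the two kernels concrete, here is a minimal Python sketch. It assumes the commonly used definitions, which this page does not state: the Kendall kernel as (concordant pairs minus discordant pairs) divided by the number of pairs, and the Mallows kernel as exp(-nu * discordant pairs) with a bandwidth nu > 0. The function names and the form (1 + Kendall)^degree for the polynomial kernel are hypothetical choices for illustration, not taken from the paper.

```python
import math
from itertools import combinations, permutations

import numpy as np


def discordant_pairs(sigma, tau):
    """Count item pairs (i, j) that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (tau[i] - tau[j]) < 0
    )


def kendall_kernel(sigma, tau):
    """Assumed definition: (concordant - discordant) / C(n, 2)."""
    n_pairs = math.comb(len(sigma), 2)
    return 1.0 - 2.0 * discordant_pairs(sigma, tau) / n_pairs


def mallows_kernel(sigma, tau, nu=1.0):
    """Assumed definition: exp(-nu * discordant pairs), with nu > 0."""
    return math.exp(-nu * discordant_pairs(sigma, tau))


def polynomial_kernel(sigma, tau, degree=2):
    """Hypothetical form (1 + Kendall)^degree, used here only for illustration."""
    return (1.0 + kendall_kernel(sigma, tau)) ** degree


def gram_matrix(kernel, rankings, **kwargs):
    """Evaluate the kernel on every pair of rankings."""
    return np.array([[kernel(a, b, **kwargs) for b in rankings] for a in rankings])


# All 24 rankings of 4 items; sigma[i] is the rank assigned to item i.
n_items, nu = 4, 0.5
rankings = list(permutations(range(n_items)))
K_kendall = gram_matrix(kendall_kernel, rankings)
K_mallows = gram_matrix(mallows_kernel, rankings, nu=nu)

# Numerical rank of each Gram matrix over the whole symmetric group.
rank_kendall = np.linalg.matrix_rank(K_kendall, tol=1e-8)
rank_mallows = np.linalg.matrix_rank(K_mallows, tol=1e-8)

# Under the definitions above, Mallows is an elementwise exponential
# (an infinite power series) of the Kendall kernel, up to a constant factor.
n_pairs = math.comb(n_items, 2)
K_from_kendall = math.exp(-nu * n_pairs / 2) * np.exp(nu * n_pairs / 2 * K_kendall)
identity_holds = np.allclose(K_mallows, K_from_kendall)

print(rank_kendall, rank_mallows, identity_holds)
```

Under these assumed definitions, the Kendall Gram matrix over all 24 rankings of four items has rank 6, the number of item pairs, while the Mallows Gram matrix has full rank 24. This is a small-scale view of the degenerate spectrum and the universality described in the abstract.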

Article information

Electron. J. Statist., Volume 12, Number 2 (2018), 2537-2577.

Received: September 2017
First available in Project Euclid: 14 August 2018

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1534233701

Digital Object Identifier: doi:10.1214/18-EJS1437

Keywords: Mallows kernel, Kendall kernel, polynomial kernel, representation theory, Fourier analysis, symmetric group

Creative Commons Attribution 4.0 International License.


Mania, Horia; Ramdas, Aaditya; Wainwright, Martin J.; Jordan, Michael I.; Recht, Benjamin. On kernel methods for covariates that are rankings. Electron. J. Statist. 12 (2018), no. 2, 2537-2577. doi:10.1214/18-EJS1437. https://projecteuclid.org/euclid.ejs/1534233701


