We consider the predictive problem of supervised ranking, where the task is to rank sets of candidate items returned in response to queries. Although there exist statistical procedures that come with guarantees of consistency in this setting, these procedures require that individuals provide a complete ranking of all items, which is rarely feasible in practice. Instead, individuals routinely provide partial preference information, such as pairwise comparisons of items, and more practical approaches to ranking have aimed at modeling this partial preference data directly. As we show, however, such an approach raises serious theoretical challenges. Indeed, we demonstrate that many commonly used surrogate losses for pairwise comparison data do not yield consistency; surprisingly, we show inconsistency even in low-noise settings. With these negative results as motivation, we present a new approach to supervised ranking based on aggregation of partial preferences, and we develop $U$-statistic-based empirical risk minimization procedures. We present an asymptotic analysis of these new procedures, showing that they yield consistency results that parallel those available for classification. We complement our theoretical results with an experiment studying the new procedures in a large-scale web-ranking task.
