## Annals of Statistics

### Worst-case versus average-case design for estimation from partial pairwise comparisons

#### Abstract

Pairwise comparison data arises in many domains, including tournament rankings, web search and preference elicitation. Given noisy comparisons of a fixed subset of pairs of items, we study the problem of estimating the underlying comparison probabilities under the assumption of strong stochastic transitivity (SST). We also consider the noisy sorting subclass of the SST model. We show that when the assignment of items to the topology is arbitrary, these permutation-based models, unlike their parametric counterparts, do not admit consistent estimation for most comparison topologies used in practice. We then demonstrate that consistent estimation is possible when the assignment of items to the topology is randomized, thus establishing a dichotomy between worst-case and average-case designs. We propose two computationally efficient estimators in the average-case setting and analyze their risk, showing that it depends on the comparison topology only through the degree sequence of the topology. We also provide explicit classes of graphs for which the rates achieved by these estimators are optimal. Our results are corroborated by simulations on multiple comparison topologies.
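As a rough illustration of the two model classes named in the abstract, the sketch below builds a noisy sorting matrix and checks the SST condition against a candidate ranking. The definitions are assumptions relative to this page and follow the standard formulation in the literature: `M[i][j]` is the probability that item `i` beats item `j`, with `M[i][j] + M[j][i] = 1`; SST requires `M[a][k] >= M[b][k]` for every `k` whenever `a` is ranked above `b`; noisy sorting sets each off-diagonal entry to `1/2 + lam` or `1/2 - lam`. The function names and the parameter `lam` are hypothetical and are not taken from the paper.

```python
def noisy_sorting_matrix(ranking, lam):
    """Hypothetical helper: ranking[i] is the rank of item i (0 = best).

    Returns M with M[i][j] = 1/2 + lam if i is ranked above j,
    1/2 - lam if ranked below, and 1/2 on the diagonal.
    """
    n = len(ranking)
    M = [[0.5] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if ranking[i] < ranking[j]:
                M[i][j] = 0.5 + lam
            elif ranking[i] > ranking[j]:
                M[i][j] = 0.5 - lam
    return M


def is_sst(M, ranking, tol=1e-12):
    """Check the (assumed) SST condition for one candidate ranking."""
    n = len(M)
    # Comparison probabilities must be complementary: M[i][j] + M[j][i] = 1.
    for i in range(n):
        for j in range(n):
            if abs(M[i][j] + M[j][i] - 1.0) > tol:
                return False
    # Items ordered from best to worst under the candidate ranking.
    order = sorted(range(n), key=lambda i: ranking[i])
    # Checking adjacent pairs suffices, since >= is transitive.
    for a, b in zip(order, order[1:]):
        for k in range(n):
            if M[a][k] < M[b][k] - tol:
                return False
    return True


# A noisy sorting matrix satisfies SST under its own ranking.
ranking = [2, 0, 1]
M = noisy_sorting_matrix(ranking, lam=0.2)
print(is_sst(M, ranking))

# A cyclic ("rock-paper-scissors") matrix violates it under this ranking.
cyclic = [[0.5, 0.9, 0.1],
          [0.1, 0.5, 0.9],
          [0.9, 0.1, 0.5]]
print(is_sst(cyclic, [0, 1, 2]))
```

The check takes the ranking as given. Under the permutation-based models in the abstract the ranking is unknown, which is part of what makes estimation from partial comparisons hard.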

#### Article information

Source
Ann. Statist., Volume 48, Number 2 (2020), 1072-1097.

Dates
Revised: October 2018
First available in Project Euclid: 26 May 2020

Permanent link to this document
https://projecteuclid.org/euclid.aos/1590480046

Digital Object Identifier
doi:10.1214/19-AOS1838

Mathematical Reviews number (MathSciNet)
MR4102688

#### Citation

Pananjady, Ashwin; Mao, Cheng; Muthukumar, Vidya; Wainwright, Martin J.; Courtade, Thomas A. Worst-case versus average-case design for estimation from partial pairwise comparisons. Ann. Statist. 48 (2020), no. 2, 1072–1097. doi:10.1214/19-AOS1838. https://projecteuclid.org/euclid.aos/1590480046

#### References

• [1] Ballinger, T. P. and Wilcox, N. T. (1997). Decisions, error and heterogeneity. Econ. J. 107 1090–1105.
• [2] Baltrunas, L., Makcinskas, T. and Ricci, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems. RecSys ’10 119–126. ACM, New York.
• [3] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509–512.
• [4] Barnett, W. (2003). The modern theory of consumer behavior: Ordinal or cardinal? Q. J. Austrian Econ. 6 41–65.
• [5] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324–345.
• [6] Braverman, M. and Mossel, E. (2008). Noisy sorting without resampling. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 268–276. ACM, New York.
• [7] Caplin, A. and Nalebuff, B. (1991). Aggregation and social choice: A mean voter theorem. Econometrica 59 1–23.
• [8] Cattelan, M. (2012). Models for paired comparison data: A review with emphasis on dependent data. Statist. Sci. 27 412–433.
• [9] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177–214.
• [10] Chatterjee, S. and Mukherjee, S. (2019). Estimation in tournaments and graphs under monotonicity constraints. IEEE Trans. Inform. Theory 65 3525–3539.
• [11] Chen, X., Bennett, P. N., Collins-Thompson, K. and Horvitz, E. (2013). Pairwise ranking aggregation in a crowdsourced setting. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining 193–202. ACM.
• [12] Chen, X., Gopi, S., Mao, J. and Schneider, J. (2017). Competitive analysis of the top-$K$ ranking problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms 1245–1264. SIAM, Philadelphia, PA.
• [13] Chen, Y. and Suh, C. (2015). Spectral MLE: Top-k rank aggregation from pairwise comparisons. In International Conference on Machine Learning 371–380.
• [14] Diaconis, P. and Graham, R. L. (1977). Spearman’s footrule as a measure of disarray. J. Roy. Statist. Soc. Ser. B 39 262–268.
• [15] Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. (2001). Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web 613–622. ACM, New York.
• [16] Fishburn, P. C. (1973). Binary choice probabilities: On the varieties of stochastic transitivity. J. Math. Psych. 10 327–352.
• [17] Flammarion, N., Mao, C. and Rigollet, P. (2019). Optimal rates of statistical seriation. Bernoulli 25 623–653.
• [18] Fligner, M. A. and Verducci, J. S., eds. (1993). Probability Models and Statistical Analyses for Ranking Data. Lecture Notes in Statistics 80. Springer, New York.
• [19] Hajek, B., Oh, S. and Xu, J. (2014). Minimax-optimal inference from partial rankings. In Advances in Neural Information Processing Systems 1475–1483.
• [20] Hakimi, S. L. (1962). On realizability of a set of integers as degrees of the vertices of a linear graph. I. J. Soc. Indust. Appl. Math. 10 496–506.
• [21] Havel, V. (1955). A remark on the existence of finite graphs. Čas. Pěst. Mat. 80 477–480.
• [22] Herbrich, R., Minka, T. and Graepel, T. (2006). Trueskill™: A Bayesian skill rating system. In Proceedings of the 19th International Conference on Neural Information Processing Systems 569–576. MIT Press.
• [23] Jamieson, K. G. and Nowak, R. D. (2011). Active ranking using pairwise comparisons. In Advances in Neural Information Processing Systems 2240–2248.
• [24] Jang, M., Kim, S., Suh, C. and Oh, S. (2017). Optimal sample complexity of m-wise data for top-k ranking. In Advances in Neural Information Processing Systems 1686–1696.
• [25] Kendall, M. G. (1948). Rank Correlation Methods. Oxford University Press, New York.
• [26] Khetan, A. and Oh, S. (2016). Data-driven rank breaking for efficient rank aggregation. J. Mach. Learn. Res. 17 193.
• [27] Király, F. J., Theran, L. and Tomioka, R. (2015). The algebraic combinatorial approach for low-rank matrix completion. J. Mach. Learn. Res. 16 1391–1436.
• [28] Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York.
• [29] Mao, C., Pananjady, A. and Wainwright, M. J. (2018). Breaking the $1/\sqrt{n}$ barrier: Faster rates for permutation-based models in polynomial time. In Proceedings of the 31st Conference on Learning Theory (S. Bubeck, V. Perchet and P. Rigollet, eds.). Proceedings of Machine Learning Research 75 2037–2042.
• [30] Mao, C., Pananjady, A. and Wainwright, M. J. (2019+). Towards optimal estimation of bivariate isotonic matrices with unknown permutations. Ann. Statist. To appear.
• [31] Mao, C., Weed, J. and Rigollet, P. (2018). Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory 2018. Proc. Mach. Learn. Res. (PMLR) 83 821–847.
• [32] Marden, J. I. (1995). Analyzing and Modeling Rank Data. Monographs on Statistics and Applied Probability 64. CRC Press, London.
• [33] Maystre, L. and Grossglauser, M. (2017). Just sort it! A simple and effective approach to active preference learning. In International Conference on Machine Learning 2344–2353.
• [34] McLaughlin, D. H. and Luce, R. D. (1965). Stochastic transitivity and cancellation of preferences between bitter-sweet solutions. Psychol. Sci. 2 89–90.
• [35] Negahban, S., Oh, S. and Shah, D. (2017). Rank centrality: Ranking from pairwise comparisons. Oper. Res. 65 266–287.
• [36] Negahban, S., Oh, S., Thekumparampil, K. K. and Xu, J. (2018). Learning from comparisons and choices. J. Mach. Learn. Res. 19 40.
• [37] Neyman, J. and Pearson, E. S. (1966). Joint statistical papers. Univ. California.
• [38] Pananjady, A., Mao, C., Muthukumar, V., Wainwright, M. J. and Courtade, T. A. (2020). Supplement to “Worst-case versus Average-case Design for Estimation from Partial Pairwise Comparisons.” https://doi.org/10.1214/19-AOS1838SUPP.
• [39] Pananjady, A., Wainwright, M. J. and Courtade, T. A. (2017). Denoising linear models with permuted data. In 2017 IEEE International Symposium on Information Theory (ISIT) 446–450. IEEE.
• [40] Park, D., Neeman, J., Zhang, J., Sanghavi, S. and Dhillon, I. (2015). Preference completion: Large-scale collaborative ranking from pairwise comparisons. In International Conference on Machine Learning 1907–1916.
• [41] Piech, C., Huang, J., Chen, Z., Do, C., Ng, A. and Koller, D. (2013). Tuned models of peer assessment in MOOCs. In Proceedings of the 6th International Conference on Educational Data Mining 153–160.
• [42] Pimentel-Alarcón, D. L., Boston, N. and Nowak, R. D. (2016). A characterization of deterministic sampling patterns for low-rank matrix completion. IEEE J. Sel. Top. Signal Process. 10 623–636.
• [43] Rajkumar, A. and Agarwal, S. (2016). When can we rank well from comparisons of ${O}(n\log (n))$ non-actively chosen pairs? In 29th COLT 49 1376–1401.
• [44] Rigollet, P. and Weed, J. (2019+). Uncoupled isotonic regression via minimum Wasserstein deconvolution. Information and Inference: A Journal of the IMA. To appear.
• [45] Shah, N. B., Balakrishnan, S., Bradley, J., Parekh, A., Ramchandran, K. and Wainwright, M. J. (2016). Estimation from pairwise comparisons: Sharp minimax bounds with topology dependence. J. Mach. Learn. Res. 17 58.
• [46] Shah, N. B., Balakrishnan, S., Guntuboyina, A. and Wainwright, M. J. (2017). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Trans. Inform. Theory 63 934–959.
• [47] Shah, N. B., Balakrishnan, S. and Wainwright, M. J. (2016). Feeling the Bern: Adaptive estimators for Bernoulli probabilities of pairwise comparisons. In Information Theory (ISIT), 2016 IEEE International Symposium on 1153–1157. IEEE.
• [48] Shah, N. B., Balakrishnan, S. and Wainwright, M. J. (2016). A permutation-based model for crowd labeling: Optimal estimation and robustness. ArXiv preprint. Available at arXiv:1606.09632.
• [49] Shah, N. B., Bradley, J. K., Parekh, A., Wainwright, M. and Ramchandran, K. (2013). A case for ordinal peer-evaluation in MOOCs. In NIPS Workshop on Data Driven Education.
• [50] Shah, N. B. and Wainwright, M. J. (2017). Simple, robust and optimal ranking from pairwise comparisons. J. Mach. Learn. Res. 18 199.
• [51] Srebro, N. and Salakhutdinov, R. R. (2010). Collaborative filtering in a non-uniform world: Learning with the weighted trace norm. In Advances in Neural Information Processing Systems 2056–2064.
• [52] Stewart, N., Brown, G. D. A. and Chater, N. (2005). Absolute identification by relative judgment. Psychol. Rev. 112 881–911.
• [53] Thurstone, L. L. (1927). A law of comparative judgment. Psychol. Rev. 34 273.
• [54] Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-asymptotic Viewpoint. Cambridge Univ. Press, Cambridge, UK.
• [55] Wauthier, F., Jordan, M. and Jojic, N. (2013). Efficient ranking from pairwise comparisons. In International Conference on Machine Learning 109–117.
• [56] Yu, A. and Grauman, K. (2014). Fine-grained visual comparisons with local learning. In Computer Vision and Pattern Recognition (CVPR) 192–199.

#### Supplemental materials

• Supplement to “Worst-case versus average-case design for estimation from partial pairwise comparisons”. Due to space constraints, we have relegated the technical details of remaining proofs to the supplement [38]. The supplement also contains a section characterizing the minimax denoising error under partial observations.