Open Access
August 2019
Spectral method and regularized MLE are both optimal for top-$K$ ranking
Yuxin Chen, Jianqing Fan, Cong Ma, Kaizheng Wang
Ann. Statist. 47(4): 2204-2235 (August 2019). DOI: 10.1214/18-AOS1745
Abstract

This paper is concerned with the problem of top-$K$ ranking from pairwise comparisons. Given a collection of $n$ items and a few pairwise comparisons across them, one wishes to identify the set of $K$ items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model—the Bradley–Terry–Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress toward characterizing the performance (e.g., the mean square error for estimating the scores) of several classical methods, including the spectral method and the maximum likelihood estimator (MLE). However, where they stand regarding top-$K$ ranking remains unsettled.
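Concretely, writing $w^*=(w^*_1,\dots,w^*_n)$ for the latent preference scores (our notation for this illustration, not necessarily the paper's), the BTL model posits that each comparison of items $i$ and $j$ is won by $i$ with probability
$$
\mathbb{P}\{\text{item } i \text{ beats item } j\} \;=\; \frac{w^*_i}{w^*_i + w^*_j} \;=\; \frac{e^{\theta^*_i}}{e^{\theta^*_i} + e^{\theta^*_j}}, \qquad w^*_i := e^{\theta^*_i},
$$
which is a logistic model in the score difference $\theta^*_i - \theta^*_j$; the top-$K$ set is then the set of $K$ items with the largest scores $w^*_i$.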

We demonstrate that under a natural random sampling model, the spectral method alone, or the regularized MLE alone, is minimax optimal in terms of the sample complexity (the number of paired comparisons needed to ensure exact top-$K$ identification) in the fixed dynamic range regime. This is accomplished via optimal control of the entrywise error of the score estimates. We complement our theoretical studies with numerical experiments, confirming that both methods yield low entrywise errors for estimating the underlying scores. Our theory is established via a novel leave-one-out trick, which proves effective for analyzing both iterative and noniterative procedures. Along the way, we derive an elementary eigenvector perturbation bound for probability transition matrices, which parallels the Davis–Kahan $\mathop{\mathrm{sin}}\nolimits \Theta $ theorem for symmetric matrices. This also allows us to close the gap between the $\ell_{2}$ error upper bound for the spectral method and the minimax lower limit.
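As a concrete illustration of the spectral method referred to above (Rank Centrality in the spirit of Negahban, Oh and Shah [37]), here is a minimal NumPy sketch. The function name, the choice of the normalization constant $d$, and the dense eigendecomposition are illustrative assumptions rather than the paper's exact specification.

```python
import numpy as np

def spectral_topk(y, observed, K, d=None):
    """Spectral (Rank Centrality-type) ranking from pairwise comparisons.

    y[i, j]        : empirical fraction of comparisons in which item j beat item i
    observed[i, j] : True if the pair (i, j) was compared at least once
    d              : normalization; any value >= the maximum comparison degree works
    """
    obs = observed.copy()
    np.fill_diagonal(obs, False)                 # a pair (i, i) is never compared
    if d is None:
        d = max(int(obs.sum(axis=1).max()), 1)

    # Off-diagonal transition probabilities: move from i to j in proportion
    # to how often j beat i; unobserved pairs contribute nothing.
    P = np.where(obs, y / d, 0.0)
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))     # self-loops keep each row summing to 1

    # Stationary distribution = leading left eigenvector of P (eigenvalue 1).
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = np.abs(pi) / np.abs(pi).sum()

    # Under the BTL model the stationary distribution is (approximately)
    # proportional to the latent scores, so rank items by pi.
    return np.argsort(-pi)[:K], pi
```

For large sparse comparison graphs one would replace the dense eigensolver with power iteration, but constructing the probability transition matrix and reading off its stationary distribution is the essential step.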

References

1.

Abbe, E., Fan, J., Wang, K. and Zhong, Y. (2017). Entrywise eigenvector analysis of random matrices with low expected rank. ArXiv preprint. Available at arXiv:1709.09565.

2.

Agarwal, A., Agarwal, S., Assadi, S. and Khanna, S. (2017). Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Conference on Learning Theory 39–75.

3.

Ammar, A. and Shah, D. (2011). Ranking: Compare, don’t score. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 776–783. DOI:10.1109/Allerton.2011.6120246.

4.

Ammar, A. and Shah, D. (2012). Efficient rank aggregation using partial data. In SIGMETRICS 40 355–366. ACM, New York.

5.

Baltrunas, L., Makcinskas, T. and Ricci, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems. RecSys ’10 119–126. ACM, New York.

6.

Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324–345.

7.

Bubeck, S. (2015). Convex optimization: Algorithms and complexity. Found. Trends Mach. Learn. 8 231–357.

8.

Busa-Fekete, R., Szörényi, B., Weng, P., Cheng, W. and Hüllermeier, E. (2013). Top-$k$ selection based on adaptive sampling of noisy preferences. In International Conference on Machine Learning.

9.

Chen, Y. and Candès, E. J. (2016). The projected power method: An efficient algorithm for joint alignment from pairwise differences. Comm. Pure Appl. Math. To appear.

10.

Chen, Y. and Candès, E. J. (2017). Solving random quadratic systems of equations is nearly as easy as solving linear systems. Comm. Pure Appl. Math. 70 822–883.

11.

Chen, Y. and Suh, C. (2015). Spectral MLE: Top-$K$ rank aggregation from pairwise comparisons. In International Conference on Machine Learning 371–380.

12.

Chen, X., Bennett, P. N., Collins-Thompson, K. and Horvitz, E. (2013). Pairwise ranking aggregation in a crowdsourced setting. In ACM International Conference on Web Search and Data Mining 193–202. ACM, New York.

13.

Chen, X., Gopi, S., Mao, J. and Schneider, J. (2017). Competitive analysis of the top-$K$ ranking problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms 1245–1264. SIAM, Philadelphia, PA.

14.

Chen, Y., Chi, Y., Fan, J. and Ma, C. (2018). Gradient descent with random initialization: Fast global convergence for nonconvex phase retrieval. Available at arXiv:1803.07726.

15.

Chen, Y., Fan, J., Ma, C. and Wang, K. (2019). Supplement to “Spectral method and regularized MLE are both optimal for top-$K$ ranking.” DOI:10.1214/18-AOS1745SUPP.

16.

Chung, F. R. K. (1997). Spectral Graph Theory. CBMS Regional Conference Series in Mathematics 92. Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the Amer. Math. Soc., Providence, RI.

17.

Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7 1–46.

18.

Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. (2001). Rank aggregation methods for the Web. In International Conference on World Wide Web 613–622.

19.

El Karoui, N. (2018). On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Related Fields 170 95–175.

20.

Eldridge, J., Belkin, M. and Wang, Y. (2017). Unperturbed: Spectral analysis beyond Davis–Kahan. ArXiv preprint. Available at arXiv:1706.06516.

21.

Fan, J., Wang, W. and Zhong, Y. (2018). An $\ell_{\infty}$ eigenvector perturbation bound and its application. J. Mach. Learn. Res. 18 1–42.

22.

Ford, L. R. Jr. (1957). Solution of a ranking problem from binary comparisons. Amer. Math. Monthly 64 28–33.

23.

Hajek, B., Oh, S. and Xu, J. (2014). Minimax-optimal inference from partial rankings. In Neural Information Processing Systems 1475–1483.

24.

Heckel, R., Shah, N. B., Ramchandran, K. and Wainwright, M. J. (2016). Active ranking from pairwise comparisons and when parametric assumptions don’t help. ArXiv preprint. Available at arXiv:1606.08842.

25.

Hunter, D. R. (2004). MM algorithms for generalized Bradley–Terry models. Ann. Statist. 32 384–406.

26.

Jamieson, K. G. and Nowak, R. D. (2011). Active ranking using pairwise comparisons. In Neural Information Processing Systems 2240–2248.

27.

Jang, M., Kim, S., Suh, C. and Oh, S. (2016). Top-$K$ ranking from pairwise comparisons: When spectral ranking is optimal. ArXiv preprint. Available at arXiv:1603.04153.

28.

Javanmard, A. and Montanari, A. (2018). Debiasing the lasso: Optimal sample size for Gaussian designs. Ann. Statist. 46 2593–2622.

29.

Jiang, X., Lim, L.-H., Yao, Y. and Ye, Y. (2011). Statistical ranking and combinatorial Hodge theory. Math. Program. 127 203–244.

30.

Keshavan, R. H., Montanari, A. and Oh, S. (2010). Matrix completion from noisy entries. J. Mach. Learn. Res. 11 2057–2078.

31.

Koltchinskii, V. and Lounici, K. (2016). Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance. Ann. Inst. Henri Poincaré Probab. Stat. 52 1976–2013.

32.

Koltchinskii, V. and Xia, D. (2016). Perturbation of linear forms of singular vectors under Gaussian noise. In High Dimensional Probability VII. Progress in Probability 71 397–423. Springer, Cham.

33.

Lu, Y. and Negahban, S. N. (2014). Individualized rank aggregation using nuclear norm regularization. ArXiv preprint. Available at arXiv:1410.0860.

34.

Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York; Chapman & Hall, London.

35.

Ma, C., Wang, K., Chi, Y. and Chen, Y. (2017). Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. ArXiv preprint. Available at arXiv:1711.10467.

36.

Massey, K. (1997). Statistical models applied to the rating of sports teams. Technical Report, Bluefield College, Bluefield, VA.

37.

Negahban, S., Oh, S. and Shah, D. (2017). Rank centrality: Ranking from pairwise comparisons. Oper. Res. 65 266–287.

38.

Negahban, S., Oh, S., Thekumparampil, K. K. and Xu, J. (2017). Learning from comparisons and choices. ArXiv preprint. Available at arXiv:1704.07228.

39.

Pananjady, A., Mao, C., Muthukumar, V., Wainwright, M. J. and Courtade, T. A. (2017). Worst-case vs average-case design for estimation from fixed pairwise comparisons. ArXiv preprint. Available at arXiv:1707.06217.

40.

Rajkumar, A. and Agarwal, S. (2014). A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In International Conference on Machine Learning I-118–I-126.

41.

Rajkumar, A. and Agarwal, S. (2016). When can we rank well from comparisons of $O(n\log n)$ non-actively chosen pairs? In Conference on Learning Theory 1376–1401.

42.

Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.

43.

Shah, N. B. and Wainwright, M. J. (2015). Simple, robust and optimal ranking from pairwise comparisons. ArXiv preprint. Available at arXiv:1512.08949.

44.

Shah, N. B., Balakrishnan, S., Guntuboyina, A. and Wainwright, M. J. (2017). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Trans. Inform. Theory 63 934–959.

45.

Soufiani, H. A., Chen, W. Z., Parkes, D. C. and Xia, L. (2013). Generalized method-of-moments for rank aggregation. In Proceedings of the 26th International Conference on Neural Information Processing Systems. NIPS’13 2706–2714.

46.

Suh, C., Tan, V. Y. F. and Zhao, R. (2017). Adversarial top-$K$ ranking. IEEE Trans. Inform. Theory 63 2201–2225.

47.

Sur, P., Chen, Y. and Candès, E. J. (2017). The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square. ArXiv preprint. Available at arXiv:1706.01191.

48.

Tropp, J. A. (2015). An introduction to matrix concentration inequalities. Found. Trends Mach. Learn. 8 1–230.

49.

Zhong, Y. and Boumal, N. (2017). Near-optimal bounds for phase synchronization. Available at arXiv:1703.06605.
Copyright © 2019 Institute of Mathematical Statistics
Yuxin Chen, Jianqing Fan, Cong Ma, and Kaizheng Wang "Spectral method and regularized MLE are both optimal for top-$K$ ranking," The Annals of Statistics 47(4), 2204-2235, (August 2019). https://doi.org/10.1214/18-AOS1745
Received: 1 August 2017; Published: August 2019
Vol. 47 • No. 4 • August 2019