Annals of Applied Statistics

The ranking lasso and its application to sport tournaments

Guido Masarotto and Cristiano Varin


Ranking a set of alternatives on the basis of a series of paired comparisons is a problem that arises in many settings. A popular example is ranking contestants in sport tournaments. For this purpose, paired comparison models such as the Bradley–Terry model are often used. This paper suggests fitting paired comparison models with a lasso-type procedure that forces contestants with similar abilities to be classified into the same group. The benefits of the proposed method are easier interpretation of rankings and a significant improvement in the quality of predictions relative to standard maximum likelihood fitting. Numerical aspects of the proposed method are discussed in detail. The methodology is illustrated by ranking the teams of the National Football League in the 2010–2011 season and of the American College Hockey Men’s Division I in the 2009–2010 season.
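To make the idea of a lasso-type fit concrete, here is a minimal sketch in Python. It assumes the standard Bradley–Terry form, in which contestant i beats contestant j with probability exp(a_i) / (exp(a_i) + exp(a_j)), and a plain, unweighted L1 penalty on all pairwise ability differences. The function names, the unweighted penalty, and the tuning parameter `lam` are illustrative assumptions; the abstract does not specify the exact penalty, its weights, or any home-field term, so this should not be read as the authors' implementation.

```python
import numpy as np


def bt_log_likelihood(abilities, games):
    """Bradley-Terry log-likelihood for a list of (winner, loser) index pairs."""
    a = np.asarray(abilities, dtype=float)
    winners = np.array([w for w, _ in games])
    losers = np.array([l for _, l in games])
    diff = a[winners] - a[losers]
    # log P(winner beats loser) = -log(1 + exp(-diff)), computed stably
    return -np.sum(np.logaddexp(0.0, -diff))


def pairwise_l1_penalty(abilities):
    """Sum of |a_i - a_j| over all pairs i < j (assumed, unweighted form)."""
    a = np.asarray(abilities, dtype=float)
    i, j = np.triu_indices(len(a), k=1)
    return np.sum(np.abs(a[i] - a[j]))


def penalized_objective(abilities, games, lam):
    """Negative log-likelihood plus lam times the pairwise L1 penalty."""
    return -bt_log_likelihood(abilities, games) + lam * pairwise_l1_penalty(abilities)


if __name__ == "__main__":
    # Hypothetical toy data: team 0 beats teams 1 and 2; teams 1 and 2 split.
    games = [(0, 1), (0, 2), (1, 2), (2, 1)]
    grouped = [0.5, 0.0, 0.0]     # teams 1 and 2 share one ability
    separated = [0.5, 0.1, -0.1]  # every team has its own ability
    for lam in (0.0, 0.5):
        print(lam,
              penalized_objective(grouped, games, lam),
              penalized_objective(separated, games, lam))
```

Because the penalty is a sum of absolute differences, minimizing this objective for a large enough `lam` can set some differences exactly to zero, which is what places contestants with similar abilities in the same group. With `lam = 0` the objective reduces to ordinary maximum likelihood fitting.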

Article information

Ann. Appl. Stat., Volume 6, Number 4 (2012), 1949–1970.

First available in Project Euclid: 27 December 2012

Keywords: Bradley–Terry model; clustering; paired comparisons; ranking; sport tournaments


Masarotto, Guido; Varin, Cristiano. The ranking lasso and its application to sport tournaments. Ann. Appl. Stat. 6 (2012), no. 4, 1949–1970. doi:10.1214/12-AOAS581.


