The Annals of Statistics
- Ann. Statist.
- Volume 36, Number 2 (2008), 844-874.
Ranking and Empirical Minimization of U-statistics
Stéphan Clémençon, Gábor Lugosi, and Nicolas Vayatis
Abstract
The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, among two instances, which one is “better,” with minimum ranking risk. Since the natural estimates of the risk are of the form of a U-statistic, results of the theory of U-processes are required for investigating the consistency of empirical risk minimizers. We establish, in particular, a tail inequality for degenerate U-processes, and apply it for showing that fast rates of convergence may be achieved under specific noise assumptions, just like in classification. Convex risk minimization methods are also studied.
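As an illustration of why the empirical risk takes the form of a U-statistic, the sketch below averages a pairwise loss over all ordered pairs of distinct observations. It assumes one common formalization that the abstract does not spell out: a ranking rule returns +1 when it judges the first instance "better" and -1 otherwise, and a pair counts as an error when the rule disagrees with the sign of the label difference. The names `empirical_ranking_risk` and `score_rule` are hypothetical and chosen only for this example.

```python
from itertools import permutations


def empirical_ranking_risk(xs, ys, rule):
    """Fraction of ordered pairs (i, j), i != j, that the rule misorders.

    Assumption for this sketch: rule(x, x_prime) returns +1 if x is ranked
    above x_prime and -1 otherwise; a pair is an error when
    (y_i - y_j) * rule(x_i, x_j) < 0. Pairs with tied labels never count.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")
    # Sum a symmetric pairwise kernel over all ordered pairs of distinct
    # indices, then normalize by n(n-1): a U-statistic of order two.
    errors = sum(
        1
        for i, j in permutations(range(n), 2)
        if (ys[i] - ys[j]) * rule(xs[i], xs[j]) < 0
    )
    return errors / (n * (n - 1))


def score_rule(x, x_prime):
    """Hypothetical rule: rank by the raw value of a scalar instance."""
    return 1 if x > x_prime else -1


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 2.0]
    print(empirical_ranking_risk(xs, ys, score_rule))
```

Because every term depends on two observations at once, the summands are not independent, which is why results for U-processes rather than for ordinary empirical processes are needed to analyze minimizers of this quantity.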
Article information
Source
Ann. Statist. Volume 36, Number 2 (2008), 844-874.
Dates
First available in Project Euclid: 13 March 2008
Permanent link to this document
http://projecteuclid.org/euclid.aos/1205420521
Digital Object Identifier
doi:10.1214/009052607000000910
Mathematical Reviews number (MathSciNet)
MR2396817
Zentralblatt MATH identifier
1181.68160
Subjects
Primary: 68Q32: Computational learning theory [See also 68T05]
60E15: Inequalities; stochastic orderings
60C05: Combinatorial probability
60G25: Prediction theory [See also 62M20]
Keywords
Statistical learning, theory of classification, VC classes, fast rates, convex risk minimization, moment inequalities, U-processes
Citation
Clémençon, Stéphan; Lugosi, Gábor; Vayatis, Nicolas. Ranking and Empirical Minimization of U-statistics. Ann. Statist. 36 (2008), no. 2, 844–874. doi:10.1214/009052607000000910. http://projecteuclid.org/euclid.aos/1205420521.

