## The Annals of Statistics

### Universally consistent vertex classification for latent positions graphs

#### Abstract

In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function $\kappa$, provided that the latent positions are i.i.d. from some distribution $F$. We then consider the exploitation task of vertex classification where the link function $\kappa$ belongs to the class of universal kernels and class labels are observed for a number of vertices tending to infinity and that the remaining vertices are to be classified. We show that minimization of the empirical $\varphi$-risk for some convex surrogate $\varphi$ of 0–1 loss over a class of linear classifiers with increasing complexities yields a universally consistent classifier, that is, a classification rule with error converging to Bayes optimal for any distribution $F$.

#### Article information

Source
Ann. Statist., Volume 41, Number 3 (2013), 1406-1430.

Dates
First available in Project Euclid: 1 August 2013

https://projecteuclid.org/euclid.aos/1375362554

Digital Object Identifier
doi:10.1214/13-AOS1112

Mathematical Reviews number (MathSciNet)
MR3113816

Zentralblatt MATH identifier
1273.62147

#### Citation

Tang, Minh; Sussman, Daniel L.; Priebe, Carey E. Universally consistent vertex classification for latent positions graphs. Ann. Statist. 41 (2013), no. 3, 1406--1430. doi:10.1214/13-AOS1112. https://projecteuclid.org/euclid.aos/1375362554

#### References

• [1] Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981–2014.
• [2] Bartlett, P. L., Jordan, M. I. and McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. J. Amer. Statist. Assoc. 101 138–156.
• [3] Belkin, M. and Niyogi, P. (2005). Towards a theoretical foundation for Laplacian-based manifold methods. In Proceedings of the 18th Conference on Learning Theory (COLT 2005) 486–500. Springer, Berlin.
• [4] Bengio, Y., Vincent, P., Paiement, J. F., Delalleau, O., Ouimet, M. and Roux, N. L. (2004). Learning eigenfunctions links spectral embedding and kernel PCA. Neural Comput. 16 2197–2219.
• [5] Biau, G., Bunea, F. and Wegkamp, M. H. (2005). Functional classification in Hilbert spaces. IEEE Trans. Inform. Theory 51 2163–2172.
• [6] Blanchard, G., Bousquet, O. and Massart, P. (2008). Statistical performance of support vector machines. Ann. Statist. 36 489–531.
• [7] Blanchard, G., Bousquet, O. and Zwald, L. (2007). Statistical properties of kernel principal component analysis. Machine Learning 66 259–294.
• [8] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 3–122.
• [9] Braun, M. L. (2006). Accurate error bounds for the eigenvalues of the kernel matrix. J. Mach. Learn. Res. 7 2303–2328.
• [10] Chatterjee, S. (2012). Matrix estimation by universal singular value thresholding. Preprint. Available at http://arxiv.org/abs/1212.1247.
• [11] Chaudhuri, K., Chung, F. and Tsiatas, A. (2012). Spectral partitioning of graphs with general degrees and the extended planted partition model. In Proceedings of the 25th Conference on Learning Theory. Springer, Berlin.
• [12] Cucker, F. and Smale, S. (2002). On the mathematical foundations of learning. Bull. Amer. Math. Soc. (N.S.) 39 1–49 (electronic).
• [13] Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7 1–46.
• [14] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics (New York) 31. Springer, New York.
• [15] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33–61.
• [16] Fishkind, D. E., Sussman, D. L., Tang, M., Vogelstein, J. T. and Priebe, C. E. (2013). Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown. SIAM J. Matrix Anal. Appl. 34 23–39.
• [17] Hein, M., Audibert, J.-Y. and von Luxburg, U. (2005). From graphs to manifolds—Weak and strong pointwise consistency of graph Laplacians. In Proceedings of the 18th Conference on Learning Theory (COLT 2005) 470–485. Springer, Berlin.
• [18] Hein, M., Audibert, J. Y. and von Luxburg, U. (2007). Convergence of graph Laplacians on random neighbourhood graphs. J. Mach. Learn. Res. 8 1325–1370.
• [19] Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 1090–1098.
• [20] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks 5 109–137.
• [21] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10.
• [22] Kato, T. (1987). Variation of discrete spectra. Comm. Math. Phys. 111 501–504.
• [23] Koltchinskii, V. and Giné, E. (2000). Random matrix approximation of spectra of integral operators. Bernoulli 6 113–167.
• [24] Lugosi, G. and Vayatis, N. (2004). On the Bayes-risk consistency of regularized boosting methods. Ann. Statist. 32 30–55.
• [25] Micchelli, C. A., Xu, Y. and Zhang, H. (2006). Universal kernels. J. Mach. Learn. Res. 7 2651–2667.
• [26] Oliveira, R. I. (2010). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. Preprint. Available at http://arxiv.org/abs/0911.0600.
• [27] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.
• [28] Rosasco, L., Belkin, M. and De Vito, E. (2010). On learning with integral operators. J. Mach. Learn. Res. 11 905–934.
• [29] Shawe-Taylor, J., Williams, C. K. I., Cristianini, N. and Kandola, J. (2005). On the eigenspectrum of the Gram matrix and the generalization error of kernel-PCA. IEEE Trans. Inform. Theory 51 2510–2522.
• [30] Sriperumbudur, B. K., Fukumizu, K. and Lanckriet, G. R. G. (2011). Universality, characteristic kernels and RKHS embedding of measures. J. Mach. Learn. Res. 12 2389–2410.
• [31] Steinwart, I. (2002). On the influence of the kernel on the consistency of support vector machines. J. Mach. Learn. Res. 2 67–93.
• [32] Steinwart, I. (2002). Support vector machines are universally consistent. J. Complexity 18 768–791.
• [33] Steinwart, I. and Scovel, C. (2012). Mercer’s theorem on general domains: On the interaction between measures, kernels, and RKHSs. Constr. Approx. 35 363–417.
• [34] Stone, C. J. (1977). Consistent nonparametric regression. Ann. Statist. 5 595–620.
• [35] Sussman, D. L., Tang, M., Fishkind, D. E. and Priebe, C. E. (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. J. Amer. Statist. Assoc. 107 1119–1128.
• [36] Sussman, D. L., Tang, M. and Priebe, C. E. (2012). Consistent latent position estimation and vertex classification for random dot product graphs. Preprint. Available at http://arxiv.org/abs/1207.6745.
• [37] Tropp, J. A. (2011). Freedman’s inequality for matrix martingales. Electron. Commun. Probab. 16 262–270.
• [38] von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Statist. 36 555–586.
• [39] Young, S. J. and Scheinerman, E. R. (2007). Random dot product graph models for social networks. In Algorithms and Models for the Web-Graph. Lecture Notes in Computer Science 4863 138–149. Springer, Berlin.
• [40] Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32 56–85.
• [41] Zwald, L. and Blanchard, G. (2006). On the convergence of eigenspaces in kernel principal components analysis. In Advances in Neural Information Processing Systems (NIPS 05) 18 1649–1656. MIT Press, Cambridge.