The Annals of Applied Statistics

Empirical stationary correlations for semi-supervised learning on graphs

Ya Xu, Justin S. Dyer, and Art B. Owen


In semi-supervised learning on graphs, response variables observed at one node are used to estimate missing values at other nodes. The methods exploit correlations between nearby nodes in the graph. In this paper we prove that many such proposals are equivalent to kriging predictors based on a fixed covariance matrix driven by the link structure of the graph. We then propose a data-driven estimator of the correlation structure that exploits patterns among the observed response values. By incorporating even a small fraction of observed covariation into the predictions, we are able to obtain much improved prediction on two graph data sets.
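
The kriging view described in the abstract can be illustrated with a minimal sketch. It assumes a zero-mean model whose covariance is the inverse of a regularized graph Laplacian, K = (D - A + eps*I)^{-1}, and predicts unobserved nodes by simple kriging, K_uo K_oo^{-1} y_o. The choice of covariance, the four-node path graph, the value of eps, and every function name below are illustrative assumptions, not the estimator proposed in the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for c in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def regularized_laplacian(adj, eps):
    """Return Q = D - A + eps*I for a symmetric adjacency matrix A."""
    n = len(adj)
    return [[(sum(adj[i]) + eps if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]


def inverse(Q):
    """Invert Q column by column (adequate for a toy example only)."""
    n = len(Q)
    cols = [solve(Q, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def krige(K, obs, y_obs, unobs):
    """Simple (zero-mean) kriging predictor K_uo K_oo^{-1} y_o."""
    K_oo = [[K[i][j] for j in obs] for i in obs]
    w = solve(K_oo, y_obs)  # w = K_oo^{-1} y_o
    return [sum(K[u][o] * wo for o, wo in zip(obs, w)) for u in unobs]


# Hypothetical toy data: path graph 0 - 1 - 2 - 3, responses seen at the ends.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
Q = regularized_laplacian(adj, eps=0.1)  # eps is an assumed ridge term
K = inverse(Q)                           # covariance fixed by the links alone
pred = krige(K, obs=[0, 3], y_obs=[0.0, 1.0], unobs=[1, 2])
print(pred)
```

Here the covariance depends only on the link structure, which is the situation the abstract describes for existing proposals. By the usual Gaussian conditioning identity, the same predictions can be obtained from the precision matrix as -Q_uu^{-1} Q_uo y_o, so no observed covariation among the responses enters the prediction.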

Article information

Ann. Appl. Stat., Volume 4, Number 2 (2010), 589-614.

First available in Project Euclid: 3 August 2010

Digital Object Identifier: doi:10.1214/09-AOAS293

Keywords: Graph Laplacian; kriging; pagerank; random walk


Xu, Ya; Dyer, Justin S.; Owen, Art B. Empirical stationary correlations for semi-supervised learning on graphs. Ann. Appl. Stat. 4 (2010), no. 2, 589–614. doi:10.1214/09-AOAS293.

