The Annals of Applied Statistics

Bayesian anomaly detection methods for social networks

Nicholas A. Heard, David J. Weston, Kiriaki Platanioti, and David J. Hand

Abstract

Learning the network structure of a large graph is computationally demanding, and dynamically monitoring the network over time for any changes in structure threatens to be more challenging still.

This paper presents a two-stage method for anomaly detection in dynamic graphs: the first stage uses simple, conjugate Bayesian models for discrete-time counting processes to track the pairwise links of all nodes in the graph and assess normality of behavior; the second stage applies standard network inference tools to a greatly reduced subset of potentially anomalous nodes. The utility of the method is demonstrated on simulated and real data sets.
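
To make the first stage concrete, the following is a minimal sketch of one possible conjugate counting model, not the authors' specification. It assumes that each pair's counts per time step are Poisson with a Gamma prior on the rate, so that the posterior predictive distribution is negative binomial, and it flags pairs whose newest count has a small upper-tail predictive p-value. The function names, the prior parameters, and the 0.01 threshold are all illustrative assumptions; the hurdle component named in the keywords is not modeled here.

```python
import math


def predictive_pmf(k, shape, rate):
    """Negative binomial predictive pmf for the next count.

    Assumes a Poisson count whose rate has a Gamma(shape, rate) posterior.
    """
    p = rate / (rate + 1.0)
    log_pmf = (
        math.lgamma(shape + k) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def upper_tail_p_value(new_count, history, prior_shape=1.0, prior_rate=1.0):
    """P(next count >= new_count) given the pair's past counts.

    Conjugate update: Gamma(a, b) prior plus n Poisson counts summing to S
    gives a Gamma(a + S, b + n) posterior. Prior values are assumptions.
    """
    shape = prior_shape + sum(history)
    rate = prior_rate + len(history)
    below = sum(predictive_pmf(k, shape, rate) for k in range(new_count))
    return max(0.0, 1.0 - below)


def flag_anomalous_pairs(histories, new_counts, threshold=0.01):
    """Return the pairs whose newest count is surprisingly large.

    histories maps a (sender, receiver) pair to its list of past counts;
    new_counts maps a pair to the count in the newest time step.
    """
    return [
        pair
        for pair, history in histories.items()
        if upper_tail_p_value(new_counts.get(pair, 0), history) < threshold
    ]


# Hypothetical data: two directed pairs observed over five time steps.
histories = {("a", "b"): [1, 0, 2, 1, 1], ("a", "c"): [0, 1, 0, 0, 1]}
new_counts = {("a", "b"): 1, ("a", "c"): 12}
flagged = flag_anomalous_pairs(histories, new_counts)
```

In a two-stage scheme of the kind the abstract describes, only the nodes appearing in flagged pairs would be passed on to the more expensive network analysis.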

Article information

Source
Ann. Appl. Stat., Volume 4, Number 2 (2010), 645-662.

Dates
First available in Project Euclid: 3 August 2010

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1280842134

Digital Object Identifier
doi:10.1214/10-AOAS329

Mathematical Reviews number (MathSciNet)
MR2758643

Zentralblatt MATH identifier
1194.62021

Keywords
Dynamic networks; Bayesian inference; counting processes; hurdle models

Citation

Heard, Nicholas A.; Weston, David J.; Platanioti, Kiriaki; Hand, David J. Bayesian anomaly detection methods for social networks. Ann. Appl. Stat. 4 (2010), no. 2, 645-662. doi:10.1214/10-AOAS329. https://projecteuclid.org/euclid.aoas/1280842134
