The Annals of Applied Statistics

Estimating time-varying networks

Mladen Kolar, Le Song, Amr Ahmed, and Eric P. Xing

Full-text: Open access

Abstract

Stochastic networks are a plausible representation of the relational information among entities in dynamic systems such as living cells or social communities. While there is a rich literature on estimating a static or temporally invariant network from observational data, little has been done toward estimating time-varying networks from time series of entity attributes. In this paper we present two new machine learning methods for estimating time-varying networks. Both build on a temporally smoothed ℓ1-regularized logistic regression formalism that can be cast as a standard convex optimization problem and solved efficiently using generic solvers that scale to large networks. We report promising results on recovering simulated time-varying networks. For real data sets, we reverse engineer the latent sequence of temporally rewiring political networks among Senators from US Senate voting records, and the latent evolving regulatory networks underlying 588 genes across the life cycle of Drosophila melanogaster from a microarray time course.
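The abstract does not spell out the formulation, so the following is only a minimal sketch of what a temporally smoothed ℓ1-regularized logistic regression could look like for a single node. It assumes binary node states coded as -1/+1, a Gaussian kernel that down-weights observations far from the target time, and a plain proximal-gradient loop in place of the generic convex solvers the authors mention. All function names, parameters, and defaults here are hypothetical illustrations, not the authors' code.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def kernel_weights(times, t0, bandwidth):
    # Assumed Gaussian kernel centred at the target time t0, normalised to sum to 1.
    w = np.exp(-0.5 * ((times - t0) / bandwidth) ** 2)
    return w / w.sum()

def estimate_neighborhood(X, times, node, t0, bandwidth=0.1, lam=0.05, n_iter=500):
    """Sparse logistic-regression coefficients linking `node` to the other nodes at time t0.

    X     : (n, p) array with entries in {-1, +1}, one row per time point
    times : (n,) array of observation times
    Returns a (p - 1,) vector; nonzero entries suggest edges at time t0.
    """
    n, p = X.shape
    y = X[:, node]                   # response: state of the chosen node
    Z = np.delete(X, node, axis=1)   # predictors: states of all other nodes
    w = kernel_weights(times, t0, bandwidth)
    # Upper bound on the Lipschitz constant of the weighted logistic-loss gradient.
    lipschitz = 0.25 * np.sum(w * np.sum(Z ** 2, axis=1))
    step = 1.0 / lipschitz
    theta = np.zeros(p - 1)
    for _ in range(n_iter):
        margin = y * (Z @ theta)
        s = 0.5 * (1.0 - np.tanh(0.5 * margin))   # sigmoid(-margin), numerically stable
        grad = -(Z.T @ (w * y * s))               # gradient of the weighted logistic loss
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta
```

Under these assumptions, calling `estimate_neighborhood` once per node and per target time, then combining the nonzero coefficients across nodes, would give one estimated graph per time point. The bandwidth controls how quickly the estimated network is allowed to change, and `lam` controls its sparsity.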

Article information

Source
Ann. Appl. Stat. Volume 4, Number 1 (2010), 94-123.

Dates
First available in Project Euclid: 11 May 2010

Permanent link to this document
http://projecteuclid.org/euclid.aoas/1273584449

Digital Object Identifier
doi:10.1214/09-AOAS308

Zentralblatt MATH identifier
1189.62142

Mathematical Reviews number (MathSciNet)
MR2758086

Citation

Kolar, Mladen; Song, Le; Ahmed, Amr; Xing, Eric P. Estimating time-varying networks. The Annals of Applied Statistics 4 (2010), no. 1, 94–123. doi:10.1214/09-AOAS308. http://projecteuclid.org/euclid.aoas/1273584449.


