Electronic Journal of Statistics

Estimating networks with jumps

Mladen Kolar and Eric P. Xing

Full-text: Open access

Abstract

We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying data collected over a period of time, such as the social states of interacting individuals or microarray expression profiles of gene networks, as opposed to the i.i.d. data from a time-invariant model widely assumed in the current literature on structural estimation. In particular, we consider the scenario in which the model evolves in a piecewise constant fashion. We propose a procedure that estimates the structure of a graphical model by minimizing a temporally smoothed L1-penalized regression, which jointly estimates the partition boundaries of the VCVS model and the coefficients of the sparse precision matrix on each block of the partition. A highly scalable proximal gradient method is proposed to solve the resulting convex optimization problem, and conditions for sparsistent estimation, together with convergence rates for both the partition boundaries and the network structure, are established for the first time for such estimators.
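To make the estimator concrete, below is a minimal sketch (not the authors' code) of a VCVS-style node-wise regression: lasso regressions at each time point are coupled by a total-variation penalty on the coefficients, so the estimated coefficient paths are piecewise constant in time and their jumps mark the partition boundaries. As a simplifying assumption, the temporal-difference term is Huber-smoothed so that a plain ISTA step with soft-thresholding suffices; the paper instead develops a scalable proximal gradient method that treats the fused penalty exactly. All names here (vcvs_node_regression, lam1, lam2, mu) are illustrative.

```python
# Sketch of a VCVS node-wise regression: lasso at each time point,
# tied across time by a total-variation (jump) penalty on the
# coefficients. NOT the authors' implementation: the temporal
# penalty is Huber-smoothed so only the L1 prox is needed (plain ISTA).

import numpy as np

def soft_threshold(B, tau):
    """Proximal operator of tau * ||B||_1 (elementwise)."""
    return np.sign(B) * np.maximum(np.abs(B) - tau, 0.0)

def huber_grad(Z, mu):
    """Gradient of the Huber smoothing of ||Z||_1 with parameter mu."""
    return np.clip(Z / mu, -1.0, 1.0)

def vcvs_node_regression(X, y, lam1, lam2, mu=1e-3, n_iter=500):
    """
    X : (T, n, p) design blocks, y : (T, n) responses for one node.
    Returns B : (T, p) coefficient paths, approximately piecewise
    constant in t when lam2 is large enough.

    Smoothed objective:
        sum_t 0.5*||y_t - X_t b_t||^2
        + lam2 * sum_t huber_mu(b_t - b_{t-1})   (smoothed jump penalty)
        + lam1 * ||B||_1                          (handled by the prox)
    """
    T, n, p = X.shape
    B = np.zeros((T, p))
    # Step size from a crude Lipschitz bound: data term + smoothed TV
    # term (the difference operator satisfies ||D^T D|| <= 4).
    L = max(np.linalg.norm(X[t].T @ X[t], 2) for t in range(T)) + 4.0 * lam2 / mu
    for _ in range(n_iter):
        # Gradient of the smooth part: squared loss plus smoothed TV.
        G = np.stack([X[t].T @ (X[t] @ B[t] - y[t]) for t in range(T)])
        D = huber_grad(np.diff(B, axis=0), mu)   # (T-1, p) difference terms
        G[1:] += lam2 * D
        G[:-1] -= lam2 * D
        # ISTA step: gradient descent, then soft-thresholding.
        B = soft_threshold(B - G / L, lam1 / L)
    return B
```

Running this once per node and reading off the nonzero pattern of B on each constant block gives a neighborhood-selection-style estimate of the network on that block, while the time points at which B jumps estimate the partition boundaries.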

Article information

Source
Electron. J. Statist., Volume 6 (2012), 2069-2106.

Dates
First available in Project Euclid: 2 November 2012

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1351865118

Digital Object Identifier
doi:10.1214/12-EJS739

Mathematical Reviews number (MathSciNet)
MR3020257

Zentralblatt MATH identifier
1295.62032

Subjects
Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties

Keywords
Gaussian graphical models; network models; dynamic network models; structural changes

Citation

Kolar, Mladen; Xing, Eric P. Estimating networks with jumps. Electron. J. Statist. 6 (2012), 2069–2106. doi:10.1214/12-EJS739. https://projecteuclid.org/euclid.ejs/1351865118

