Electronic Journal of Statistics

Sparse covariance estimation in heterogeneous samples

Abel Rodríguez, Alex Lenkoski, and Adrian Dobra

Full-text: Open access

Abstract

Standard Gaussian graphical models implicitly assume that the conditional independence structure among variables is common to all observations in the sample. In practice, however, observations are usually collected from heterogeneous populations where this assumption fails, leading in turn to nonlinear relationships among variables. To address such situations we explore mixtures of Gaussian graphical models; in particular, we consider both infinite mixtures and infinite hidden Markov models in which the emission distributions correspond to Gaussian graphical models. Such models allow us to divide a heterogeneous population into homogeneous groups, with each cluster having its own conditional independence structure. As an illustration, we study the trends in foreign exchange rate fluctuations in the pre-Euro era.
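
The modeling idea in the abstract can be sketched generatively: draw mixture weights from a (truncated) stick-breaking construction of the Dirichlet process, give each cluster its own sparse precision matrix, whose zero pattern encodes that cluster's conditional independence graph, and sample each observation from the Gaussian component it is assigned to. A minimal sketch, assuming a two-component truncation and hand-picked illustrative precision matrices rather than the paper's G-Wishart machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, K, rng):
    """Truncated stick-breaking weights of a Dirichlet process DP(alpha)."""
    v = rng.beta(1.0, alpha, size=K)
    remaining = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    w = v * remaining
    return w / w.sum()  # renormalize the truncated stick

# Illustrative per-cluster precision matrices: off-diagonal zeros encode
# conditional independence, so each cluster has a different graph.
K1 = np.array([[2.0, 0.8, 0.0],
               [0.8, 2.0, 0.8],
               [0.0, 0.8, 2.0]])  # chain graph X1 - X2 - X3
K2 = np.array([[2.0, 0.0, 0.9],
               [0.0, 2.0, 0.0],
               [0.9, 0.0, 2.0]])  # single edge X1 - X3
precisions = [K1, K2]

n = 200
w = stick_breaking(alpha=1.0, K=len(precisions), rng=rng)
z = rng.choice(len(precisions), size=n, p=w)  # latent cluster labels
X = np.stack([rng.multivariate_normal(np.zeros(3),
                                      np.linalg.inv(precisions[k]))
              for k in z])

print("mixture weights:", np.round(w, 3))
print("cluster sizes:", np.bincount(z, minlength=len(precisions)))
print("data shape:", X.shape)
```

In the paper itself the cluster-specific precision matrices carry G-Wishart priors and the number of occupied clusters is inferred from the data; the fixed truncation and hand-chosen graphs above are purely for illustration of the generative structure.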

Article information

Source
Electron. J. Statist., Volume 5 (2011), 981–1014.

Dates
First available in Project Euclid: 15 September 2011

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1316092866

Digital Object Identifier
doi:10.1214/11-EJS634

Mathematical Reviews number (MathSciNet)
MR2836767

Zentralblatt MATH identifier
1274.62207

Subjects
Primary: 62F15: Bayesian inference 62H25: Factor analysis and principal components; correspondence analysis
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20] 62M10: Time series, auto-correlation, regression, etc. [See also 91B84]

Keywords
Covariance selection; Gaussian graphical model; mixture model; Dirichlet process; hidden Markov model; nonparametric Bayes inference

Citation

Rodríguez, Abel; Lenkoski, Alex; Dobra, Adrian. Sparse covariance estimation in heterogeneous samples. Electron. J. Statist. 5 (2011), 981–1014. doi:10.1214/11-EJS634. https://projecteuclid.org/euclid.ejs/1316092866


References

  • [1] Antoniak, C. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics 2, 1152–1174.
  • [2] Armstrong, H., Carter, C. K., Wong, K. F. & Kohn, R. (2009). Bayesian covariance matrix estimation using a mixture of decomposable graphical models. Statistics and Computing 19, 303–316.
  • [3] Atay-Kayis, A. & Massam, H. (2005). A Monte Carlo method for computing the marginal likelihood in nondecomposable Gaussian graphical models. Biometrika 92, 317–335.
  • [4] Beal, M. J., Ghahramani, Z. & Rasmussen, C. E. (2001). The infinite hidden Markov model. In Proceedings of the Fourteenth Annual Conference on Neural Information Processing Systems.
  • [5] Bedford, T. & Cooke, R. M. (2002). Vines: a new graphical model for dependent random variables. Annals of Statistics 30, 1031–1068.
  • [6] Berger, J. O. & Molina, G. (2005). Posterior model probabilities via path-based pairwise priors. Statistica Neerlandica 59, 3–15.
  • [7] Blackwell, D. & MacQueen, J. B. (1973). Ferguson distributions via Pólya urn schemes. The Annals of Statistics 1, 353–355.
  • [8] Cappé, O., Moulines, E. & Rydén, T. (2005). Inference in Hidden Markov Models. Springer.
  • [9] Carvalho, C. M., Massam, H. & West, M. (2007). Simulation of hyper-inverse Wishart distributions in graphical models. Biometrika 94, 647–659.
  • [10] Carvalho, C. M. & West, M. (2007). Dynamic matrix-variate graphical models. Bayesian Analysis 2, 69–98.
  • [11] Castelo, R. & Roverato, A. (2006). A robust procedure for Gaussian graphical model search from microarray data with p larger than n. Journal of Machine Learning Research 7, 2621–2650.
  • [12] Dawid, A. P. & Lauritzen, S. L. (1993). Hyper Markov laws in the statistical analysis of decomposable graphical models. Annals of Statistics 21, 1272–1317.
  • [13] Dempster, A. P. (1972). Covariance selection. Biometrics 28, 157–175.
  • [14] Diaconis, P. & Ylvisaker, D. (1979). Conjugate priors for exponential families. Annals of Statistics 7, 269–281.
  • [15] Dobra, A., Eicher, T. & Lenkoski, A. (2010). Modeling uncertainty in macroeconomic growth determinants using Gaussian graphical models. Statistical Methodology 7, 292–306.
  • [16] Dobra, A., Hans, C., Jones, B., Nevins, J. R., Yao, G. & West, M. (2004). Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis 90, 196–212.
  • [17] Dobra, A., Lenkoski, A. & Rodríguez, A. (2011). Bayesian inference for general Gaussian graphical models with application to multivariate lattice data. Journal of the American Statistical Association, to appear.
  • [18] Escobar, M. D. & West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association 90, 577–588.
  • [19] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Annals of Statistics 1, 209–230.
  • [20] Ferguson, T. S. (1974). Prior distributions on spaces of probability measures. Annals of Statistics 2, 615–629.
  • [21] Fraley, C. & Raftery, A. E. (2007). Bayesian regularization for normal mixture estimation and model-based clustering. Journal of Classification 24, 155–181.
  • [22] Friedman, N. (2004). Inferring cellular networks using probabilistic graphical models. Science 303, 799–805.
  • [23] van Gael, J., Saatci, Y., Teh, Y. W. & Ghahramani, Z. (2008). Beam sampling for the infinite hidden Markov model. In Proceedings of the 25th International Conference on Machine Learning (ICML).
  • [24] Green, P. & Richardson, S. (2001). Modelling heterogeneity with and without the Dirichlet process. Scandinavian Journal of Statistics 28, 355–375.
  • [25] Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732.
  • [26] Guo, J., Levina, E., Michailidis, G. & Zhu, J. (2011). Joint estimation of multiple graphical models. Biometrika 98, 1–15.
  • [27] Heinz, D. (2009). Building hyper Dirichlet processes for graphical models. Electronic Journal of Statistics 3, 290–315.
  • [28] Ishwaran, H. & James, L. F. (2001). Gibbs sampling methods for stick-breaking priors. Journal of the American Statistical Association 96, 161–173.
  • [29] Ishwaran, H. & Zarepour, M. (2002). Dirichlet prior sieves in finite normal mixtures. Statistica Sinica 12, 941–963.
  • [30] Jain, S. & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics 13, 158–182.
  • [31] Jain, S. & Neal, R. M. (2007). Splitting and merging components of a nonconjugate Dirichlet process mixture model. Bayesian Analysis 2, 445–472.
  • [32] Jones, B., Carvalho, C., Dobra, A., Hans, C., Carter, C. & West, M. (2005). Experiments in stochastic computation for high-dimensional graphical models. Statistical Science 20, 388–400.
  • [33] Lau, J. W. & Green, P. (2007). Bayesian model based clustering procedures. Journal of Computational and Graphical Statistics 16, 526–558.
  • [34] Lauritzen, S. L. (1996). Graphical Models. Oxford University Press.
  • [35] Lee, J., Müller, P., Trippa, L. & Quintana, F. A. (2009). Defining predictive probability functions for species sampling models. Technical report, Pontificia Universidad Católica de Chile.
  • [36] Lenkoski, A. & Dobra, A. (2011). Computational aspects related to inference in Gaussian graphical models with the G-Wishart prior. Journal of Computational and Graphical Statistics 20, 140–157.
  • [37] Letac, G. & Massam, H. (2007). Wishart distributions for decomposable graphs. Annals of Statistics 35, 1278–1323.
  • [38] Liu, J. S., Liang, F. & Wong, W. H. (2000). The use of multiple-try method and local optimization in Metropolis sampling. Journal of the American Statistical Association 95, 121–134.
  • [39] Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates: I. Density estimates. Annals of Statistics 12, 351–357.
  • [40] Muirhead, R. J. (2005). Aspects of Multivariate Statistical Theory. John Wiley & Sons.
  • [41] Müller, P., Erkanli, A. & West, M. (1996). Bayesian curve fitting using multivariate normal mixtures. Biometrika 83, 67–79.
  • [42] Müller, P., Quintana, F. & Rosner, G. (2004). Hierarchical meta-analysis over related non-parametric Bayesian models. Journal of the Royal Statistical Society, Series B 66, 735–749.
  • [43] Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9, 249–265.
  • [44] Ongaro, A. & Cattaneo, C. (2004). Discrete random probability measures: a general framework for nonparametric Bayesian inference. Statistics and Probability Letters 67, 33–45.
  • [45] Pitman, J. (1995). Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields 102, 145–158.
  • [46] Quintana, F. & Iglesias, P. L. (2003). Bayesian clustering and product partition models. Journal of the Royal Statistical Society, Series B 65, 557–574.
  • [47] Roberts, G. & Papaspiliopoulos, O. (2008). Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika 95, 169–186.
  • [48] Rodríguez, A., Dunson, D. B. & Gelfand, A. E. (2009). Bayesian nonparametric functional data analysis through density estimation. Biometrika 96, 149–162.
  • [49] Rodríguez, A. & Vuppala, R. (2009). Probabilistic classification using Bayesian nonparametric mixture models. Technical report, University of California, Santa Cruz.
  • [50] Roverato, A. (2002). Hyper inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scandinavian Journal of Statistics 29, 391–411.
  • [51] Scott, J. G. & Carvalho, C. M. (2008). Feature-inclusion stochastic search for Gaussian graphical models. Journal of Computational and Graphical Statistics 17, 790–808.
  • [52] Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica 4, 639–650.
  • [53] Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B 62, 795–809.
  • [54] Teh, Y. W., Jordan, M. I., Beal, M. J. & Blei, D. M. (2006). Sharing clusters among related groups: Hierarchical Dirichlet processes. Journal of the American Statistical Association 101, 1566–1581.
  • [55] Thiesson, B., Meek, C., Chickering, D. M. & Heckerman, D. (1997). Learning mixtures of DAG models. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 504–513. Morgan Kaufmann, Inc.
  • [56] Wainwright, M. J., Ravikumar, P. & Lafferty, J. D. (2006). High-dimensional graphical model selection using ℓ1-regularized logistic regression. In Neural Information Processing Systems. MIT Press.
  • [57] Walker, S. G. (2007). Sampling the Dirichlet mixture model with slices. Communications in Statistics - Simulation and Computation 36, 45–54.
  • [58] Wang, H. & Carvalho, C. M. (2010). Simulation of hyper-inverse Wishart distributions for non-decomposable graphs. Electronic Journal of Statistics 4, 1470–1475.
  • [59] Wang, H., Reeson, C. & Carvalho, C. M. (2011). Dynamic financial index models: Modeling conditional dependences via graphs. Bayesian Analysis, to appear.
  • [60] Wang, H. & West, M. (2009). Bayesian analysis of matrix normal graphical models. Biometrika 96, 821–834.
  • [61] West, M., Blanchette, H., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson, J. A., Marks, J. R. & Nevins, J. R. (2001). Predicting the clinical status of human breast cancer by using gene expression profiles. Proceedings of the National Academy of Sciences 98, 11462–11467.
  • [62] West, M. & Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. Springer-Verlag, New York, 2nd edition.