Bayesian Analysis

Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm

Quan Zhou and Yongtao Guan

Full-text: Open access

Abstract

Bayesian variable selection regression (BVSR) can jointly analyze genome-wide genetic datasets, but the slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread use. Here we present a novel iterative method to solve a special class of linear systems, which can speed up BVSR model-fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices, and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously, and its error is several orders of magnitude smaller; more importantly, its error always stays within the pre-specified precision, whereas that of the Gauss-Seidel method does not. For large problems with thousands of covariates, the complex factorization is 10–100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repeatedly solve large penalized regression systems whose design matrices change only slightly between adjacent MCMC steps; this slight change enables the adaptation of the iterative complex factorization method. The computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.
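
The abstract contrasts the proposed solver with two standard baselines: the Gauss-Seidel method and a direct solve via the Cholesky decomposition. As context, the following is a minimal NumPy sketch of those two baselines for the penalized system $(X'X + \lambda I)\beta = X'y$; it illustrates the problem setting only and is not the authors' iCF algorithm or their fastBVSR implementation. (For intuition about where a complex factorization of a sum of two matrices can arise, note that for symmetric positive definite $A$ one has $A + \lambda I = (A^{1/2} + i\sqrt{\lambda}I)(A^{1/2} - i\sqrt{\lambda}I)$, since $A^{1/2}$ commutes with the identity; this identity is an illustration and need not be the factorization used in the paper.)

```python
import numpy as np

def cholesky_ridge(X, y, lam):
    """Direct solve of (X'X + lam*I) beta = X'y via Cholesky, O(p^3)."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    L = np.linalg.cholesky(A)
    return np.linalg.solve(L.T, np.linalg.solve(L, X.T @ y))

def gauss_seidel_ridge(X, y, lam, beta0=None, tol=1e-10, max_iter=10_000):
    """Gauss-Seidel sweeps for the same system.

    A warm start (beta0) mimics the BVSR setting, where the design
    matrix changes only slightly between adjacent MCMC steps.
    """
    A = X.T @ X + lam * np.eye(X.shape[1])
    b = X.T @ y
    beta = np.zeros(len(b)) if beta0 is None else beta0.copy()
    for _ in range(max_iter):
        beta_old = beta.copy()
        for j in range(len(b)):
            # Coordinate update using the freshest values of the others:
            # beta_j <- (b_j - sum_{k != j} A_jk beta_k) / A_jj
            beta[j] += (b[j] - A[j] @ beta) / A[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))
y = rng.standard_normal(500)
b_direct = cholesky_ridge(X, y, lam=1.0)
b_gs = gauss_seidel_ridge(X, y, lam=1.0)
print(np.max(np.abs(b_direct - b_gs)))  # agreement up to the iteration tolerance
```

Since the coefficient matrix is symmetric positive definite, Gauss-Seidel is guaranteed to converge here; the paper's point is that its error, unlike the iCF's, is not guaranteed to land within the pre-specified precision.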

Article information

Source
Bayesian Anal., Volume 14, Number 2 (2019), 573–594.

Dates
First available in Project Euclid: 29 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.ba/1535508388

Digital Object Identifier
doi:10.1214/18-BA1120

Mathematical Reviews number (MathSciNet)
MR3934098

Zentralblatt MATH identifier
07045443

Keywords
Cholesky decomposition; exchange algorithm; fastBVSR; Gauss-Seidel method; heritability

Rights
Creative Commons Attribution 4.0 International License.

Citation

Zhou, Quan; Guan, Yongtao. Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm. Bayesian Anal. 14 (2019), no. 2, 573–594. doi:10.1214/18-BA1120. https://projecteuclid.org/euclid.ba/1535508388



References

  • Andrieu, C. and Roberts, G. O. (2009). “The pseudo-marginal approach for efficient Monte Carlo computations.” The Annals of Statistics, 37(2): 697–725.
  • Avron, H., Maymounkov, P., and Toledo, S. (2010). “Blendenpik: Supercharging LAPACK’s least-squares solver.” SIAM Journal on Scientific Computing, 32(3): 1217–1236.
  • Berger, J. O., Pericchi, L. R., Ghosh, J., Samanta, T., De Santis, F., Berger, J., and Pericchi, L. (2001). “Objective Bayesian methods for model selection: introduction and comparison.” Lecture Notes-Monograph Series, 135–207.
  • Broman, K. W. and Speed, T. P. (2002). “A model selection approach for the identification of quantitative trait loci in experimental crosses.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4): 641–656.
  • Carbonaro, F., Andrew, T., Mackey, D., Spector, T., and Hammond, C. (2008). “Heritability of intraocular pressure: a classical twin study.” British Journal of Ophthalmology, 92(8): 1125–1128.
  • Carbonetto, P. and Stephens, M. (2012). “Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies.” Bayesian Analysis, 7(1): 73–108.
  • Chang, T. C., Congdon, N. G., Wojciechowski, R., Muñoz, B., Gilbert, D., Chen, P., Friedman, D. S., and West, S. K. (2005). “Determinants and heritability of intraocular pressure and cup-to-disc ratio in a defined older population.” Ophthalmology, 112(7): 1186–1191.
  • Draper, N. R. and Van Nostrand, R. C. (1979). “Ridge regression and James-Stein estimation: review and comments.” Technometrics, 21(4): 451–466.
  • Eldén, L. (1977). “Algorithms for the regularization of ill-conditioned least squares problems.” BIT Numerical Mathematics, 17(2): 134–145.
  • Golub, G. H. and Van Loan, C. F. (2012). Matrix computations, 3rd edition. JHU Press.
  • Gough, B. (2009). GNU scientific library reference manual. Network Theory Ltd.
  • Guan, Y. and Krone, S. M. (2007). “Small-world MCMC and Convergence to Multi-modal Distributions: From Slow Mixing to Fast Mixing.” Annals of Applied Probability, 17: 284–304.
  • Guan, Y. and Stephens, M. (2011). “Bayesian variable selection regression for genome-wide association studies and other large-scale problems.” The Annals of Applied Statistics, 5(3): 1780–1815.
  • Hawkins, D. M. and Yin, X. (2002). “A faster algorithm for ridge regression of reduced rank data.” Computational Statistics & Data Analysis, 40(2): 253–262.
  • Huang, X., Wang, J., and Liang, F. (2016). “A Variational Algorithm for Bayesian Variable Selection.” arXiv preprint arXiv:1602.07640.
  • Ibrahim, J. G. and Laud, P. W. (1991). “On Bayesian analysis of generalized linear models using Jeffreys’s prior.” Journal of the American Statistical Association, 86(416): 981–986.
  • Kohno, T., Kotakemori, H., Niki, H., and Usui, M. (1997). “Improving the modified Gauss-Seidel method for Z-matrices.” Linear Algebra and its Applications, 267: 113–123.
  • Lawson, C. L. and Hanson, R. J. (1995). Solving least squares problems. SIAM.
  • Lee, S. H., Wray, N. R., Goddard, M. E., and Visscher, P. M. (2011). “Estimating missing heritability for disease from genome-wide association studies.” The American Journal of Human Genetics, 88(3): 294–305.
  • Liang, F., Paulo, R., Molina, G., Clyde, M. A., and Berger, J. O. (2008). “Mixtures of g priors for Bayesian variable selection.” Journal of the American Statistical Association, 103(481): 410–423.
  • Lu, Y., Dhillon, P., Foster, D. P., and Ungar, L. (2013). “Faster ridge regression via the subsampled randomized hadamard transform.” In Advances in neural information processing systems, 369–377.
  • Meng, X., Saunders, M. A., and Mahoney, M. W. (2014). “LSRN: A parallel iterative solver for strongly over-or underdetermined systems.” SIAM Journal on Scientific Computing, 36(2): C95–C118.
  • Møller, J., Pettitt, A. N., Reeves, R., and Berthelsen, K. K. (2006). “An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants.” Biometrika, 93(2): 451–458.
  • Murray, I., Ghahramani, Z., and MacKay, D. (2012). “MCMC for doubly-intractable distributions.” arXiv preprint arXiv:1206.6848.
  • Niki, H., Harada, K., Morimoto, M., and Sakakihara, M. (2004). “The survey of preconditioners used for accelerating the rate of convergence in the Gauss–Seidel method.” Journal of Computational and Applied Mathematics, 164: 587–600.
  • O’Hagan, A. and Forster, J. J. (2004). Kendall’s advanced theory of statistics, volume 2B: Bayesian inference. Arnold.
  • Raftery, A. E., Madigan, D., and Hoeting, J. A. (1997). “Bayesian model averaging for linear regression models.” Journal of the American Statistical Association, 92(437): 179–191.
  • Rokhlin, V. and Tygert, M. (2008). “A fast randomized algorithm for overdetermined linear least-squares regression.” Proceedings of the National Academy of Sciences, 105(36): 13212–13217.
  • Saunders, C., Gammerman, A., and Vovk, V. (1998). “Ridge regression learning algorithm in dual variables.” In Proceedings of the 15th International Conference on Machine Learning (ICML).
  • Trefethen, L. N. and Bau III, D. (1997). Numerical linear algebra, volume 50. SIAM.
  • Turlach, B. A. (2006). “An even faster algorithm for ridge regression of reduced rank data.” Computational Statistics & Data Analysis, 50(3): 642–658.
  • van Koolwijk, L. M., Despriet, D. D., van Duijn, C. M., Cortes, L. M. P., Vingerling, J. R., Aulchenko, Y. S., Oostra, B. A., Klaver, C. C., and Lemij, H. G. (2007). “Genetic contributions to glaucoma: heritability of intraocular pressure, retinal nerve fiber layer thickness, and optic disc morphology.” Investigative Ophthalmology & Visual Science, 48(8): 3669–3676.
  • Weinreb, R., Aung, T., and Medeiros, F. (2014). “The pathophysiology and treatment of glaucoma: A review.” JAMA, 311(18): 1901–1911. URL: http://dx.doi.org/10.1001/jama.2014.3192
  • Xu, H. and Guan, Y. (2014). “Detecting Local Haplotype Sharing and Haplotype Association.” Genetics, 197(3): 823–838.
  • Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., Madden, P. A., Heath, A. C., Martin, N. G., Montgomery, G. W., et al. (2010). “Common SNPs explain a large proportion of the heritability for human height.” Nature Genetics, 42(7): 565–569.
  • Yang, J., Lee, S. H., Goddard, M. E., and Visscher, P. M. (2011). “GCTA: a tool for genome-wide complex trait analysis.” The American Journal of Human Genetics, 88(1): 76–82.
  • Yang, S. and Gobbert, M. K. (2007). “The optimal relaxation parameter for the SOR method applied to a classical model problem.” Technical Report TR2007-6, Department of Mathematics and Statistics, University of Maryland, Baltimore County.
  • Youla, D. C. (1961). “A normal form for a matrix under the unitary congruence group.” Canadian Journal of Mathematics, 13: 694–704.
  • Young, D. (1954). “Iterative methods for solving partial difference equations of elliptic type.” Transactions of the American Mathematical Society, 76(1): 92–111.
  • Zhou, Q. and Guan, Y. (2017). “On the Null Distribution of Bayes Factors in Linear Regression.” Journal of the American Statistical Association, to appear.
  • Zhou, Q. and Guan, Y. (2018). “Fast model-fitting of Bayesian variable selection regression using the iterative complex factorization algorithm (Supplementary).” Bayesian Analysis.
  • Zhou, X., Carbonetto, P., and Stephens, M. (2013). “Polygenic modeling with Bayesian sparse linear mixed models.” PLoS Genetics, 9(2): e1003264.

Supplemental materials

  • Fast model-fitting of Bayesian variable selection regression using the iterative complex factorization algorithm (Supplementary). The online appendix includes additional numerical studies (S1–S3), a summary of the MCMC algorithm (S4), examples for updating the Cholesky decomposition (S5), and a proof of the stationary distribution of the exchange algorithm (S6). S1 compares the performance of the iterative methods using simulated datasets; S2 studies the behavior of the convergence rates of the iterative methods; S3 provides numerical evidence for the assumption $\eta_{\min}^{2} \approx 0$ used in Section 2.3.
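
S5 concerns updating the Cholesky decomposition when the model gains or loses a covariate, so that the coefficient matrix gains or loses one row and column. As a rough illustration of the append case (a sketch under these assumptions, not the supplement's own examples; the function name is hypothetical), the factor can be extended in $O(p^{2})$ time instead of the $O(p^{3})$ cost of refactorizing from scratch:

```python
import numpy as np

def chol_append(L, new_col):
    """Given the lower Cholesky factor L of a p x p SPD matrix A, return
    the factor of the (p+1) x (p+1) matrix obtained by bordering A with
    new_col = [a; c] (cross terms a, new diagonal entry c).

    Uses the identity [[L, 0], [l', d]] [[L', l], [0, d]] = [[A, a], [a', c]]
    with L l = a and d = sqrt(c - l'l).
    """
    a, c = new_col[:-1], new_col[-1]
    l = np.linalg.solve(L, a)   # triangular solve, O(p^2)
    d = np.sqrt(c - l @ l)      # real and positive while the bordered matrix is SPD
    p = L.shape[0]
    L_new = np.zeros((p + 1, p + 1))
    L_new[:p, :p] = L
    L_new[p, :p] = l
    L_new[p, p] = d
    return L_new

# Adding covariate x_new to the model borders A = X'X + lam*I with
# new_col = [X' x_new; x_new' x_new + lam].
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
x_new = rng.standard_normal(200)
lam = 1.0
L = np.linalg.cholesky(X.T @ X + lam * np.eye(20))
L_new = chol_append(L, np.append(X.T @ x_new, x_new @ x_new + lam))
X_aug = np.hstack([X, x_new[:, None]])
# Matches a from-scratch factorization of the enlarged matrix:
print(np.max(np.abs(L_new - np.linalg.cholesky(X_aug.T @ X_aug + lam * np.eye(21)))))
```

Deleting a covariate is the cheaper direction of the same bookkeeping: the rows and columns of the factor after the removed index can be restored to triangular form with a sequence of plane rotations, again in $O(p^{2})$ time.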