Statistical Science

Graphics Processing Units and High-Dimensional Optimization

Hua Zhou, Kenneth Lange, and Marc A. Suchard

Full-text: Open access

Abstract

This article discusses the potential of graphics processing units (GPUs) in high-dimensional optimization problems. A single GPU card with hundreds of arithmetic cores can be inserted in a personal computer and dramatically accelerates many statistical algorithms. To exploit these devices fully, optimization algorithms should reduce to multiple parallel tasks, each accessing a limited amount of data. These criteria favor EM and MM algorithms that separate parameters and data. To a lesser extent block relaxation and coordinate descent and ascent also qualify. We demonstrate the utility of GPUs in nonnegative matrix factorization, PET image reconstruction, and multidimensional scaling. Speedups of 100-fold can easily be attained. Over the next decade, GPUs will fundamentally alter the landscape of computational statistics. It is time for more statisticians to get on-board.

Article information

Source
Statist. Sci. Volume 25, Number 3 (2010), 311-324.

Dates
First available in Project Euclid: 4 January 2011

Permanent link to this document
https://projecteuclid.org/euclid.ss/1294167962

Digital Object Identifier
doi:10.1214/10-STS336

Mathematical Reviews number (MathSciNet)
MR2791670

Zentralblatt MATH identifier
1329.62028

Keywords
Block relaxation EM and MM algorithms multidimensional scaling nonnegative matrix factorization parallel computing PET scanning

Citation

Zhou, Hua; Lange, Kenneth; Suchard, Marc A. Graphics Processing Units and High-Dimensional Optimization. Statist. Sci. 25 (2010), no. 3, 311--324. doi:10.1214/10-STS336. https://projecteuclid.org/euclid.ss/1294167962.


Export citation

References

  • [1] Ayers, K. L. and Lange, K. L (2008). Penalized estimation of haplotype frequencies. Bioinformatics 24 1596–1602.
  • [2] Berry, M. W., Browne, M., Langville, A. N., Pauca, V. P. and Plemmons, R. J. (2007). Algorithms and applications for approximate nonnegative matrix factorization. Comput. Statist. Data Anal. 52 155–173.
  • [3] Buckner, J., Wilson, J., Seligman, M., Athey, B., Watson, S. and Meng, F. (2010). The gputools package enables GPU computing in R. Bioinformatics 26 134–135.
  • [4] de Leeuw, J. (1992). Fitting distances by least squares. Unpublished manuscript.
  • [5] de Leeuw, J. and Heiser, W. J. (1977). Convergence of correction matrix algorithms for multidimensional scaling. In Geometric Representations of Relational Data 133–145. Mathesis Press, Ann Arbor, MI.
  • [6] de Leeuw, J. (1977). Applications of convex analysis to multidimensional scaling. In Recent Developments in Statistics (Proc. European Meeting Statisticians, Grenoble, 1976) 133–145. North-Holland, Amsterdam.
  • [7] de Leeuw, J. (1994). Block relaxation algorithms in statistics. In Information Systems and Data Analysis. Springer, Berlin.
  • [8] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B 39 1–38.
  • [9] Diaconis, P., Goel, S. and Holmes, S. (2008). Horseshoes in multidimensional scaling and local kernel methods. Ann. Appl. Statist. 2 777–807.
  • [10] Groenen, P. J. F. and Heiser, W. J. (1996). The tunneling method for global optimization in multidimensional scaling. Pshychometrika 61 529–550.
  • [11] Jamshidian, M. and Jennrich, R. I. (1993). Conjugate gradient acceleration of the EM algorithm. J. Amer. Statist. Assoc. 88 221–228.
  • [12] Jamshidian, M. and Jennrich, R. I. (1997). Acceleration of the EM algorithm by using quasi-Newton methods. J. Roy. Statist. Soc. Ser. B 59 569–587.
  • [13] Koren, Y., Bell, R. and Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer 42 30–37.
  • [14] Lange, K. L. and Carson, R. (1984). EM reconstruction algorithms for emission and transmission tomography. J. Comput. Assist. Tomogr. 8 306–316.
  • [15] Lange, K. L. (1995). A quasi-Newton acceleration of the EM algorithm. Statist. Sinica 5 1–18.
  • [16] Lange, K. L. (2004). Optimization. Springer, New York.
  • [17] Lange, K. L., Hunter, D. R. and Yang, I. (2000). Optimization transfer using surrogate objective functions (with discussion). J. Comput. Graph. Statist. 9 1–59.
  • [18] Lee, A., Yan, C., Giles, M. B., Doucet, A. and Holmes, C. C. (2009). On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Technical report, Dept. Statistics, Oxford Univ.
  • [19] Lee, D. D. and Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature 401 788–791.
  • [20] Lee, D. D. and Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In NIPS 556–562. MIT Press, Cambridge, MA.
  • [21] Liu, C. and Rubin, D. B. (1994). The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika 81 633–648.
  • [22] McLachlan, G. J. and Krishnan, T. (2008). The EM Algorithm and Extensions, 2nd ed. Wiley, Hoboken, NJ.
  • [23] Meng, X. L. and Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80 267–278.
  • [24] Meng, X. L. and van Dyk, D. (1997). The EM algorithm—an old folk-song sung to a fast new tune (with discussion). J. Roy. Statist. Soc. Ser. B 59 511–567.
  • [25] MIT center for biological and computational learning. CBCL Face Database #1. Available at http://cbcl.mit.edu/.
  • [26] NVIDIA (2008). NVIDIA CUBLAS Library.
  • [27] NVIDIA (2008). NVIDIA CUDA Compute Unified Device Architecture: Programming Guide Version 2.0.
  • [28] Owens, J. D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A. E. and Purcell, T. J. (2007). A survey of general-purpose computation on graphics hardware. Computer Graphics Forum 26 80–113.
  • [29] Ranola, J. M., Ahn, S., Sehl, M. E., Smith, D. J. and Lange, K. L. (2010). A Poisson model for random multigraphs. Bioinformatics 26 2004–2011.
  • [30] Roland, C., Varadhan, R. and Frangakis, C. E. (2007). Squared polynomial extrapolation methods with cycling: An application to the positron emission tomography problem. Numer. Algorithms 44 159–172.
  • [31] Silberstein, M., Schuster, A., Geiger, D., Patney, A. and Owens, J. D. (2008). Efficient computation of sum-products on GPUs through software-managed cache. In Proceedings of the 22nd Annual International Conference on Supercomputing 309–318. ACM, New York.
  • [32] Sinnott-Armstrong, N. A., Greene, C. S., Cancare, F. and Moore, J. H. (2009). Accelerating epistasis analysis in human genetics with consumer graphics hardware. BMC Res. Notes 2 149.
  • [33] Suchard, M. A. and Rambaut, A. (2009). Many-core algorithms for statistical phylogenetics. Bioinformatics 25 1370–1376.
  • [34] Suchard, M. A., Holmes, C. and West, M. (2010). Some of the what?, why?, how?, who and where? of graphics processing unit computing for Bayesian analysis. ISBA Bull. 17 12–16.
  • [35] Suchard, M. A, Wang, Q., Chan, C., Frelinger, A. and West, M. (2010). Understanding GPU programming for statistical computation: Studies in massively parallel massive mixtures. J. Comput. Graph. Statist. 19 418–438.
  • [36] Tibbits, M. M., Haran, M. and Liechty, J. C. (2010). Parallel multivariate slice sampling. Statist. Comput. To appear.
  • [37] Ueda, N. and Nakano, R. (1998). Deterministic annealing EM algorithm. Neural Networks 11 271–282.
  • [38] Varadhan, R. and Roland, C. (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scand. J. Statist. 35 335–353.
  • [39] Vardi, Y., Shepp, L. A. and Kaufman, L. (1985). A statistical model for positron emission tomography (with discussion). J. Amer. Statist. Assoc. 80 8–37.
  • [40] Wu, T. T. and Lange, K. L. (2010). The MM alternative to EM. Statist. Sci. To appear.
  • [41] Zhou, H. and Lange K. L. (2009). On the bumpy road to the dominant mode. Scand. J. Statist. DOI: 10.1111/j.1467-9469.2009.00681.x.
  • [42] Zhou, H., Alexander, D. and Lange, K. L. (2009). A quasi-Newton acceleration for high-dimensional optimization algorithms. Statist. Comput. DOI: 10.1007/s11222-009-9166-3.
  • [43] Zhou, H. and Lange, K. L. (2010). MM algorithms for some discrete multivariate distributions. J. Comput. Graph. Statist. 19 645–665.