The Annals of Statistics

On the computational complexity of MCMC-based estimators in large samples

Alexandre Belloni and Victor Chernozhukov

Full-text: Open access

Abstract

In this paper we examine the implications of statistical large-sample theory for the computational complexity of Bayesian and quasi-Bayesian estimation carried out using Metropolis random walks. Our analysis is motivated by the Laplace–Bernstein–von Mises central limit theorem, which states that in large samples the posterior or quasi-posterior approaches a normal density. Using the conditions required for the central limit theorem to hold, we establish polynomial bounds on the computational complexity of general Metropolis random walk methods in large samples. Our analysis covers cases where the underlying log-likelihood or extremum criterion function is possibly nonconcave or discontinuous and where the parameter dimension increases with the sample size. However, the central limit theorem restricts the deviations from continuity and log-concavity of the log-likelihood or extremum criterion function in a very specific manner.
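
In its classical fixed-dimension form (the notation below is the standard one and is not taken from the paper, which allows the dimension to grow), the theorem states that the total variation distance between the posterior and a normal density centered at an efficient estimator, with covariance equal to the inverse Fisher information scaled by the sample size, vanishes in probability:

$$\big\| \pi_n(\cdot \mid X_1,\ldots,X_n) - N\big(\hat\theta_n,\ \tfrac{1}{n} I(\theta_0)^{-1}\big) \big\|_{TV} \;\xrightarrow{\ P\ }\; 0.$$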

Under minimal assumptions required for the central limit theorem to hold under increasing parameter dimension, we show that the Metropolis algorithm is theoretically efficient even for the canonical Gaussian walk, which is studied in detail. Specifically, we show that the running time of the algorithm in large samples is bounded in probability by a polynomial in the parameter dimension d and, in particular, is of stochastic order d² in the leading cases after the burn-in period. We then give applications to exponential families, curved exponential families, and Z-estimation of increasing dimension.
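
For concreteness, the following is a minimal sketch in Python (not taken from the paper) of the canonical Gaussian random-walk Metropolis chain the abstract refers to; the toy target density, the 1/sqrt(d) step-size scaling, and the chain length are illustrative choices only.

import numpy as np

def gaussian_random_walk_metropolis(log_target, theta0, step, n_steps, rng=None):
    # Sample from the density proportional to exp(log_target) using
    # Gaussian random-walk proposals theta' ~ N(theta, step^2 * I).
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    chain = np.empty((n_steps, d))
    log_p = log_target(theta)
    for t in range(n_steps):
        proposal = theta + step * rng.standard_normal(d)
        log_p_prop = log_target(proposal)
        # Metropolis acceptance: accept with probability
        # min(1, target(proposal) / target(theta)).
        if np.log(rng.uniform()) < log_p_prop - log_p:
            theta, log_p = proposal, log_p_prop
        chain[t] = theta
    return chain

# Toy usage: a standard normal "quasi-posterior" in d = 10 dimensions, with a
# step size proportional to 1/sqrt(d), as suggested by the optimal-scaling
# literature for random-walk Metropolis.
d = 10
log_target = lambda th: -0.5 * np.sum(th ** 2)
draws = gaussian_random_walk_metropolis(log_target, np.zeros(d), 2.4 / np.sqrt(d), 5000)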

Article information

Source
Ann. Statist., Volume 37, Number 4 (2009), 2011-2055.

Dates
First available in Project Euclid: 18 June 2009

Permanent link to this document
https://projecteuclid.org/euclid.aos/1245332839

Digital Object Identifier
doi:10.1214/08-AOS634

Mathematical Reviews number (MathSciNet)
MR2533478

Zentralblatt MATH identifier
1175.65015

Subjects
Primary: 65C05: Monte Carlo methods
Secondary: 65C60: Computational problems in statistics

Keywords
Markov chain Monte Carlo; computational complexity; Bayesian; increasing dimension

Citation

Belloni, Alexandre; Chernozhukov, Victor. On the computational complexity of MCMC-based estimators in large samples. Ann. Statist. 37 (2009), no. 4, 2011–2055. doi:10.1214/08-AOS634. https://projecteuclid.org/euclid.aos/1245332839

