## The Annals of Applied Probability

### The sample size required in importance sampling

#### Abstract

The goal of importance sampling is to estimate the expected value of a given function with respect to a probability measure $\nu$ using a random sample of size $n$ drawn from a different probability measure $\mu$. If the two measures $\mu$ and $\nu$ are nearly singular with respect to each other, which is often the case in practice, the sample size required for accurate estimation is large. In this article, it is shown that in a fairly general setting, a sample of size approximately $\exp(D(\nu\parallel\mu))$ is necessary and sufficient for accurate estimation by importance sampling, where $D(\nu\parallel\mu)$ is the Kullback–Leibler divergence of $\mu$ from $\nu$. In particular, the required sample size exhibits a kind of cut-off in the logarithmic scale. The theory is applied to obtain a general formula for the sample size required in importance sampling for one-parameter exponential families (Gibbs measures).
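The sample-size rule in the abstract can be illustrated with a minimal sketch. It assumes a toy pair of measures, $\mu = N(0,1)$ and $\nu = N(\theta,1)$, for which $D(\nu\parallel\mu) = \theta^2/2$ and the density ratio is $d\nu/d\mu(x) = \exp(\theta x - \theta^2/2)$. The Gaussian pair, the function names, and the margin parameter `t` are assumptions made for illustration and are not taken from the article.

```python
import math
import random


def kl_gaussian_shift(theta):
    """D(nu || mu) for the assumed toy pair nu = N(theta, 1), mu = N(0, 1)."""
    return theta ** 2 / 2.0


def suggested_sample_size(theta, t=0.0):
    """Sample size exp(D(nu || mu) + t), rounded up.

    The margin t is a hypothetical knob: t > 0 goes above the
    exp(D) scale named in the abstract, t < 0 goes below it.
    """
    return math.ceil(math.exp(kl_gaussian_shift(theta) + t))


def importance_estimate(f, theta, n, rng):
    """Estimate E_nu[f] from n draws of mu = N(0, 1).

    Each draw x is weighted by the density ratio
    rho(x) = d nu / d mu (x) = exp(theta * x - theta^2 / 2).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        rho = math.exp(theta * x - theta ** 2 / 2.0)
        total += rho * f(x)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 3.0
    # Compare sample sizes below and above the exp(D) scale.
    for t in (-2.0, 0.0, 4.0):
        n = suggested_sample_size(theta, t)
        est = importance_estimate(lambda x: x, theta, n, rng)
        print(f"t={t:+.1f}  n={n}  estimate of E_nu[X]={est:.3f}  (true value {theta})")
```

Running the script compares estimates of $E_\nu[X] = \theta$ at sample sizes below, at, and above $\exp(D(\nu\parallel\mu))$. A single run is only suggestive; repeating it over several seeds gives a clearer picture of how the error behaves on either side of that scale.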

#### Article information

Source
Ann. Appl. Probab., Volume 28, Number 2 (2018), 1099-1135.

Dates
Received: November 2015
Revised: June 2017
First available in Project Euclid: 11 April 2018

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1523433632

Digital Object Identifier
doi:10.1214/17-AAP1326

Mathematical Reviews number (MathSciNet)
MR3784496

Zentralblatt MATH identifier
06897951

#### Citation

Chatterjee, Sourav; Diaconis, Persi. The sample size required in importance sampling. Ann. Appl. Probab. 28 (2018), no. 2, 1099–1135. doi:10.1214/17-AAP1326. https://projecteuclid.org/euclid.aoap/1523433632
