## Statistical Science

- Statist. Sci.
- Volume 32, Number 3 (2017), 405-431.

### Importance Sampling: Intrinsic Dimension and Computational Cost

S. Agapiou, O. Papaspiliopoulos, D. Sanz-Alonso, and A. M. Stuart


#### Abstract

The basic idea of importance sampling is to use independent samples from a proposal measure in order to approximate expectations with respect to a target measure. It is key to understand how many samples are required in order to guarantee accurate approximations. Intuitively, some notion of distance between the target and the proposal should determine the computational cost of the method. A major challenge is to quantify this distance in terms of parameters or statistics that are pertinent for the practitioner. The subject has attracted substantial interest from within a variety of communities. The objective of this paper is to overview and unify the resulting literature by creating an overarching framework. A general theory is presented, with a focus on the use of importance sampling in Bayesian inverse problems and filtering.
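As a rough illustration of the basic idea described in the abstract, the following minimal sketch draws independent samples from a proposal and reweights them to approximate an expectation under a target. It is not taken from the paper: the Gaussian target N(1, 1), the Gaussian proposal N(0, 4), and the helper names `log_normal_pdf` and `importance_sampling` are all assumptions chosen for illustration. The effective sample size it reports is a commonly used diagnostic of how far the proposal is from the target.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def importance_sampling(test_fn, n_samples, seed=0):
    """Self-normalized importance sampling estimate of E_target[test_fn].

    Illustrative assumption: target N(1, 1), proposal N(0, 2^2).
    Returns the estimate and the effective sample size of the weights.
    """
    rng = random.Random(seed)
    # Independent draws from the proposal.
    samples = [rng.gauss(0.0, 2.0) for _ in range(n_samples)]
    # Log importance weights: log target density minus log proposal density.
    log_w = [log_normal_pdf(x, 1.0, 1.0) - log_normal_pdf(x, 0.0, 2.0) for x in samples]
    # Subtract the maximum before exponentiating for numerical stability;
    # the constant cancels after normalization.
    max_log_w = max(log_w)
    w = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(w)
    estimate = sum(wi * test_fn(x) for wi, x in zip(w, samples)) / total
    # Effective sample size: (sum of weights)^2 / (sum of squared weights).
    ess = total ** 2 / sum(wi * wi for wi in w)
    return estimate, ess


estimate, ess = importance_sampling(lambda x: x, 20000)
print(f"estimated mean under target: {estimate:.3f}")
print(f"effective sample size: {ess:.0f} of 20000")
```

In this toy setting the target mean is 1, so the estimate can be checked directly. Moving the proposal further from the target would be expected to lower the effective sample size, which is one informal way to see why a notion of distance between the two measures should govern the number of samples needed.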

#### Article information

**Source**

Statist. Sci. Volume 32, Number 3 (2017), 405-431.

**Dates**

First available in Project Euclid: 1 September 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.ss/1504253124

**Digital Object Identifier**

doi:10.1214/17-STS611

**Keywords**

Importance sampling; notions of dimension; small noise; absolute continuity; inverse problems; filtering

#### Citation

Agapiou, S.; Papaspiliopoulos, O.; Sanz-Alonso, D.; Stuart, A. M. Importance Sampling: Intrinsic Dimension and Computational Cost. Statist. Sci. 32 (2017), no. 3, 405–431. doi:10.1214/17-STS611. https://projecteuclid.org/euclid.ss/1504253124

#### Supplemental materials

- Supplement to “Importance sampling: Intrinsic dimension and computational cost”. The Supplementary Material contains the proofs of all our results. It also contains some background on Gaussian measures on Hilbert spaces, and related technical aspects arising from considering measures in infinite-dimensional spaces. Digital Object Identifier: doi:10.1214/17-STS611SUPP
