Annals of Statistics

Slice sampling

Radford M. Neal

Full-text: Open access

Abstract

Markov chain sampling methods that adapt to characteristics of the distribution being sampled can be constructed using the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal "slice" defined by the current vertical position, or more generally, with some update that leaves the uniform distribution over this slice invariant. Such "slice sampling" methods are easily implemented for univariate distributions, and can be used to sample from a multivariate distribution by updating each variable in turn. This approach is often easier to implement than Gibbs sampling and more efficient than simple Metropolis updates, due to the ability of slice sampling to adaptively choose the magnitude of changes made. It is therefore attractive for routine and automated use. Slice sampling methods that update all variables simultaneously are also possible. These methods can adaptively choose the magnitudes of changes made to each variable, based on the local properties of the density function. More ambitiously, such methods could potentially adapt to the dependencies between variables by constructing local quadratic approximations. Another approach is to improve sampling efficiency by suppressing random walks. This can be done for univariate slice sampling by "overrelaxation," and for multivariate slice sampling by "reflection" from the edges of the slice.
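
As a concrete illustration of the procedure the abstract describes, here is a minimal Python sketch of a univariate slice sampler, following the stepping-out and shrinkage procedures developed in the paper. The names slice_sample, log_f, w and m are illustrative choices for this sketch rather than identifiers from the paper, and the code works with an unnormalized log-density for numerical stability.

```python
import math
import random


def slice_sample(x0, log_f, w=1.0, m=50, n_samples=1000, rng=random):
    """Univariate slice sampling with stepping out and shrinkage (a sketch).

    x0        : initial point
    log_f     : log of an (unnormalized) density function
    w         : estimate of the typical slice width
    m         : limit on the number of stepping-out steps
    n_samples : number of draws to return
    """
    samples = []
    x = x0
    for _ in range(n_samples):
        # Vertical step: draw the auxiliary height uniformly under the
        # density at x, working on the log scale.
        log_y = log_f(x) + math.log(rng.random())

        # Stepping out: randomly position an interval of width w around x,
        # then expand each end until it lies outside the slice
        # (or the step limit m is reached).
        L = x - w * rng.random()
        R = L + w
        j = int(m * rng.random())
        k = (m - 1) - j
        while j > 0 and log_f(L) > log_y:
            L -= w
            j -= 1
        while k > 0 and log_f(R) > log_y:
            R += w
            k -= 1

        # Shrinkage: sample uniformly from [L, R]; points outside the slice
        # are rejected and used to shrink the interval toward x.
        while True:
            x1 = L + rng.random() * (R - L)
            if log_f(x1) > log_y:
                x = x1
                break
            if x1 < x:
                L = x1
            else:
                R = x1
        samples.append(x)
    return samples


# Example usage: draws from a standard normal via its unnormalized log-density.
if __name__ == "__main__":
    draws = slice_sample(0.0, lambda z: -0.5 * z * z, w=2.0, n_samples=5000)
    print(sum(draws) / len(draws))  # sample mean, should be close to 0
```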

Article information

Source
Ann. Statist., Volume 31, Number 3 (2003), 705-767.

Dates
First available in Project Euclid: 25 June 2003

Permanent link to this document
https://projecteuclid.org/euclid.aos/1056562461

Digital Object Identifier
doi:10.1214/aos/1056562461

Mathematical Reviews number (MathSciNet)
MR1994729

Zentralblatt MATH identifier
1051.65007

Subjects
Primary: 65C60: Computational problems in statistics; 65C05: Monte Carlo methods

Keywords
Markov chain Monte Carlo; auxiliary variables; adaptive methods; Gibbs sampling; Metropolis algorithm; overrelaxation; dynamical methods

Citation

Neal, Radford M. Slice sampling. Ann. Statist. 31 (2003), no. 3, 705--767. doi:10.1214/aos/1056562461. https://projecteuclid.org/euclid.aos/1056562461

References

  • ADLER, S. L. (1981). Over-relaxation method for the Monte Carlo evaluation of the partition function for multiquadratic actions. Phys. Rev. D 23 2901-2904.
  • BARONE, P. and FRIGESSI, A. (1990). Improving stochastic relaxation for Gaussian random fields. Probab. Engrg. Inform. Sci. 4 369-389.
  • BESAG, J. and GREEN, P. J. (1993). Spatial statistics and Bayesian computation (with discussion). J. Roy. Statist. Soc. Ser. B 55 25-37, 53-102.
  • CHEN, M.-H. and SCHMEISER, B. W. (1998). Toward black-box sampling: A random-direction interior-point Markov chain approach. J. Comput. Graph. Statist. 7 1-22.
  • DAMIEN, P., WAKEFIELD, J. C. and WALKER, S. G. (1999). Gibbs sampling for Bayesian nonconjugate and hierarchical models by using auxiliary variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 331-344.
  • DIACONIS, P., HOLMES, S. and NEAL, R. M. (2000). Analysis of a non-reversible Markov chain sampler. Ann. Appl. Probab. 10 726-752.
  • DOWNS, O. B., MACKAY, D. J. C. and LEE, D. D. (2000). The nonnegative Boltzmann machine. In Advances in Neural Information Processing Systems 12 (S. A. Solla, T. K. Leen and K.-R. Muller, eds.) 428-434. MIT Press, Cambridge, MA.
  • DUANE, S., KENNEDY, A. D., PENDLETON, B. J. and ROWETH, D. (1987). Hybrid Monte Carlo. Phys. Lett. B 195 216-222.
  • EDWARDS, R. G. and SOKAL, A. D. (1988). Generalization of the Fortuin-Kasteleyn-Swendsen-Wang representation and Monte Carlo algorithm. Phys. Rev. D 38 2009-2012.
  • FREY, B. J. (1997). Continuous sigmoidal belief networks trained using slice sampling. In Advances in Neural Information Processing Systems (M. C. Mozer, M. I. Jordan and T. Petsche, eds.). MIT Press, Cambridge, MA.
  • GELFAND, A. E. and SMITH, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85 398-409.
  • GEYER, C. J. and THOMPSON, E. A. (1995). Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Amer. Statist. Assoc. 90 909-920.
  • GILKS, W. R. (1992). Derivative-free adaptive rejection sampling for Gibbs sampling. In Bayesian Statistics (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 641-649. Oxford Univ. Press.
  • GILKS, W. R., BEST, N. G. and TAN, K. K. C. (1995). Adaptive rejection Metropolis sampling within Gibbs sampling. Appl. Statist. 44 455-472.
  • GILKS, W. R., NEAL, R. M., BEST, N. G. and TAN, K. K. C. (1997). Corrigendum: Adaptive rejection Metropolis sampling. Appl. Statist. 46 541-542.
  • GILKS, W. R. and WILD, P. (1992). Adaptive rejection sampling for Gibbs sampling. Appl. Statist. 41 337-348.
  • GREEN, P. J. and HAN, X. (1992). Metropolis methods, Gaussian proposals and antithetic variables. In Stochastic Models, Statistical Methods, and Algorithms in Image Analysis (P. Barone et al., eds.). Lecture Notes in Statist. 74 142-164. Springer, New York.
  • GREEN, P. J. and MIRA, A. (2001). Delayed rejection in reversible jump Metropolis-Hastings. Biometrika 88 1035-1053.
  • HASTINGS, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97-109.
  • HIGDON, D. M. (1996). Auxiliary variable methods for Markov chain Monte Carlo with applications. ISDS discussion paper 96-17, Duke Univ.
  • HOROWITZ, A. M. (1991). A generalized guided Monte Carlo algorithm. Phys. Lett. B 268 247-252.
  • LUNN, D. J., THOMAS, A., BEST, N. and SPIEGELHALTER, D. (2000). WinBUGS -- a Bayesian modelling framework: Concepts, structure, and extensibility. Statist. Comput. 10 325-337.
  • METROPOLIS, N., ROSENBLUTH, A. W., ROSENBLUTH, M. N., TELLER, A. H. and TELLER, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087-1092.
  • MIRA, A. (1998). Ordering, slicing and splitting Monte Carlo Markov chains. Ph.D. dissertation, School of Statistics, Univ. Minnesota.
  • MIRA, A. and TIERNEY, L. (2002). Efficiency and convergence properties of slice samplers. Scand. J. Statist. 29 1-12.
  • NEAL, R. M. (1994). An improved acceptance procedure for the hybrid Monte Carlo algorithm. J. Comput. Phys. 111 194-203.
  • NEAL, R. M. (1996). Bayesian Learning for Neural Networks. Lecture Notes in Statist. 118. Springer, New York.
  • NEAL, R. M. (1998). Suppressing random walks in Markov chain Monte Carlo using ordered overrelaxation. In Learning in Graphical Models (M. I. Jordan, ed.) 205-228. Kluwer, Dordrecht.
  • NEAL, R. M. (2001). Annealed importance sampling. Statist. Comput. 11 125-139.
  • ROBERTS, G. O. and ROSENTHAL, J. S. (1999). Convergence of slice sampler Markov chains. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 643-660.
  • THOMAS, A., SPIEGELHALTER, D. J. and GILKS, W. R. (1992). BUGS: A program to perform Bayesian inference using Gibbs sampling. In Bayesian Statistics (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 837-842. Oxford Univ. Press.
  • TIERNEY, L. and MIRA, A. (1999). Some adaptive Monte Carlo methods for Bayesian inference. Statistics in Medicine 18 2507-2515.
  • CHEN, M.-H. and SCHMEISER, B. W. (1993). Performance of the Gibbs, hit-and-run, and Metropolis samplers. J. Comput. Graph. Statist. 2 251-272.
  • NANDRAM, B. and CHEN, M.-H. (1996). Reparameterizing the generalized linear model to accelerate Gibbs sampler convergence. J. Statist. Comput. Simulation 54 129-144.
  • KAUFMAN, D. E. and SMITH, R. L. (1998). Direction choice for accelerated convergence in hit-and-run sampling. Oper. Res. 46 84-95.
  • DOWNS, O. B. (2001). High-temperature expansions for learning models of nonnegative data. In Advances in Neural Information Processing Systems 13 (T. K. Leen, T. G. Dietterich and V. Tresp, eds.) 465-471. MIT Press, Cambridge, MA.
  • HINTON, G. E. and SEJNOWSKI, T. J. (1983). Optimal perceptual learning. In IEEE Conference on Computer Vision and Pattern Recognition 448-453. Washington.
  • KAPPEN, H. J. and RODRIGUEZ, F. B. (1998). Efficient learning in Boltzmann machines using linear response theory. Neural Computation 10 1137-1156.
  • LEE, D. D. and SEUNG, H. S. (1999). Learning the parts of objects by nonnegative matrix factorization. Nature 401 788-791.
  • MACKAY, D. J. C. (1998). Introduction to Monte Carlo methods. In Learning in Graphical Models (M. I. Jordan, ed.) 175-204. Kluwer, Dordrecht.
  • NEAL, R. M. (1997). Markov chain Monte Carlo methods based on "slicing" the density function. Technical Report 9722, Dept. Statistics, Univ. Toronto.
  • CASELLA, G., MENGERSEN, K. L., ROBERT, C. P. and TITTERINGTON, D. M. (2002). Perfect samplers for mixtures of distributions. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 777-790.
  • FILL, J. A. (1998). An interruptible algorithm for perfect sampling via Markov chains. Ann. Appl. Probab. 8 131-162.
  • GREEN, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 711-732.
  • KENDALL, W. S. and MØLLER, J. (2000). Perfect simulation using dominating processes on ordered spaces, with application to locally stable point processes. Adv. in Appl. Probab. 32 844-865.
  • MIRA, A. (2002). On Metropolis-Hastings algorithms with delayed rejection. Metron 59 231-241.
  • MIRA, A., MØLLER, J. and ROBERTS, G. O. (2001). Perfect slice samplers. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 593-606.
  • PESKUN, P. H. (1973). Optimum Monte Carlo sampling using Markov chains. Biometrika 60 607-612.
  • PROPP, J. and WILSON, D. B. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures Algorithms 9 223-252.
  • ROBERTS, G. O. and ROSENTHAL, J. S. (2001). Markov chains and deinitialising processes. Scand. J. Statist. 28 489-505.
  • ROBERTS, G. O. and ROSENTHAL, J. S. (2002). The polar slice sampler. Stoch. Models 18 257-280.
  • ROBERTS, G. O. and TWEEDIE, R. L. (2000). Rates of convergence of stochastically monotone and continuous time Markov models. J. Appl. Probab. 37 359-373.
  • WILSON, D. B. (2000). How to couple from the past using a read-once source of randomness. Random Structures Algorithms 16 85-113.
  • CARACCIOLO, S., PELISSETTO, A. and SOKAL, A. D. (1994). A general limitation on Monte Carlo algorithms of Metropolis type. Phys. Rev. Lett. 72 179-182.

See also

  • Includes: Ming-Hui Chen, Bruce W. Schmeiser. Discussion.
  • Includes: Oliver B. Downs. Discussion.
  • Includes: Antonietta Mira, Gareth O. Roberts. Discussion.
  • Includes: John Skilling, David J. C. MacKay. Discussion.
  • Includes: S. G. Walker. Discussion.
  • Includes: Radford M. Neal. Rejoinder.