## The Annals of Applied Probability

### Exact simulation of the Wright–Fisher diffusion

#### Abstract

The Wright–Fisher family of diffusion processes is a widely used class of evolutionary models. However, simulation is difficult because there is no known closed-form formula for its transition function. In this article, we demonstrate that it is in fact possible to simulate exactly from a broad class of Wright–Fisher diffusion processes and their bridges. For those diffusions corresponding to reversible, neutral evolution, our key idea is to exploit an eigenfunction expansion of the transition function; this approach even applies to its infinite-dimensional analogue, the Fleming–Viot process. We then develop an exact rejection algorithm for processes with more general drift functions, including those modelling natural selection, using ideas from retrospective simulation. Our approach also yields methods for exact simulation of the moment dual of the Wright–Fisher diffusion, the ancestral process of an infinite-leaf Kingman coalescent tree. We believe our new perspective on diffusion simulation holds promise for other models admitting a transition eigenfunction expansion.
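To illustrate the eigenfunction-expansion idea for the reversible, neutral case, the following is a minimal sketch and not the authors' algorithm. It assumes the standard mixture form of the two-allele neutral transition function with mutation parameters $\theta_1, \theta_2 > 0$ and $\theta = \theta_1 + \theta_2$. In that form, a draw of $X_t$ given $X_0 = x$ is obtained in three steps: sample the number $M$ of surviving lineages of a pure-death process with rates $m(m+\theta-1)/2$, then $L \sim \mathrm{Binomial}(M, x)$, then $Y \sim \mathrm{Beta}(\theta_1 + L, \theta_2 + M - L)$. The exact method must sample $M$ for the process started from infinity; the sketch replaces this with a start from a large finite value `n_start`, which makes it an approximation. The function names and the `n_start` parameter are hypothetical, introduced only for this illustration.

```python
import numpy as np


def sample_lineage_count(t, theta, n_start, rng):
    """Run the pure-death process with rate m(m + theta - 1)/2 for time t.

    ASSUMPTION: starting from the finite value n_start only approximates the
    process started from infinity, so this step is not exact.
    """
    m, elapsed = n_start, 0.0
    while m > 0:
        rate = m * (m + theta - 1.0) / 2.0
        elapsed += rng.exponential(1.0 / rate)  # holding time in state m
        if elapsed > t:
            break  # the next death would occur after time t
        m -= 1
    return m


def sample_wf_transition(x, t, theta1, theta2, rng, n_start=500):
    """Draw (approximately) from the law of X_t given X_0 = x.

    Neutral two-allele Wright-Fisher diffusion with mutation parameters
    theta1, theta2 > 0, using the mixture representation described above.
    """
    if theta1 <= 0 or theta2 <= 0:
        raise ValueError("this sketch assumes theta1 > 0 and theta2 > 0")
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    m = sample_lineage_count(t, theta1 + theta2, n_start, rng)
    l = rng.binomial(m, x)  # lineages carrying the first allele
    return rng.beta(theta1 + l, theta2 + m - l)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    draws = [sample_wf_transition(0.2, 0.5, 1.0, 1.0, rng) for _ in range(1000)]
    print("sample mean:", np.mean(draws))
```

The truncation error is small when `t` is large relative to `2 / n_start`, the approximate time the death process needs to come down from infinity to `n_start`, and it grows as `t` shrinks. One sanity check is to compare the sample mean with $\theta_1/\theta + (x - \theta_1/\theta)e^{-\theta t/2}$, which follows from the linear drift $\tfrac{1}{2}[\theta_1(1-x) - \theta_2 x]$.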

#### Article information

Source
Ann. Appl. Probab., Volume 27, Number 3 (2017), 1478–1509.

Dates
First available in Project Euclid: 19 July 2017

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1500451229

Digital Object Identifier
doi:10.1214/16-AAP1236

Mathematical Reviews number (MathSciNet)
MR3678477

Zentralblatt MATH identifier
1385.65006

#### Citation

Jenkins, Paul A.; Spanò, Dario. Exact simulation of the Wright–Fisher diffusion. Ann. Appl. Probab. 27 (2017), no. 3, 1478–1509. doi:10.1214/16-AAP1236. https://projecteuclid.org/euclid.aoap/1500451229
