June 2012 Approximate sampling formulae for general finite-alleles models of mutation
Anand Bhaskar, John A. Kamm, Yun S. Song
Adv. in Appl. Probab. 44(2): 408-428 (June 2012). DOI: 10.1239/aap/1339878718


Many applications in genetic analyses utilize sampling distributions, which describe the probability of observing a sample of DNA sequences randomly drawn from a population. In the one-locus case with special models of mutation, such as the infinite-alleles model or the finite-alleles parent-independent mutation model, closed-form sampling distributions under the coalescent have been known for many decades. However, no exact formula is currently known for more general models of mutation that are of biological interest. In this paper, models with finitely many alleles are considered, and an urn construction related to the coalescent is used to derive approximate closed-form sampling formulae for an arbitrary irreducible recurrent mutation model or for a reversible recurrent mutation model, depending on whether the number of distinct observed allele types is at most three or four, respectively. It is demonstrated empirically that the formulae derived here are highly accurate when the per-base mutation rate is low, which holds for many biological organisms.
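As an illustration of what a closed-form sampling distribution looks like, the following minimal sketch evaluates the classical formula for the finite-alleles parent-independent mutation model mentioned above. This is the known special case, not the approximate formulae derived in the paper. It assumes the common parameterization in which `theta` is the population-scaled mutation rate and `pi` gives the probability that a mutation produces each allele type, so that the probability of an unordered sample configuration is the Dirichlet-multinomial form. The function names are hypothetical and chosen for this example only.

```python
from math import exp, lgamma


def log_rising_factorial(x, n):
    """Return log of x (x + 1) ... (x + n - 1), for x > 0 and integer n >= 0."""
    return lgamma(x + n) - lgamma(x)


def pim_sampling_probability(counts, theta, pi):
    """Probability of an unordered sample configuration under the
    parent-independent mutation model (assumed parameterization:
    theta > 0, every pi[i] > 0, and sum(pi) == 1).

    counts[i] is the number of sampled sequences carrying allele i.
    """
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! ... n_K!), on the log scale.
    log_p = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    # Product over alleles of rising factorials of theta * pi_i.
    log_p += sum(log_rising_factorial(theta * p, c) for c, p in zip(counts, pi))
    # Normalizing rising factorial of theta.
    log_p -= log_rising_factorial(theta, n)
    return exp(log_p)


if __name__ == "__main__":
    # Two alleles, sample of size 3: the four configurations sum to 1.
    configs = [(3, 0), (2, 1), (1, 2), (0, 3)]
    total = sum(pim_sampling_probability(c, 1.0, (0.5, 0.5)) for c in configs)
    print(total)
```

Working on the log scale with `lgamma` keeps the computation stable for larger samples, where the rising factorials would otherwise overflow.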



Anand Bhaskar, John A. Kamm, Yun S. Song. "Approximate sampling formulae for general finite-alleles models of mutation." Adv. in Appl. Probab. 44 (2) 408-428, June 2012.


Published: June 2012
First available in Project Euclid: 16 June 2012

zbMATH: 1241.92053
MathSciNet: MR2977402
Digital Object Identifier: 10.1239/aap/1339878718

Primary: 92D15
Secondary: 41A58, 65C50, 92D10

Keywords: coalescent theory, martingale, sampling probability, urn model

Rights: Copyright © 2012 Applied Probability Trust



Vol.44 • No. 2 • June 2012