## The Annals of Applied Statistics

- Ann. Appl. Stat.
- Volume 8, Number 4 (2014), 2203-2222.

### A novel spectral method for inferring general diploid selection from time series genetic data

Matthias Steinrücken, Anand Bhaskar, and Yun S. Song

#### Abstract

The increased availability of time series genetic variation data from experimental evolution studies and ancient DNA samples has created new opportunities to identify genomic regions under selective pressure and to estimate their associated fitness parameters. However, it is a challenging problem to compute the likelihood of nonneutral models for the population allele frequency dynamics, given the observed temporal DNA data. Here, we develop a novel spectral algorithm to analytically and efficiently integrate over all possible frequency trajectories between consecutive time points. This advance circumvents the limitations of existing methods which require fine-tuning the discretization of the population allele frequency space when numerically approximating requisite integrals. Furthermore, our method is flexible enough to handle general diploid models of selection where the heterozygote and homozygote fitness parameters can take any values, while previous methods focused on only a few restricted models of selection. We demonstrate the utility of our method on simulated data and also apply it to analyze ancient DNA data from genetic loci associated with coat coloration in horses. In contrast to previous studies, our exploration of the full fitness parameter space reveals that a heterozygote advantage form of balancing selection may have been acting on these loci.
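As a rough illustration of the "heterozygote advantage" mentioned above, the following sketch iterates the textbook deterministic recursion for allele frequency under general diploid selection. It is not the spectral method of the paper: it ignores genetic drift and sampling, and the names `next_frequency`, `iterate`, `interior_equilibrium`, `s_het`, and `s_hom` are hypothetical, as is the fitness parametrization $1 + s_{\text{hom}}$, $1 + s_{\text{het}}$, $1$ for the three genotypes.

```python
def next_frequency(x, s_het, s_hom):
    """One generation of deterministic diploid selection.

    x is the frequency of allele A. The fitnesses (an assumed
    parametrization) are 1 + s_hom for AA, 1 + s_het for Aa, and 1 for aa.
    """
    w_AA, w_Aa, w_aa = 1.0 + s_hom, 1.0 + s_het, 1.0
    # Mean fitness of the population under Hardy-Weinberg proportions.
    mean_w = w_AA * x * x + 2.0 * w_Aa * x * (1.0 - x) + w_aa * (1.0 - x) ** 2
    # Frequency of A after selection.
    return (w_AA * x * x + w_Aa * x * (1.0 - x)) / mean_w


def iterate(x0, s_het, s_hom, generations):
    """Apply next_frequency repeatedly, starting from frequency x0."""
    x = x0
    for _ in range(generations):
        x = next_frequency(x, s_het, s_hom)
    return x


def interior_equilibrium(s_het, s_hom):
    """Interior fixed point of the recursion.

    Under heterozygote advantage (s_het > max(s_hom, 0)) it lies strictly
    between 0 and 1 and is stable.
    """
    return s_het / (2.0 * s_het - s_hom)


if __name__ == "__main__":
    # Heterozygote advantage: the allele is maintained at an intermediate
    # frequency instead of being fixed or lost.
    print(iterate(0.05, s_het=0.1, s_hom=0.0, generations=2000))
    print(interior_equilibrium(0.1, 0.0))
```

In this deterministic setting, heterozygote advantage holds the allele at an intermediate frequency, which is the qualitative signature of balancing selection. Inference from real time series data must additionally account for drift and finite samples, which is the problem the paper addresses.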

#### Article information

**Source**

Ann. Appl. Stat., Volume 8, Number 4 (2014), 2203-2222.

**Dates**

First available in Project Euclid: 19 December 2014

**Permanent link to this document**

https://projecteuclid.org/euclid.aoas/1419001740

**Digital Object Identifier**

doi:10.1214/14-AOAS764

**Mathematical Reviews number (MathSciNet)**

MR3292494

**Zentralblatt MATH identifier**

06408775

**Keywords**

Population genetics; spectral method; transition density function; hidden Markov model

#### Citation

Steinrücken, Matthias; Bhaskar, Anand; Song, Yun S. A novel spectral method for inferring general diploid selection from time series genetic data. Ann. Appl. Stat. 8 (2014), no. 4, 2203-2222. doi:10.1214/14-AOAS764. https://projecteuclid.org/euclid.aoas/1419001740

#### Supplemental materials

- Supplementary material: A novel spectral method for inferring general diploid selection from time series genetic data. We provide proofs of the results stated in Section 2. The modified Jacobi polynomials appearing in this paper are defined and some of their key properties are listed. Also, the coefficients in the definition of the matrix $\mathbf{M}$ in equation (2.14) are provided. Last, we describe some alternate density functions for the allele frequency at the time when selection arises. Digital Object Identifier: doi:10.1214/14-AOAS764SUPP