## The Annals of Applied Statistics

### Visualizing genetic constraints

#### Abstract

Principal Components Analysis (PCA) is a common way to study the sources of variation in a high-dimensional data set. Typically, the leading principal components are used to understand the variation in the data or to reduce the dimension of the data for subsequent analysis. The remaining principal components are ignored since they explain little of the variation in the data. However, evolutionary biologists gain important insights from these low-variation directions. Specifically, they are interested in directions of low genetic variability that are biologically interpretable. These directions are called genetic constraints and indicate directions in which a trait cannot evolve through selection. Here, we propose studying the subspace spanned by the low-variance principal components by finding the simplest vectors in this subspace. Our method and accompanying graphical displays enhance the biologist’s ability to visualize the subspace and identify interpretable directions of low genetic variability that align with simple directions.
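The idea of a low-variance subspace can be illustrated with a small numerical sketch. The abstract does not specify the paper's simplicity measure, so the code below is not the authors' method: it builds a toy covariance matrix, extracts the principal components whose share of total variance falls below an assumed threshold, and scores how well two candidate "simple" directions (a constant shift and a linear trend) align with that subspace. The function names, the threshold `var_fraction`, and the toy matrix `G` are all hypothetical.

```python
import numpy as np

def low_variance_subspace(G, var_fraction=0.01):
    """Eigendecompose a symmetric covariance matrix G and return
    (eigenvalues in descending order, matching eigenvectors, and the
    eigenvectors whose share of total variance is below var_fraction).
    The threshold is an illustrative assumption, not a recommended value."""
    eigvals, eigvecs = np.linalg.eigh(G)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = eigvals / eigvals.sum()
    basis = eigvecs[:, share < var_fraction]  # orthonormal columns
    return eigvals, eigvecs, basis

def alignment(direction, basis):
    """Squared length of the projection of the unit-normalized direction
    onto the subspace spanned by the columns of basis (a value in [0, 1])."""
    u = direction / np.linalg.norm(direction)
    return float(np.sum((basis.T @ u) ** 2))

# Toy covariance matrix (hypothetical): almost no variance along the
# constant direction, substantial variance in all orthogonal directions.
p = 5
rng = np.random.default_rng(0)
start = np.column_stack([np.ones(p), rng.normal(size=(p, p - 1))])
Q, _ = np.linalg.qr(start)                    # first column is +/- ones / sqrt(p)
lams = np.array([1e-4, 4.0, 3.0, 2.0, 1.0])   # variance along each column of Q
G = Q @ np.diag(lams) @ Q.T
G = (G + G.T) / 2                             # enforce exact symmetry

eigvals, eigvecs, basis = low_variance_subspace(G)

constant = np.ones(p)                         # "overall shift" direction
linear = np.arange(p) - np.arange(p).mean()   # centered linear trend

print("low-variance dimensions:", basis.shape[1])
print("alignment of constant direction:", alignment(constant, basis))
print("alignment of linear direction:", alignment(linear, basis))
```

In this toy setup the constant direction lies in the low-variance subspace by construction, so its alignment score is near one, while the linear trend is orthogonal to it and scores near zero. A score of this kind is only one possible stand-in for asking whether an interpretable direction is close to a direction of low variability.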

#### Article information

Source
Ann. Appl. Stat., Volume 7, Number 2 (2013), 860-882.

Dates
First available in Project Euclid: 27 June 2013

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1372338471

Digital Object Identifier
doi:10.1214/12-AOAS603

Mathematical Reviews number (MathSciNet)
MR3113493

Zentralblatt MATH identifier
1288.62101

#### Citation

Gaydos, Travis L.; Heckman, Nancy E.; Kirkpatrick, Mark; Stinchcombe, J. R.; Schmitt, Johanna; Kingsolver, Joel; Marron, J. S. Visualizing genetic constraints. Ann. Appl. Stat. 7 (2013), no. 2, 860–882. doi:10.1214/12-AOAS603. https://projecteuclid.org/euclid.aoas/1372338471

#### References

• Adler, R. J. and Taylor, J. E. (2007). Random Fields and Geometry. Springer, New York.
• Amemiya, Y., Anderson, T. W. and Lewis, P. A. W. (1990). Percentage points for a test of rank in multivariate components of variance. Biometrika 77 637–641.
• Anderson, T. W. and Amemiya, Y. (1991). Testing dimensionality in the multivariate analysis of variance. Statist. Probab. Lett. 12 445–463.
• Beder, J. H. and Gomulkiewicz, R. (1998). Computing the selection gradient and evolutionary response of an infinite-dimensional trait. J. Math. Biol. 36 299–319.
• Chipman, H. A. and Gu, H. (2005). Interpretable dimension reduction. J. Appl. Stat. 32 969–987.
• Demidenko, E. (2004). Mixed Models: Theory and Applications. Wiley, Hoboken, NJ.
• Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statist. Sci. 11 89–121.
• Gaydos, T. (2008). Data representation/basis selection to understand variation of function valued traits. Ph.D. thesis, Univ. North Carolina.
• Gaydos, T., Heckman, N., Kirkpatrick, M., Stinchcombe, J. R., Schmitt, J., Kingsolver, J. and Marron, J. S. (2013a). Supplement to “Visualizing genetic constraints.” DOI:10.1214/12-AOAS603SUPPA.
• Gaydos, T., Heckman, N., Kirkpatrick, M., Stinchcombe, J. R., Schmitt, J., Kingsolver, J. and Marron, J. S. (2013b). Supplement to “Visualizing genetic constraints.” DOI:10.1214/12-AOAS603SUPPB.
• Gomulkiewicz, R. and Beder, J. H. (1996). The selection gradient of an infinite-dimensional trait. SIAM J. Appl. Math. 56 509–523.
• Gomulkiewicz, R. and Houle, D. (2009). Demographic and genetic constraints on evolution. Am. Nat. 174 218–229.
• Gomulkiewicz, R. and Kingsolver, J. G. (2006). A fable of four functions: Function-valued approaches in evolutionary biology. J. Evol. Biol. 20 20–21.
• Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Monographs on Statistics and Applied Probability 58. Chapman & Hall, London.
• Griswold, C. K., Gomulkiewicz, R. and Heckman, N. (2008). Hypothesis testing in comparative and experimental studies of function-valued traits. Evolution 62 1229–1242.
• Heckman, N. E. (2003). Functional data analysis in evolutionary biology. In Recent Advances and Trends in Nonparametric Statistics (M. G. Akritas and D. N. Politis, eds.) 49–60. Elsevier, Amsterdam.
• Hine, E. and Blows, M. W. (2006). Determining the effective dimensionality of the genetic variance–covariance matrix. Genetics 173 1135–1144.
• Izem, R. and Kingsolver, J. G. (2005). Variation in continuous reaction norms: Quantifying directions of biological interest. Am. Nat. 166 277–289.
• Johnson, R. A. and Wichern, D. W. (2008). Applied Multivariate Statistical Analysis, 6th ed. Pearson Education, Upper Saddle River, NJ.
• Kingsolver, J. G., Gomulkiewicz, R. and Carter, P. A. (2001). Variation, selection and evolution of function valued traits. Genetica 112–113 87–104.
• Kingsolver, J. G., Ragland, G. J. and Shlichta, J. G. (2004). Quantitative genetics of continuous reaction norms: Thermal sensitivity of caterpillar growth rates. Evolution 58 1521–1529.
• Kirkpatrick, M. and Heckman, N. (1989). A quantitative genetic model for growth, shape, reaction norms, and other infinite-dimensional characters. J. Math. Biol. 27 429–450.
• Kirkpatrick, M. and Lofsvold, D. (1992). Measuring selection and constraint in the evolution of growth. Evolution 46 954–971.
• Lande, R. (1976). Natural selection and random genetic drift in phenotypic evolution. Evolution 30 314–334.
• Lande, R. (1979). Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution 33 402–416.
• Lande, R. and Arnold, S. (1983). The measurement of selection on correlated characters. Evolution 37 1210–1226.
• Loève, M. (1978). Probability Theory. II, 4th ed. Graduate Texts in Mathematics 46. Springer, New York.
• Lynch, M. and Walsh, B. (1998). Genetic Analysis of Quantitative Traits. Sinauer, Sunderland, MA.
• Meyer, K. and Smith, S. (1996). Restricted maximum likelihood estimation for animal models using derivatives of the likelihood. Genet. Sel. Evol. 28 23–49.
• Schatzman, M. (2002). Numerical Analysis: A Mathematical Introduction. Clarendon Press, Oxford.
• Searle, S. R., Casella, G. and McCulloch, C. E. (2006). Variance Components. Wiley, Hoboken, NJ.
• Stewart, G. W. and Sun, J. G. (1990). Matrix Perturbation Theory. Academic Press, Boston, MA.
• Stinchcombe, J. R., Izem, R., Heschel, M. S., McGoey, B. V. and Schmitt, J. (2010). Across-environment genetic correlations and the frequency of selective environments shape the evolutionary dynamics of growth rate in Impatiens capensis. Evolution 64 2887–2903.
• Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 58 267–288.

#### Supplemental materials

• Supplementary material A: Supplementary plots. As previously noted, supplementary material [Gaydos et al. (2013a)] contains a complete set of plots from our data analyses, as in Figures 3 through 6.
• Supplementary material B: Nearly null space example. An additional supplementary file [Gaydos et al. (2013b)] contains a simple example that shows the benefits of the proposed methodology.