Abstract
Sex difference in allele frequency is an emerging topic that is crucial to our understanding of data quality and features, particularly for the largely overlooked X chromosome. The existing method for detecting sex differences in allele frequency, for both X chromosomal and autosomal variants, is conservative when applied to samples from multiple ancestral populations. Additionally, it remains unexplored whether the sex difference in allele frequency varies between populations, which is important for transancestral genetic studies. To address these questions, we developed a novel, retrospective regression-based testing framework that leads to interpretable and easy-to-implement solutions. We then applied the proposed methods to the high-coverage whole genome sequence data of the 1000 Genomes Project, robustly analyzing all samples available from the five super-populations. By recognizing and modelling ancestral differences, we obtained 97 novel findings. Finally, we replicated the specific findings and overall conclusion using the gnomAD v3.1.2 data.
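As a rough illustration of what a population-aware, retrospective regression can look like, the sketch below regresses the genotype (allele count 0, 1, or 2) on sex while adjusting for population indicators, then tests the sex coefficient. It is not the authors' exact procedure. The function name `sex_effect_test`, the 0/1 sex coding, the restriction to autosomal variants, and the normal approximation for the p-value are all assumptions made for illustration. X chromosomal variants, where males carry a single copy, would need separate handling that is not shown here.

```python
import math
from collections import defaultdict


def sex_effect_test(genotypes, sexes, populations):
    """Hypothetical helper: OLS test of the sex coefficient in
    genotype ~ population indicators + sex (autosomal variants only).

    genotypes:   allele counts (0, 1, or 2), one per individual
    sexes:       0 = male, 1 = female (assumed coding)
    populations: population label per individual
    Returns (beta, standard_error, two_sided_p_value).
    """
    n = len(genotypes)
    if not (n == len(sexes) == len(populations)):
        raise ValueError("inputs must have the same length")

    # Per-population sums of genotype and sex, plus group sizes.
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for g, s, p in zip(genotypes, sexes, populations):
        acc = totals[p]
        acc[0] += g
        acc[1] += s
        acc[2] += 1

    # Centering within population is equivalent to including
    # population indicators in the regression (Frisch-Waugh-Lovell).
    g_c = [g - totals[p][0] / totals[p][2] for g, p in zip(genotypes, populations)]
    s_c = [s - totals[p][1] / totals[p][2] for s, p in zip(sexes, populations)]

    sxx = sum(x * x for x in s_c)
    if sxx == 0:
        raise ValueError("sex does not vary within any population")

    beta = sum(x * y for x, y in zip(s_c, g_c)) / sxx
    rss = sum((y - beta * x) ** 2 for x, y in zip(s_c, g_c))

    # Residual degrees of freedom: n minus population intercepts minus sex slope.
    df = n - len(totals) - 1
    if df <= 0 or rss == 0:
        raise ValueError("not enough residual variation to form a test")

    se = math.sqrt(rss / df / sxx)
    z = beta / se
    # Two-sided p-value from a normal approximation (an assumption;
    # a t reference distribution would be more exact in small samples).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p_value
```

Because the genotype is an allele count, `beta / 2` estimates the female-minus-male difference in allele frequency after adjusting for population. Whether that difference varies between populations could be examined by adding sex-by-population interaction terms, which this sketch omits.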
Funding Statement
This research was funded by the Canadian Institutes of Health Research (CIHR, PJT-180460), the Natural Sciences and Engineering Research Council of Canada (NSERC, RGPIN-04934) and a University of Toronto Data Sciences Institute (DSI) Catalyst Grant.
Acknowledgments
We thank Dr. Lin Zhang for helpful discussions and the Associate Editor and two anonymous reviewers for their critical reviews of the work.
Citation
Zhong Wang, Andrew D. Paterson, Lei Sun. "A population-aware retrospective regression to detect genome-wide variants with sex difference in allele frequency." Ann. Appl. Stat. 18 (2) 1113-1136, June 2024. https://doi.org/10.1214/23-AOAS1825