Open Access
December 2024
Scalable test of statistical significance for protein-DNA binding changes with insertion and deletion of bases in the genome
Qinyi Zhou, Chandler Zuo, Yuannyu Zhang, Min Chen, Jian Xu, Sunyoung Shin
Ann. Appl. Stat. 18(4): 3528-3548 (December 2024). DOI: 10.1214/24-AOAS1950

Abstract

Mutations in noncoding DNA, which makes up approximately 99% of the human genome, are crucial to understanding disease mechanisms because they can dysregulate disease-associated genes. One key element of gene regulation that noncoding mutations affect is the binding of proteins to DNA sequences. Insertions and deletions of bases (InDels) are the second most common type of mutation, after single nucleotide polymorphisms, and may impact protein-DNA binding. However, no existing method can estimate and test the effects of InDels on protein-DNA binding. We develop a novel test of statistical significance, the binding change test (BC test), which uses a Markov model to evaluate the impact of InDels and identify those that alter protein-DNA binding. The test predicts binding-changer InDels of regulatory significance using an efficient importance sampling algorithm that generates background sequences favoring large binding affinity changes. Simulation studies demonstrate its excellent performance. Applied to human leukemia data, the test uncovers, in critical cis-regulatory elements, candidate pathological InDels that modulate TF binding in leukemic patients. We also develop an R package, atIndel, which is available on GitHub.
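
To illustrate the general idea of importance sampling for small p-values mentioned in the abstract, the following is a minimal Python sketch and not the atIndel implementation. It makes several simplifying assumptions that are not from the paper: the background is i.i.d. rather than a Markov model, the statistic is a single motif log-odds score rather than a binding change between two alleles, and the proposal is a simple exponential tilt with a hand-picked parameter `theta`. All function names (`log_odds`, `exact_tail`, `importance_tail`) are hypothetical.

```python
import itertools
import math
import random


def log_odds(pwm, bg):
    """Per-position log-odds weights log(pwm[i][b] / bg[b])."""
    return [[math.log(col[b] / bg[b]) for b in range(4)] for col in pwm]


def exact_tail(pwm, bg, threshold):
    """Exact P(score >= threshold) under the i.i.d. background, by enumeration.

    Only feasible for short motifs; used here as a reference value.
    """
    w = log_odds(pwm, bg)
    total = 0.0
    for seq in itertools.product(range(4), repeat=len(pwm)):
        score = sum(w[i][b] for i, b in enumerate(seq))
        if score >= threshold:
            total += math.prod(bg[b] for b in seq)
    return total


def importance_tail(pwm, bg, threshold, theta, n_samples, seed=0):
    """Importance sampling estimate of P(score >= threshold).

    Sequences are drawn from a tilted proposal q_i(b) proportional to
    bg[b] * exp(theta * w_i(b)), which favors high scores, and each draw
    is reweighted by the likelihood ratio background / proposal.
    """
    rng = random.Random(seed)
    w = log_odds(pwm, bg)
    proposals = []
    log_norm = 0.0  # sum over positions of log normalizing constants
    for col_w in w:
        unnorm = [bg[b] * math.exp(theta * col_w[b]) for b in range(4)]
        z = sum(unnorm)
        proposals.append([u / z for u in unnorm])
        log_norm += math.log(z)
    acc = 0.0
    for _ in range(n_samples):
        seq = [rng.choices(range(4), weights=q)[0] for q in proposals]
        score = sum(w[i][b] for i, b in enumerate(seq))
        if score >= threshold:
            # Likelihood ratio p(seq) / q(seq) = exp(log_norm - theta * score)
            acc += math.exp(log_norm - theta * score)
    return acc / n_samples


if __name__ == "__main__":
    # Toy 4-column motif (rows: positions; columns: A, C, G, T).
    pwm = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1],
           [0.1, 0.1, 0.7, 0.1], [0.1, 0.1, 0.1, 0.7]]
    bg = [0.25, 0.25, 0.25, 0.25]
    print(exact_tail(pwm, bg, 2.0))
    print(importance_tail(pwm, bg, 2.0, theta=1.0, n_samples=20000))
```

With `theta = 1.0` the proposal in this sketch coincides with sampling from the motif itself, so most draws land in the high-score region and the reweighting corrects for the change of distribution. The BC test described in the abstract targets binding affinity changes under a Markov background, which this toy example does not attempt to reproduce.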

Funding Statement

Shin was supported in part by U.S. NSF Grant DMS-2113674, Korean NRF grant funded by the Korea government (MSIT) (RS-2023-00243012, RS-2023-00219980), POSTECH Basic Science Research Institute Fund (NRF-2021R1A6A1A10042944), and POSCO HOLDINGS grant 2023Q033.
Xu is a Scholar of The Leukemia & Lymphoma Society (LLS) and an American Society of Hematology (ASH) Scholar.

Acknowledgments

The authors are grateful to Dr. Michael Q. Zhang and Dr. Zhenyu Xuan at University of Texas at Dallas for helpful discussions.

Citation

Qinyi Zhou, Chandler Zuo, Yuannyu Zhang, Min Chen, Jian Xu, Sunyoung Shin. "Scalable test of statistical significance for protein-DNA binding changes with insertion and deletion of bases in the genome." Ann. Appl. Stat. 18(4): 3528-3548, December 2024. https://doi.org/10.1214/24-AOAS1950

Information

Received: 1 March 2024; Revised: 1 August 2024; Published: December 2024
First available in Project Euclid: 31 October 2024

Digital Object Identifier: 10.1214/24-AOAS1950

Keywords: importance sampling, noncoding mutations, p-value based test statistic, sequence-based models, test of significance, transcription factor binding

Rights: Copyright © 2024 Institute of Mathematical Statistics
