Open Access
2023 Nonparametric Bayes Differential Analysis of Multigroup DNA Methylation Data
Chiyu Gu, Veerabhadran Baladandayuthapani, Subharup Guha
Bayesian Anal. Advance Publication 1-30 (2023). DOI: 10.1214/23-BA1407


DNA methylation datasets in cancer studies comprise measurements on a large number of genomic locations called cytosine-phosphate-guanine (CpG) sites with complex correlation structures. A fundamental goal of these studies is the development of statistical techniques that can identify disease genomic signatures across multiple patient groups defined by different experimental or biological conditions. We propose BayesDiff, a nonparametric Bayesian approach for differential analysis relying on a novel class of first-order mixture models called the Sticky Pitman-Yor process or two-restaurant two-cuisine franchise (2R2CF). The BayesDiff methodology flexibly utilizes information from all CpG sites or biomarker probes, adaptively accommodates any serial dependence due to the widely varying inter-probe distances, and makes posterior inferences about the differential genomic signature of patient groups. Using simulation studies, we demonstrate the effectiveness of the BayesDiff procedure relative to existing statistical techniques for differential DNA methylation. The methodology is applied to analyze a gastrointestinal (GI) cancer dataset exhibiting serial correlation and complex interaction patterns. The results support and complement known aspects of DNA methylation and gene association in upper GI cancers.

Funding Statement

This work was supported by the National Science Foundation and National Institutes of Health under Grants DMS-1854003, R01 CA269398, and U01 CA209414 to SG, and Grants DMS-1463233 and R01 CA160736 to VB. The authors thank the anonymous Editor, Associate Editor, and two referees for many insightful comments that improved the content and presentation of the paper.



Chiyu Gu, Veerabhadran Baladandayuthapani, Subharup Guha. "Nonparametric Bayes Differential Analysis of Multigroup DNA Methylation Data." Bayesian Anal. Advance Publication 1-30, 2023.


Published: 2023
First available in Project Euclid: 23 November 2023

Digital Object Identifier: 10.1214/23-BA1407

Keywords: 2R2CF, first order models, mixture models, sticky Pitman-Yor process, two-restaurant two-cuisine franchise
