Open Access
A statistical model to assess (allele-specific) associations between gene expression and epigenetic features using sequencing data
Naim U. Rashid, Wei Sun, Joseph G. Ibrahim
Ann. Appl. Stat. 10(4): 2254-2273 (December 2016). DOI: 10.1214/16-AOAS973

Abstract

Sequencing techniques have been widely used to assess gene expression (i.e., RNA-seq) or the presence of epigenetic features (e.g., DNase-seq to identify open chromatin regions). In contrast to data from traditional microarray platforms, sequencing data are typically summarized in the form of discrete counts, and they are able to delineate allele-specific signals, which are not available from microarrays. The presence of epigenetic features is often associated with gene expression, and both have been shown to be affected by DNA polymorphisms. However, joint models with the flexibility to assess interactions among gene expression, epigenetic features and DNA polymorphisms are currently lacking. In this paper, we develop a statistical model to assess the associations between gene expression and epigenetic features using sequencing data, while explicitly modeling the effects of DNA polymorphisms in either an allele-specific or nonallele-specific manner. We show that doing so provides the flexibility to detect associations between gene expression and epigenetic features, as well as conditional associations given DNA polymorphisms. We evaluate the performance of our method using simulations and apply it to study the association between gene expression and the presence of DNase I hypersensitive sites (DHSs) in HapMap individuals. Our model can be generalized to explore the relationships between DNA polymorphisms and any two types of sequencing experiments, a useful feature as the variety of sequencing experiments continues to expand.

Citation

Naim U. Rashid, Wei Sun, Joseph G. Ibrahim. "A statistical model to assess (allele-specific) associations between gene expression and epigenetic features using sequencing data." Ann. Appl. Stat. 10 (4) 2254-2273, December 2016. https://doi.org/10.1214/16-AOAS973

Information

Received: 1 September 2014; Revised: 1 July 2016; Published: December 2016
First available in Project Euclid: 5 January 2017

zbMATH: 06688776
MathSciNet: MR3592056
Digital Object Identifier: 10.1214/16-AOAS973

Keywords: Bivariate binomial logistic-normal (BBLN) distribution, bivariate Poisson log-normal (BPLN) distribution, DNase-seq, Genetics, genomics, RNA-Seq

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 10 • No. 4 • December 2016