Open Access
Bayesian factor models for the detection of coherent patterns in gene expression data
Vinicius D. Mayrink, Joseph E. Lucas
Braz. J. Probab. Stat. 29(1): 1-33 (February 2015). DOI: 10.1214/13-BJPS226

Abstract

A common problem in the analysis of gene expression microarray data is the identification of groups of features that are coherently expressed. For example, one often wishes to know whether a group of genes, clustered because of correlation in one data set, is still highly co-expressed in another data set. Alternatively, some expression array platforms use many relatively short probes for each gene of interest. In this case, it is possible that a given probe is not measuring its targeted gene, but rather a different gene with a similar region (a phenomenon called cross-hybridization). Accurate detection of the collection of probe sets (groups of probes targeting the same gene) which demonstrate highly coherent expression patterns is the best approach to identifying which genes are present in the sample. We develop a Bayesian Factor Model (BFM) to address the general problem of detecting coherent patterns in gene expression data sets. We compare our method to “state of the art” methods for the identification of expressed genes in both synthetic and real data sets, and the results indicate that the BFM outperforms the other procedures for detecting transcripts. We also demonstrate the use of factor analysis to identify the presence/absence status of gene modules (groups of coherently expressed genes). Variation in the number of copies of regions of the genome is a well-known and important feature of most cancers. We examine a group of genes representative of Copy Number Alteration (CNA) in breast cancer, then identify the presence/absence of CNA in this region of the genome for other cancers. Coherent patterns can also be evaluated in high-throughput sequencing data, a novel technology for measuring gene expression. We analyze this type of data via a factor model and examine the detection calls in terms of read mapping uncertainty.

Citation

Vinicius D. Mayrink, Joseph E. Lucas. "Bayesian factor models for the detection of coherent patterns in gene expression data." Braz. J. Probab. Stat. 29 (1) 1-33, February 2015. https://doi.org/10.1214/13-BJPS226

Information

Published: February 2015
First available in Project Euclid: 30 October 2014

zbMATH: 1329.92089
MathSciNet: MR3299105
Digital Object Identifier: 10.1214/13-BJPS226

Keywords: coherent, copy number alteration, detection call, factor model, high-throughput data, microarray

Rights: Copyright © 2015 Brazilian Statistical Association
