The Annals of Applied Statistics

A statistical framework for data integration through graphical models with application to cancer genomics

Yuping Zhang, Zhengqing Ouyang, and Hongyu Zhao

Recent advances in high-throughput biotechnologies have generated various types of genetic, genomic, epigenetic, transcriptomic and proteomic data across different biological conditions. Integrating data from diverse experiments may lead to a more unified and global view of biological systems and complex diseases. We present a coherent statistical framework for integrating various types of data from distinct but related biological conditions through graphical models. Specifically, our statistical framework is designed for modeling multiple networks with shared regulatory mechanisms from heterogeneous high-dimensional datasets. The performance of our approach is illustrated through simulations and applications to cancer genomics.

Article information

Ann. Appl. Stat. Volume 11, Number 1 (2017), 161-184.

Received: February 2016
Revised: September 2016
First available in Project Euclid: 8 April 2017

Digital Object Identifier: doi:10.1214/16-AOAS998

Cancer genomics, data integration, graphical models


Zhang, Yuping; Ouyang, Zhengqing; Zhao, Hongyu. A statistical framework for data integration through graphical models with application to cancer genomics. Ann. Appl. Stat. 11 (2017), no. 1, 161-184. doi:10.1214/16-AOAS998.

  • Zhang, Y., Ouyang, Z. and Zhao, H. (2017). Supplement to “A statistical framework for data integration through graphical models with application to cancer genomics.” DOI:10.1214/16-AOAS998SUPP.

Supplemental materials

  • Supplement to “A statistical framework for data integration through graphical models with application to cancer genomics.” We present technical and methodological details regarding the model and algorithm in Sections 2 and 4. Furthermore, complementary results for the application in Section 7 are provided.