The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 9, Number 1 (2015), 300-323.
Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis
Y. X. Rachel Wang, Keni Jiang, Lewis J. Feldman, Peter J. Bickel, and Haiyan Huang
Abstract
Networks pervade many disciplines of science for analyzing complex systems with interacting components. In particular, this concept is commonly used to model interactions between genes and identify closely associated genes forming functional modules. In this paper, we focus on gene group interactions and infer these interactions using appropriate partial correlations between genes, that is, the conditional dependencies between genes after removing the influences of a set of other functionally related genes. We introduce a new method for estimating group interactions using sparse canonical correlation analysis (SCCA) coupled with repeated random partition and subsampling of the gene expression data set. By considering different subsets of genes and ways of grouping them, our interaction measure can be viewed as an aggregated estimate of partial correlations of different orders. Our approach is unique in evaluating conditional dependencies when the correct dependent sets are unknown or only partially known. As a result, a gene network can be constructed using the interaction measures as edge weights and gene functional groups can be inferred as tightly connected communities from the network. Comparisons with several popular approaches using simulated and real data show our procedure improves both the statistical significance and biological interpretability of the results. In addition to achieving considerably lower false positive rates, our procedure shows better performance in detecting important biological pathways.
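The abstract describes the procedure only at a high level, so the following is a minimal sketch of the general idea and not the authors' implementation. It assumes that each round subsamples the samples (rows), randomly splits the genes into two groups, fits a rank-one sparse CCA between the groups, and averages the absolute loading products as edge weights. The function names (`sparse_cca_rank1`, `edge_weights`), the relative threshold `lam`, the soft-thresholded power iteration used as the sparse CCA step, and the averaging rule are all illustrative assumptions.

```python
import numpy as np


def _sparse_unit(a, lam):
    """Soft-threshold `a` at lam * max|a| and rescale to unit length."""
    t = np.sign(a) * np.maximum(np.abs(a) - lam * np.abs(a).max(), 0.0)
    nrm = np.linalg.norm(t)
    return t / nrm if nrm > 0 else t


def sparse_cca_rank1(X, Y, lam=0.5, n_iter=30):
    """Rank-one sparse CCA heuristic (assumption: within-group covariances
    are treated as identity, so only the cross-correlation matrix is used)."""
    X = (X - X.mean(0)) / (X.std(0) + 1e-12)
    Y = (Y - Y.mean(0)) / (Y.std(0) + 1e-12)
    C = X.T @ Y / X.shape[0]              # cross-correlation matrix
    v = np.ones(C.shape[1]) / np.sqrt(C.shape[1])
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):               # alternating thresholded updates
        u = _sparse_unit(C @ v, lam)
        v = _sparse_unit(C.T @ u, lam)
    return u, v


def edge_weights(expr, n_rounds=100, sample_frac=0.7, lam=0.5, seed=0):
    """Aggregate |u_i * v_j| over random gene partitions and row subsamples.

    expr: array of shape (n_samples, n_genes). Returns a symmetric
    (n_genes, n_genes) matrix of averaged weights with a zero diagonal.
    """
    rng = np.random.default_rng(seed)
    n, p = expr.shape
    total = np.zeros((p, p))
    count = np.zeros((p, p))
    for _ in range(n_rounds):
        rows = rng.choice(n, size=max(3, int(sample_frac * n)), replace=False)
        perm = rng.permutation(p)
        g1, g2 = perm[: p // 2], perm[p // 2:]   # random two-way partition
        u, v = sparse_cca_rank1(expr[np.ix_(rows, g1)],
                                expr[np.ix_(rows, g2)], lam)
        total[np.ix_(g1, g2)] += np.abs(np.outer(u, v))
        count[np.ix_(g1, g2)] += 1
    total, count = total + total.T, count + count.T
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


# Toy usage: genes 0 and 1 are strongly dependent, the rest are pure noise.
rng = np.random.default_rng(1)
data = rng.normal(size=(200, 8))
data[:, 1] = data[:, 0] + 0.3 * rng.normal(size=200)
W = edge_weights(data)
```

In this toy setting the pair (0, 1) should receive a much larger weight than any noise pair. A network built from `W` could then be passed to a community detection routine to look for tightly connected groups, which is the role the abstract assigns to the interaction measure.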
Article information
Source
Ann. Appl. Stat. Volume 9, Number 1 (2015), 300-323.
Dates
First available in Project Euclid: 28 April 2015
Permanent link to this document
http://projecteuclid.org/euclid.aoas/1430226094
Digital Object Identifier
doi:10.1214/14-AOAS792
Mathematical Reviews number (MathSciNet)
MR3341117
Zentralblatt MATH identifier
06446570
Keywords
Gene association networks; community structure; sparse canonical correlation analysis (SCCA); partial correlation
Citation
Wang, Y. X. Rachel; Jiang, Keni; Feldman, Lewis J.; Bickel, Peter J.; Huang, Haiyan. Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann. Appl. Stat. 9 (2015), no. 1, 300-323. doi:10.1214/14-AOAS792. http://projecteuclid.org/euclid.aoas/1430226094.
Supplemental materials
- Supplementary information: Asymptotic analysis and additional explanations of the procedure, additional simulation and real data results. The code for estimating the edge weight matrix can be requested from hhuang@stat.berkeley.edu. Digital Object Identifier: doi:10.1214/14-AOAS792SUPP.

