We consider multivariate two-sample tests of means, where the location shift between the two populations is expected to be related to a known graph structure. An important application of such tests is the detection of differentially expressed genes between two patient populations, as shifts in expression levels are expected to be coherent with the structure of graphs reflecting gene properties such as biological process, molecular function, regulation or metabolism. For a fixed graph of interest, we demonstrate that accounting for graph structure can yield more powerful tests under the assumption of smooth distribution shift on the graph. We also investigate the identification of nonhomogeneous subgraphs of a given large graph, which poses both computational and multiple hypothesis testing problems. The relevance and benefits of the proposed approach are illustrated on synthetic data and on breast and bladder cancer gene expression data analyzed in the context of KEGG and NCI pathways.
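The abstract does not spell out the test statistic, so the following is only a minimal sketch of how a graph-structured two-sample test might look. It assumes (this is not stated in the text above) that a "smooth" shift is one concentrated on the low-frequency eigenvectors of the graph Laplacian, and that the test is a Hotelling T² statistic computed after projecting the data onto the first k of those eigenvectors. The function names, the path graph, and the permutation p-value are all hypothetical choices made for illustration.

```python
import numpy as np


def path_graph_laplacian(p):
    """Combinatorial Laplacian L = D - A of a path graph on p nodes (toy graph)."""
    A = np.zeros((p, p))
    idx = np.arange(p - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def hotelling_t2(X, Y):
    """Two-sample Hotelling T^2 statistic with pooled covariance."""
    n1, n2 = len(X), len(Y)
    diff = X.mean(axis=0) - Y.mean(axis=0)
    pooled = ((n1 - 1) * np.cov(X, rowvar=False)
              + (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)  # np.cov returns a 0-d array when k == 1
    return (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(pooled, diff)


def graph_t2(X, Y, L, k):
    """Hotelling T^2 restricted to the k smoothest Laplacian eigenvectors.

    Assumption: smoothness on the graph means low graph frequency, so the
    data are projected onto the eigenvectors with the k smallest eigenvalues.
    """
    _, U = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    Uk = U[:, :k]
    return hotelling_t2(X @ Uk, Y @ Uk)


def permutation_pvalue(stat_fn, X, Y, n_perm=200, seed=0):
    """Permutation p-value obtained by shuffling the group labels."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(X, Y)
    Z = np.vstack([X, Y])
    n1 = len(X)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(Z))
        if stat_fn(Z[perm[:n1]], Z[perm[n1:]]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy usage: 20 "genes" on a path graph, 15 samples per group, and a mean
# shift along the smoothest non-constant eigenvector of the Laplacian.
p, n, k = 20, 15, 3
L = path_graph_laplacian(p)
_, U = np.linalg.eigh(L)
rng = np.random.default_rng(1)
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, p)) + 2.0 * U[:, 1]

p_full = permutation_pvalue(hotelling_t2, X, Y)
p_graph = permutation_pvalue(lambda a, b: graph_t2(a, b, L, k), X, Y)
print("full-space p-value:", p_full)
print("graph-restricted p-value:", p_graph)
```

With k equal to the number of nodes, the projection is an orthogonal change of basis and the statistic coincides with the ordinary Hotelling T². Choosing k smaller reduces the dimension in which the covariance must be estimated, which is the intuition behind a gain in power when the shift really is smooth on the graph.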
"More power via graph-structured tests for differential expression of gene networks." Ann. Appl. Stat. 6 (2) 561 - 600, June 2012. https://doi.org/10.1214/11-AOAS528