Open Access
June 2017
Integrative sparse $K$-means with overlapping group lasso in genomic applications for disease subtype discovery
Zhiguang Huo, George Tseng
Ann. Appl. Stat. 11(2): 1011-1039 (June 2017). DOI: 10.1214/17-AOAS1033

Abstract

Cancer subtype discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration that incorporates rich existing biological knowledge is essential for deciphering the biological mechanisms behind complex diseases. In this manuscript, we propose an integrative sparse $K$-means (IS-$K$means) approach to discover disease subtypes with the guidance of prior biological knowledge via a sparse overlapping group lasso. An algorithm using the alternating direction method of multipliers (ADMM) will be applied for fast optimization. Simulations and three real applications in breast cancer and leukemia will be used to compare IS-$K$means with existing methods and demonstrate its superior clustering accuracy, feature selection, functional annotation of detected molecular features and computing efficiency.
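
As a rough illustration of the kind of building block that lasso and group lasso penalties rely on, the sketch below implements the two standard shrinkage (proximal) operators: elementwise soft-thresholding for an $\ell_1$ penalty and block soft-thresholding for an $\ell_2$ group penalty. It is not the IS-$K$means algorithm and is not taken from the paper; the function names and the penalty level `lam` are hypothetical. When groups overlap, the penalty no longer separates across groups, which is one common reason ADMM-style schemes work with auxiliary copies of the variables and apply operators like these to each copy.

```python
import math


def soft_threshold(x, lam):
    """Proximal operator of lam * |x| (ordinary lasso shrinkage)."""
    # Shrink the magnitude by lam, never below zero, and keep the sign.
    return math.copysign(max(abs(x) - lam, 0.0), x)


def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2 (block soft-thresholding).

    `v` holds the coefficients of one (hypothetical) feature group.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        # The whole group is set to zero, so its features drop out together.
        return [0.0 for _ in v]
    # Otherwise every coefficient in the group is scaled by the same factor.
    scale = 1.0 - lam / norm
    return [scale * x for x in v]
```

For example, `group_soft_threshold([3.0, 4.0], 1.0)` rescales a group of norm 5 by a factor of about 0.8, while `group_soft_threshold([0.3, 0.4], 1.0)` zeroes the group entirely because its norm is below the penalty level.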

Citation

Zhiguang Huo, George Tseng. "Integrative sparse $K$-means with overlapping group lasso in genomic applications for disease subtype discovery." Ann. Appl. Stat. 11(2): 1011-1039, June 2017. https://doi.org/10.1214/17-AOAS1033

Information

Received: 1 January 2016; Revised: 1 April 2017; Published: June 2017
First available in Project Euclid: 20 July 2017

zbMATH: 06775902
MathSciNet: MR3693556
Digital Object Identifier: 10.1214/17-AOAS1033

Keywords: ADMM, cancer subtype, omics integrative analysis, overlapping group lasso

Rights: Copyright © 2017 Institute of Mathematical Statistics

Vol.11 • No. 2 • June 2017