Open Access
June 2020
Multiview cluster aggregation and splitting, with an application to multiomic breast cancer data
Antoine Godichon-Baggioni, Cathy Maugis-Rabusseau, Andrea Rau
Ann. Appl. Stat. 14(2): 752-767 (June 2020). DOI: 10.1214/19-AOAS1317

Abstract

Multiview data, which represent distinct but related groupings of variables, can be useful for identifying relevant and robust clustering structures among observations. A large number of multiview classification algorithms have been proposed in the fields of computer science and genomics; here, we instead focus on the task of merging or splitting an existing hard or soft cluster partition based on multiview data. This article is specifically motivated by an application involving multiomic breast cancer data from The Cancer Genome Atlas, where multiple molecular profiles (gene expression, microRNA expression, methylation and copy number alterations) are used to further subdivide the five currently accepted intrinsic tumor subtypes into distinct subgroups of patients. In addition, we investigate the performance of the proposed multiview splitting and aggregation algorithms, as compared to single- and concatenated-view alternatives, in a set of simulations. The multiview splitting and aggregation algorithms developed here are implemented in the maskmeans R package.

Citation

Antoine Godichon-Baggioni, Cathy Maugis-Rabusseau, Andrea Rau. "Multiview cluster aggregation and splitting, with an application to multiomic breast cancer data." Ann. Appl. Stat. 14 (2) 752-767, June 2020. https://doi.org/10.1214/19-AOAS1317

Information

Received: 1 November 2018; Revised: 1 November 2019; Published: June 2020
First available in Project Euclid: 29 June 2020

zbMATH: 07239882
MathSciNet: MR4117828
Digital Object Identifier: 10.1214/19-AOAS1317

Keywords: cluster merging and splitting, clustering, multiomic data, multiview, TCGA

Rights: Copyright © 2020 Institute of Mathematical Statistics

Vol. 14 • No. 2 • June 2020