Open Access
June 2020
Multiview cluster aggregation and splitting, with an application to multiomic breast cancer data
Antoine Godichon-Baggioni, Cathy Maugis-Rabusseau, Andrea Rau
Ann. Appl. Stat. 14(2): 752-767 (June 2020). DOI: 10.1214/19-AOAS1317
Abstract

Multiview data, which represent distinct but related groupings of variables, can be useful for identifying relevant and robust clustering structures among observations. A large number of multiview classification algorithms have been proposed in the fields of computer science and genomics; here, we instead focus on the task of merging or splitting an existing hard or soft cluster partition based on multiview data. This article is specifically motivated by an application involving multiomic breast cancer data from The Cancer Genome Atlas, where multiple molecular profiles (gene expression, microRNA expression, methylation and copy number alterations) are used to further subdivide the five currently accepted intrinsic tumor subtypes into distinct subgroups of patients. In addition, we investigate the performance of the proposed multiview splitting and aggregation algorithms, as compared to single- and concatenated-view alternatives, in a set of simulations. The multiview splitting and aggregation algorithms developed here are implemented in the maskmeans R package.

Copyright © 2020 Institute of Mathematical Statistics
Antoine Godichon-Baggioni, Cathy Maugis-Rabusseau, and Andrea Rau "Multiview cluster aggregation and splitting, with an application to multiomic breast cancer data," The Annals of Applied Statistics 14(2), 752-767, (June 2020). https://doi.org/10.1214/19-AOAS1317
Received: 1 November 2018; Published: June 2020
Vol. 14 • No. 2 • June 2020