Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 10, Number 4 (2016), 2377-2404.
A phylogenetic latent feature model for clonal deconvolution
Tumours develop in an evolutionary process, in which the accumulation of mutations produces subpopulations of cells with distinct mutational profiles, called clones. This process leads to the genetic heterogeneity widely observed in tumour sequencing data, but identifying the genotypes and frequencies of the different clones is still a major challenge. Here, we present Cloe, a phylogenetic latent feature model to deconvolute tumour sequencing data into a set of related genotypes. Our approach extends latent feature models by placing the features as nodes in a latent tree. The resulting model can capture both the acquisition and the loss of mutations, as well as episodes of convergent evolution. We establish the validity of Cloe on synthetic data and assess its performance on controlled biological data, comparing our reconstructions to those of several published state-of-the-art methods. We show that our method provides highly accurate reconstructions and identifies the number of clones, their genotypes and frequencies even at a modest sequencing depth. As a proof of concept, we apply our model to clinical data from three cases with chronic lymphocytic leukaemia and one case with acute myeloid leukaemia.
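As a rough illustration of what deconvolution into clone genotypes and frequencies means, the sketch below mixes binary clone genotypes by their frequencies and scores observed mutant read counts against the result. It is a simplified stand-in, not Cloe's actual model or likelihood. The function names, the toy numbers, the binomial read model and the factor of 0.5 (heterozygous mutations in diploid cells) are all assumptions made for this example, and the latent tree over clones is left out entirely.

```python
from math import comb, log

def expected_cell_fractions(genotypes, clone_fractions):
    """genotypes[j][k] is 1 if mutation j is present in clone k, else 0.
    clone_fractions[k] is the fraction of cells that belong to clone k.
    Returns, for each mutation, the fraction of cells carrying it."""
    return [sum(g * f for g, f in zip(row, clone_fractions)) for row in genotypes]

def binomial_log_likelihood(mutant_reads, depths, read_probs):
    """Log-likelihood of mutant read counts under independent binomials
    (an assumed read model for this sketch)."""
    total = 0.0
    for r, d, p in zip(mutant_reads, depths, read_probs):
        p = min(max(p, 1e-9), 1 - 1e-9)  # keep log() finite at 0 and 1
        total += log(comb(d, r)) + r * log(p) + (d - r) * log(1 - p)
    return total

# Hypothetical toy data: 3 mutations, 3 clones (clone 0 stands for normal cells).
genotypes = [
    [0, 1, 1],  # mutation shared by clones 1 and 2
    [0, 0, 1],  # mutation private to clone 2
    [0, 1, 0],  # mutation in clone 1 only
]
clone_fractions = [0.2, 0.5, 0.3]

cell_fractions = expected_cell_fractions(genotypes, clone_fractions)
# Assumption: heterozygous, diploid, so half the reads from a mutated cell are mutant.
read_probs = [0.5 * c for c in cell_fractions]

mutant_reads = [40, 15, 25]   # hypothetical mutant read counts
depths = [100, 100, 100]      # hypothetical sequencing depths
score = binomial_log_likelihood(mutant_reads, depths, read_probs)
```

Deconvolution runs in the opposite direction: only the read counts and depths are observed, and the genotypes and clone frequencies that explain them have to be inferred.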
Received: April 2016
Revised: August 2016
First available in Project Euclid: 5 January 2017
Marass, Francesco; Mouliere, Florent; Yuan, Ke; Rosenfeld, Nitzan; Markowetz, Florian. A phylogenetic latent feature model for clonal deconvolution. Ann. Appl. Stat. 10 (2016), no. 4, 2377-2404. doi:10.1214/16-AOAS986. https://projecteuclid.org/euclid.aoas/1483606864
- Supplement A: Supplementary information. Supplementary text and figures.
- Supplement B: Source code of the analyses. This package contains the scripts and data (matrices of mutant read counts and depths) analysed in this article, together with a version of Cloe, so that the findings can be reproduced.