The Annals of Applied Statistics

A phylogenetic latent feature model for clonal deconvolution

Francesco Marass, Florent Mouliere, Ke Yuan, Nitzan Rosenfeld, and Florian Markowetz

Full-text: Open access


Tumours develop in an evolutionary process, in which the accumulation of mutations produces subpopulations of cells with distinct mutational profiles, called clones. This process leads to the genetic heterogeneity widely observed in tumour sequencing data, but identifying the genotypes and frequencies of the different clones is still a major challenge. Here, we present Cloe, a phylogenetic latent feature model to deconvolute tumour sequencing data into a set of related genotypes. Our approach extends latent feature models by placing the features as nodes in a latent tree. The resulting model can capture both the acquisition and the loss of mutations, as well as episodes of convergent evolution. We establish the validity of Cloe on synthetic data and assess its performance on controlled biological data, comparing our reconstructions to those of several published state-of-the-art methods. We show that our method provides highly accurate reconstructions and identifies the number of clones, their genotypes and frequencies even at a modest sequencing depth. As a proof of concept, we apply our model to clinical data from three cases with chronic lymphocytic leukaemia and one case with acute myeloid leukaemia.
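As an illustration of what deconvolution into genotypes and frequencies means, the following minimal sketch computes the expected variant allele fractions implied by a set of clonal genotypes and their frequencies. It is not the Cloe implementation. The function name `expected_vaf`, the toy matrices, and the factor 0.5 (heterozygous mutations in diploid regions) are assumptions made here for illustration only.

```python
import numpy as np


def expected_vaf(genotypes, clone_fractions):
    """Expected variant allele fraction for each mutation in each sample.

    genotypes:       (mutations x clones) binary matrix; 1 = clone carries the mutation.
    clone_fractions: (clones x samples) matrix; each column sums to 1.

    Assumption (not from the paper): every mutation is heterozygous in a
    diploid region, so a clone at fraction f contributes f / 2.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    clone_fractions = np.asarray(clone_fractions, dtype=float)
    return 0.5 * genotypes @ clone_fractions


# Hypothetical toy example: clone 0 is normal, clone 1 descends from clone 0,
# clone 2 descends from clone 1, gains mutation 1 and loses mutation 2.
Z = np.array([
    [0, 1, 1],  # mutation 0: acquired in clone 1, inherited by clone 2
    [0, 0, 1],  # mutation 1: acquired in clone 2
    [0, 1, 0],  # mutation 2: acquired in clone 1, lost in clone 2
])
F = np.array([[0.2], [0.5], [0.3]])  # one sample, clone frequencies sum to 1

vaf = expected_vaf(Z, F)
print(vaf.ravel())
```

Deconvolution is the inverse problem: given noisy read counts whose expectations follow such a product, infer the number of clones, the genotype matrix, and the frequencies.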

Article information

Ann. Appl. Stat. Volume 10, Number 4 (2016), 2377-2404.

Received: April 2016
Revised: August 2016
First available in Project Euclid: 5 January 2017

Digital Object Identifier: doi:10.1214/16-AOAS986

Keywords: clonal deconvolution; tumour heterogeneity; latent feature model; phylogeny; admixture


Marass, Francesco; Mouliere, Florent; Yuan, Ke; Rosenfeld, Nitzan; Markowetz, Florian. A phylogenetic latent feature model for clonal deconvolution. Ann. Appl. Stat. 10 (2016), no. 4, 2377-2404. doi:10.1214/16-AOAS986.


