Abstract
Cytometry is the standard multi-parameter assay for measuring single-cell phenotype and functionality. It is commonly used for quantifying the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of cytometry data involves cell classification (that is, the identification of cell subgroups in the sample) and comparisons of the cell subgroups across samples or conditions. While modern experiments often necessitate the collection and processing of samples in multiple batches, analysis of cytometry data across batches is challenging because differences across samples may arise from either true biological variation or technical factors such as antibody lot effects or differences in instrument optics across batches. Thus a critical step in comparative analyses of multi-sample cytometry data, yet one missing from existing automated methods for analyzing such data, is cross-sample calibration, whose goal is to align corresponding cell subsets across multiple samples in the presence of technical variations, so that biological variations can be meaningfully compared. We introduce a Bayesian nonparametric hierarchical modeling approach for accomplishing both calibration and cell classification simultaneously in a unified probabilistic manner. Three important features of our method make it particularly effective for analyzing multi-sample cytometry data: a nonparametric mixture that avoids prespecifying the number of cell clusters; a hierarchical skew normal kernel that allows flexibility in the shapes of the cell subsets and cross-sample variation in their locations; and a “coarsening” strategy that makes inference robust to departures from the model not captured by the skew normal kernels. We demonstrate the merits of our approach in simulated examples and carry out a case study in the analysis of a multi-sample cytometry data set. We provide an R package for our method.
Funding Statement
Li Ma’s research is partly supported by National Science Foundation grants DMS-1749789 and DMS-2013930. This work has also been supported in part through an EQAPOL collaboration with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Contract Number HHSN272201700061C and the Duke University Center for AIDS Research (CFAR), an NIH funded program (5P30 AI064518).
Acknowledgments
We thank two anonymous referees and an editor for their thoughtful and constructive comments that helped us improve the paper.
Citation
Shai Gorsky, Cliburn Chan, Li Ma. "Coarsened Mixtures of Hierarchical Skew Normal Kernels for Flow and Mass Cytometry Analyses." Bayesian Anal. 19 (2) 439-463, June 2024. https://doi.org/10.1214/22-BA1356