Structured hierarchical models for probabilistic inference from perturbation screening data

Simon Dirmeier; Niko Beerenwinkel

doi:10.1214/21-AOAS1580

Ann. Appl. Stat. 16(3): 2010-2029 (September 2022). DOI: 10.1214/21-AOAS1580

Abstract

Genetic perturbation screening is an experimental method in biology to study cause-and-effect relationships between different biological entities. However, knocking out or knocking down genes is a highly error-prone process, which complicates estimation of the effect sizes of the interventions. Here, we introduce a family of generative models, called the structured hierarchical model (SHM), for probabilistic inference of causal effects from perturbation screens. SHMs utilize classical hierarchical models to represent heterogeneous data and combine them with categorical Markov random fields to encode biological prior information over functionally related biological entities. The random field induces a clustering of functionally related genes, which informs inference of parameters in the hierarchical model. The SHM is designed for extremely noisy data sets for which the true data generating process is difficult to model due to a lack of domain knowledge or high stochasticity of the interventions. We apply the SHM to a pan-cancer genetic perturbation screen to identify genes that restrict the growth of an entire group of cancer cell lines and show that incorporating prior knowledge in the form of a graph improves inference of parameters.
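The combination described in the abstract, a categorical Markov random field over a gene graph feeding a hierarchical model, can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the authors' model or inference code: the Potts-type form of the random field, the toy graph, the two clusters, the cell-line level, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy gene graph: two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n_genes, n_clusters = 6, 2
neighbors = [[] for _ in range(n_genes)]
for i, j in edges:
    neighbors[i].append(j)
    neighbors[j].append(i)


def gibbs_sweep(z, coupling, rng):
    """One Gibbs sweep over a Potts-type categorical MRF on cluster labels z.

    Assumption: the probability of a label grows with the number of graph
    neighbors that currently carry the same label.
    """
    for g in range(n_genes):
        counts = np.bincount(z[neighbors[g]], minlength=n_clusters)
        logits = coupling * counts
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        z[g] = rng.choice(n_clusters, p=probs)
    return z


# Sample cluster labels from the random field (graph-informed clustering).
z = rng.integers(n_clusters, size=n_genes)
for _ in range(50):
    z = gibbs_sweep(z, coupling=1.5, rng=rng)

# Hierarchical layer (all values hypothetical):
# cluster mean -> gene effect -> gene-by-cell-line effect -> noisy readout.
cluster_means = np.array([0.0, -1.0])  # e.g. "no effect" vs. "restricts growth"
n_lines, n_reps = 3, 4
gene_effect = rng.normal(cluster_means[z], 0.2)
line_effect = rng.normal(gene_effect[:, None], 0.3, size=(n_genes, n_lines))
readout = rng.normal(line_effect[:, :, None], 0.5,
                     size=(n_genes, n_lines, n_reps))

print("cluster labels:", z)
print("readout shape:", readout.shape)
```

In this sketch the graph enters only through the cluster labels, so genes that are connected tend to share a cluster mean. That is one simple way for prior network knowledge to inform the gene-level parameters; inference would run in the opposite direction, from the readouts back to labels and effects.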

Citation

Simon Dirmeier, Niko Beerenwinkel. "Structured hierarchical models for probabilistic inference from perturbation screening data." Ann. Appl. Stat. 16 (3): 2010-2029, September 2022. https://doi.org/10.1214/21-AOAS1580

Information

Received: 1 November 2019; Revised: 1 July 2021; Published: September 2022

First available in Project Euclid: 19 July 2022

MathSciNet: MR4455354

zbMATH: 1498.62208

Digital Object Identifier: 10.1214/21-AOAS1580

Keywords: biological network, genetic perturbation screen, hierarchical models, interventional data, Markov random fields, probabilistic models, PyMC3, Python

JOURNAL ARTICLE
20 PAGES
