Open Access
Identifying main effects and interactions among exposures using Gaussian processes
Federico Ferrari, David B. Dunson
Ann. Appl. Stat. 14(4): 1743-1758 (December 2020). DOI: 10.1214/20-AOAS1363

Abstract

This article is motivated by the problem of studying the joint effect of different chemical exposures on human health outcomes. This is essentially a nonparametric regression problem, with interest being focused not on a black box for prediction but instead on selection of main effects and interactions. For interpretability we decompose the expected health outcome into a linear main effect, pairwise interactions and a nonlinear deviation. Our interest is in model selection for these different components, accounting for uncertainty and addressing nonidentifiability between the linear and nonparametric components of the semiparametric model. We propose a Bayesian approach to inference, placing variable selection priors on the different components, and developing a Markov chain Monte Carlo (MCMC) algorithm. A key component of our approach is the incorporation of a heredity constraint to only include interactions in the presence of main effects, effectively reducing dimensionality of the model search. We adapt a projection approach developed in the spatial statistics literature to enforce identifiability in modeling the nonparametric component using a Gaussian process. We also employ a dimension reduction strategy to sample the nonlinear random effects that aids the mixing of the MCMC algorithm. The proposed MixSelect framework is evaluated using a simulation study, and is illustrated using data from the National Health and Nutrition Examination Survey (NHANES). Code is available on GitHub.
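
Two ideas in the abstract can be illustrated with a small sketch: the heredity constraint, which admits an interaction only when its main effects are present, and the projection that keeps the Gaussian process component from duplicating the linear part. The Python code below is an illustration under stated assumptions, not the authors' MixSelect implementation. The function names, the squared-exponential kernel, and the use of the orthogonal projector P = I - X(X'X)^{-1}X' are assumptions chosen for the example; the paper's exact construction may differ.

```python
import numpy as np

def heredity_mask(gamma):
    """Strong heredity (assumed form): interaction (j, k) is eligible
    only if both main-effect indicators gamma[j] and gamma[k] are 1."""
    gamma = np.asarray(gamma, dtype=bool)
    eligible = np.outer(gamma, gamma)
    # Keep each unordered pair once (strict upper triangle).
    return np.triu(eligible, k=1)

def orthogonal_projector(X):
    """Projector onto the orthogonal complement of the column space of X.
    Assumes X has full column rank."""
    Q, _ = np.linalg.qr(X)  # reduced QR: columns of Q span col(X)
    return np.eye(X.shape[0]) - Q @ Q.T

def rbf_kernel(X, lengthscale=1.0):
    """Squared-exponential kernel (a hypothetical choice for this sketch)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def projected_kernel(X, lengthscale=1.0):
    """Covariance P K P, so GP draws are orthogonal to the columns of X."""
    P = orthogonal_projector(X)
    return P @ rbf_kernel(X, lengthscale) @ P

# Usage: three exposures, main effects 0 and 2 included.
mask = heredity_mask([1, 0, 1])      # only the pair (0, 2) is eligible
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
K_proj = projected_kernel(X)         # satisfies K_proj @ X ~ 0
```

Because the projected covariance annihilates the columns of X, any linear trend in the exposures can only be attributed to the linear term, which is one way to address the nonidentifiability the abstract mentions.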

Citation

Federico Ferrari, David B. Dunson. "Identifying main effects and interactions among exposures using Gaussian processes." Ann. Appl. Stat. 14(4): 1743-1758, December 2020. https://doi.org/10.1214/20-AOAS1363

Information

Received: 1 November 2019; Revised: 1 April 2020; Published: December 2020
First available in Project Euclid: 19 December 2020

MathSciNet: MR4194246
Digital Object Identifier: 10.1214/20-AOAS1363

Keywords: Bayesian modeling, chemical mixtures, Gaussian process, interaction selection, semiparametric, strong heredity, variable selection

Rights: Copyright © 2020 Institute of Mathematical Statistics
