Variable Selection Using Bayesian Additive Regression Trees
Chuji Luo, Michael J. Daniels
Statist. Sci. 39(2): 286-304 (May 2024). DOI: 10.1214/23-STS900

Abstract

Variable selection is an important statistical problem. This problem becomes more challenging when the candidate predictors are of mixed type (e.g., continuous and binary) and impact the response variable in nonlinear and/or nonadditive ways. In this paper, we review existing variable selection approaches for the Bayesian additive regression trees (BART) model, a nonparametric regression model, which is flexible enough to capture the interactions between predictors and nonlinear relationships with the response. An emphasis of this review is on the ability to identify relevant predictors. We also propose two variable importance measures, which can be used in a permutation-based variable selection approach, and a backward variable selection procedure for BART. We introduce these variations as a way of illustrating limitations and opportunities for improving current approaches and assess these via simulations.
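
As a rough illustration of what a permutation-based variable selection approach looks like in general, the sketch below permutes the response to build a null distribution for each predictor's importance score, then keeps the predictors whose observed score is unusually large relative to that null. It is not the authors' procedure, and the abstract does not specify their importance measures. The function names (`abs_correlation`, `importance_scores`, `permutation_select`), the parameters `n_perm` and `alpha`, and the simulated data are all hypothetical. The importance score here is the absolute Pearson correlation, used only as a stand-in so the example runs without a BART implementation; in a BART setting, `importance_scores` would instead refit the model and return a BART-derived measure.

```python
import random


def abs_correlation(xs, ys):
    """Stand-in importance score: |Pearson correlation| (NOT a BART measure)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return abs(sxy) / (sxx * syy) ** 0.5


def importance_scores(columns, y):
    """Hypothetical hook: one score per predictor (swap in a model-based measure)."""
    return [abs_correlation(col, y) for col in columns]


def permutation_select(columns, y, n_perm=200, alpha=0.05, seed=0):
    """Return indices of predictors whose score exceeds its permutation null."""
    rng = random.Random(seed)
    observed = importance_scores(columns, y)

    # Null distribution: shuffle the response to break any link to the predictors.
    null = [[] for _ in columns]
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        for j, score in enumerate(importance_scores(columns, y_perm)):
            null[j].append(score)

    selected = []
    for j, obs in enumerate(observed):
        exceed = sum(1 for s in null[j] if s >= obs)
        p_value = (exceed + 1) / (n_perm + 1)  # add-one permutation p-value
        if p_value <= alpha:
            selected.append(j)
    return selected


# Simulated mixed-type predictors (assumed data, for illustration only).
rng = random.Random(1)
n = 100
x0 = [rng.gauss(0, 1) for _ in range(n)]             # continuous, relevant
x1 = [float(rng.random() < 0.5) for _ in range(n)]   # binary, irrelevant
x2 = [rng.gauss(0, 1) for _ in range(n)]             # continuous, irrelevant
y = [2.0 * a + rng.gauss(0, 0.1) for a in x0]

selected = permutation_select([x0, x1, x2], y)
```

Each predictor is tested against its own null at level `alpha`, so an irrelevant predictor can still be selected by chance; a stricter rule (for example, comparing against the maximum null score across predictors) would trade some power for fewer false selections.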

Funding Statement

Luo and Daniels were partially supported by NIH R01 CA183854. Daniels was also partially supported by NIH R01 HL166324.

Acknowledgments

The authors thank the Editor, Associate Editor and three referees whose comments greatly improved the manuscript.

Citation

Chuji Luo, Michael J. Daniels. "Variable Selection Using Bayesian Additive Regression Trees." Statist. Sci. 39 (2) 286-304, May 2024. https://doi.org/10.1214/23-STS900

Information

Published: May 2024
First available in Project Euclid: 5 May 2024

Digital Object Identifier: 10.1214/23-STS900

Keywords: BART, Feature selection, Nonparametric regression

Rights: Copyright © 2024 Institute of Mathematical Statistics

JOURNAL ARTICLE
19 PAGES
