When treatment effect heterogeneity exists, identifying the subgroup of patients who would benefit from an active treatment relative to a control is an important question. This article focuses on subgroup identification in the presence of a high-dimensional set of covariates, with the number of covariates possibly greater than the sample size. We approach this problem from the perspective of optimal treatment decision rules and propose methods that simultaneously estimate the treatment decision rule and select the prescriptive variables important for treatment decision making and subgroup identification. The proposed methods are built within a robust classification framework based on doubly robust augmented inverse probability weighted estimators (AIPWE) and hence share their robustness property. An ℓ1 (lasso-type) penalty is used within the classification framework to target selection of prescriptive variables. We further propose a backward elimination process for fine-tuning the selection. The methods can be conveniently implemented using standard software for logistic regression and the lasso. The methods are evaluated in extensive simulation studies, which demonstrate the superior and robust performance of the proposed methods relative to existing ones. In addition, the decision rules estimated by the proposed methods are considerably simpler than those from other methods. We applied various methods to identify the subgroup of patients suitable for each of two commonly used anticoagulants, in terms of bleeding risk, among patients with acute myocardial infarction undergoing percutaneous coronary intervention.
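To make the classification perspective concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common formulation: each subject's treatment contrast is estimated by an AIPW estimator, the sign of the contrast becomes a class label, its absolute value becomes a weight, and a weighted lasso-penalized logistic regression is fitted. All function names, the known propensity of 0.5, the constant working outcome model, and the simulated data are hypothetical choices made for illustration. The backward elimination step is not shown.

```python
import math
import random


def aipw_contrast(y, a, pi, mu1, mu0):
    """AIPW estimate of one subject's treatment contrast (treated minus control)."""
    treated = a * y / pi - (a - pi) / pi * mu1
    control = (1 - a) * y / (1 - pi) + (a - pi) / (1 - pi) * mu0
    return treated - control


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the l1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_weighted_l1_logistic(X, z, w, lam, step=0.1, iters=500):
    """Proximal-gradient fit of a weighted, l1-penalized logistic regression.

    Returns (intercept, coefficients); the intercept is not penalized.
    """
    p = len(X[0])
    total = sum(w)
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for xi, zi, wi in zip(X, z, w):
            eta = b0 + sum(bj * xj for bj, xj in zip(beta, xi))
            r = wi * (sigmoid(eta) - zi) / total  # weighted residual
            g0 += r
            for j in range(p):
                g[j] += r * xi[j]
        b0 -= step * g0
        beta = [soft_threshold(bj - step * gj, step * lam)
                for bj, gj in zip(beta, g)]
    return b0, beta


if __name__ == "__main__":
    # Hypothetical simulated trial: only covariate 0 modifies the treatment effect.
    random.seed(0)
    n, p = 200, 6
    X = [[random.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
    A = [random.randint(0, 1) for _ in range(n)]  # randomized, propensity 0.5
    Y = [1 + 0.5 * x[1] + a * 2 * x[0] + random.gauss(0, 0.5)
         for x, a in zip(X, A)]
    ybar = sum(Y) / n  # crude constant working outcome model (assumption)
    C = [aipw_contrast(y, a, 0.5, ybar, ybar) for y, a in zip(Y, A)]
    Z = [1 if c > 0 else 0 for c in C]  # class label: sign of the contrast
    W = [abs(c) for c in C]             # weight: magnitude of the contrast
    b0, beta = fit_weighted_l1_logistic(X, Z, W, lam=0.05)
    print("selected covariates:", [j for j, b in enumerate(beta) if b != 0.0])
```

In this sketch the estimated rule assigns the active treatment when the fitted linear score `b0 + sum(beta[j] * x[j])` is positive, and the covariates with nonzero coefficients play the role of the selected prescriptive variables. In practice the hand-written solver would likely be replaced by standard logistic regression and lasso software that accepts observation weights.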
The first author’s work is supported by the National Natural Science Foundation of China (71701120) and the Program for Innovative Research Team of Shanghai University of Finance and Economics.
"Subgroup identification and variable selection for treatment decision making." Ann. Appl. Stat. 16 (1) 40 - 59, March 2022. https://doi.org/10.1214/21-AOAS1468