Abstract
Mendelian randomization (MR) is a widely used method to estimate the causal relationship between a risk factor and disease. A fundamental part of any MR analysis is to choose appropriate genetic variants as instrumental variables. Genome-wide association studies often reveal that hundreds of genetic variants may be robustly associated with a risk factor, but in some situations investigators may have greater confidence in the instrument validity of only a smaller subset of variants. Nevertheless, the use of additional instruments may be optimal from the perspective of mean squared error, even if they are slightly invalid; a small bias in estimation may be a price worth paying for a larger reduction in variance. For this purpose we consider a method for “focused” instrument selection whereby genetic variants are selected to minimise the estimated asymptotic mean squared error of causal effect estimates. In a setting of many weak and locally invalid instruments, we propose a novel strategy to construct confidence intervals for postselection focused estimators that guards against the worst-case loss in asymptotic coverage. In empirical applications that (i) validate lipid drug targets and (ii) investigate the effects of vitamin D on a wide range of outcomes, our findings suggest that the optimal selection of instruments involves not only a small number of biologically justified instruments but also many potentially invalid instruments.
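The bias-variance trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes independent variants, a standard inverse-variance weighted (IVW) estimate from two-sample summary statistics, a trusted "core" set of valid instruments, and candidate sets that each contain the core. The function names `ivw` and `focused_select`, and the Hausman-style bias correction, are illustrative choices made here.

```python
import numpy as np

def ivw(beta_x, beta_y, se_y, idx):
    """IVW causal estimate and first-order variance using the variants in idx."""
    idx = np.asarray(idx)
    w = beta_x[idx] ** 2 / se_y[idx] ** 2          # inverse-variance weights of the ratios
    est = np.sum(w * beta_y[idx] / beta_x[idx]) / np.sum(w)
    return est, 1.0 / np.sum(w)

def focused_select(beta_x, beta_y, se_y, core, candidate_sets):
    """Pick the candidate set with the smallest estimated mean squared error.

    Assumption: every candidate set contains the core set, and the core
    estimate is treated as unbiased when estimating the bias of larger sets.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    est_core, var_core = ivw(beta_x, beta_y, se_y, core)
    best = None
    for s in candidate_sets:
        est_s, var_s = ivw(beta_x, beta_y, se_y, s)
        # Squared difference from the core estimate overstates bias^2 by the
        # variance of the difference (var_core - var_s); subtract, floor at 0.
        bias_sq = max((est_s - est_core) ** 2 - (var_core - var_s), 0.0)
        mse = bias_sq + var_s
        if best is None or mse < best[1]:
            best = (list(s), mse, est_s)
    return best  # (selected variants, estimated MSE, causal estimate)

# Toy summary statistics: variants 0 and 1 form the trusted core.
bx = [1.0, 1.0, 1.0, 1.0]
se = [0.1, 0.1, 0.1, 0.1]
sets = [[0, 1], [0, 1, 2, 3]]
all_valid = focused_select(bx, [0.5, 0.5, 0.5, 0.5], se, [0, 1], sets)
two_invalid = focused_select(bx, [0.5, 0.5, 5.0, 5.0], se, [0, 1], sets)
```

In the first toy call the extra variants agree with the core, so the larger set is selected for its lower variance; in the second they are strongly invalid, so the estimated squared bias dominates and the core set is kept.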
Funding Statement
SB was supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (204623/Z/16/Z).
VZ was supported by the United Kingdom Research and Innovation Medical Research Council (MR/W029790/1).
This research was funded by the United Kingdom Research and Innovation Medical Research Council (MC-UU-00002/7) and supported by the National Institute for Health Research Cambridge Biomedical Research Centre: BRC-1215-20014.
Acknowledgments
We thank participants at the 2022 International Society for Clinical Biostatistics conference, participants at the 2023 Siena Workshop on Econometric Theory and Applications, and Dipender Gill for helpful discussions. We thank an anonymous referee for detailed comments.
Citation
Ashish Patel, Francis J. DiTraglia, Verena Zuber, Stephen Burgess. "Selecting invalid instruments to improve Mendelian randomization with two-sample summary data." Ann. Appl. Stat. 18 (2) 1729-1749, June 2024. https://doi.org/10.1214/23-AOAS1856
Information