December 2023
A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
Andrea Cappozzo, Francesca Ieva, Giovanni Fiorito
Ann. Appl. Stat. 17(4): 3257-3282 (December 2023). DOI: 10.1214/23-AOAS1760

Abstract

Recent evidence highlights the usefulness of DNA methylation (DNAm) biomarkers as surrogates for exposure to risk factors for noncommunicable diseases in epidemiological studies and randomized trials. DNAm variability has been shown to be tightly related to lifestyle behavior and exposure to environmental risk factors, ultimately providing an unbiased proxy of an individual's state of health. At present, the creation of DNAm surrogates relies on univariate penalized regression models, with the elastic-net regularizer being the gold standard for the task. Nonetheless, more advanced modeling procedures are required in the presence of multivariate outcomes with a structured dependence pattern among the study samples. In this work we propose a general framework for mixed-effects multitask learning in the presence of high-dimensional predictors to develop a multivariate DNAm biomarker from a multicenter study. A penalized estimation scheme, based on an expectation-maximization algorithm, is devised in which any penalty criterion for fixed-effects models can be conveniently incorporated in the fitting process. We apply the proposed methodology to create novel DNAm surrogate biomarkers for multiple correlated risk factors for cardiovascular diseases and comorbidities. We show that the proposed approach, which models multiple outcomes together, outperforms state-of-the-art alternatives both in predictive power and in biomolecular interpretation of the results.
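To make the idea of a penalized EM scheme concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a much simpler setting than the one in the paper: a single response, a random intercept per center, and a lasso penalty standing in for a generic fixed-effects penalty. The function names (`soft_threshold`, `lasso_cd`, `penalized_em`), the simulated data, and the value of `lam` are all hypothetical choices made for illustration. The point it shows is that the M-step for the fixed effects reduces to an ordinary penalized regression on a working response, so a different penalty could be swapped in at that step.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, threshold, beta, n_sweeps=50):
    """Coordinate descent for 0.5 * ||y - X beta||^2 + threshold * ||beta||_1."""
    beta = beta.copy()
    col_sq = (X ** 2).sum(axis=0)
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for k in range(X.shape[1]):
            resid += X[:, k] * beta[k]   # remove coordinate k from the fit
            beta[k] = soft_threshold(X[:, k] @ resid, threshold) / col_sq[k]
            resid -= X[:, k] * beta[k]   # add the updated coordinate back
    return beta

def penalized_em(X, y, groups, lam, n_iter=50):
    """EM sketch for y_ij = x_ij' beta + b_j + eps_ij with a lasso penalty on beta.

    Assumes b_j ~ N(0, tau2), eps_ij ~ N(0, sigma2), and that `groups`
    holds integer center labels 0, ..., J-1 with every center non-empty.
    """
    n, p = X.shape
    n_groups = groups.max() + 1
    counts = np.bincount(groups, minlength=n_groups)
    beta, sigma2, tau2 = np.zeros(p), 1.0, 1.0
    for _ in range(n_iter):
        # E-step: Gaussian posterior of each random intercept given current parameters.
        resid = y - X @ beta
        post_var = 1.0 / (counts / sigma2 + 1.0 / tau2)
        resid_sums = np.bincount(groups, weights=resid, minlength=n_groups)
        post_mean = post_var * resid_sums / sigma2
        # M-step (fixed effects): penalized regression on the working response.
        # Any other fixed-effects penalty solver could replace lasso_cd here.
        beta = lasso_cd(X, y - post_mean[groups], sigma2 * lam, beta)
        # M-step (variances): closed-form updates from posterior moments.
        resid = y - X @ beta - post_mean[groups]
        sigma2 = (resid @ resid + (counts * post_var).sum()) / n
        tau2 = (post_mean @ post_mean + post_var.sum()) / n_groups
    return beta, sigma2, tau2

# Simulated example: 20 centers, 30 samples each, 2 of 10 predictors active.
rng = np.random.default_rng(0)
n_groups, n_per, p = 20, 30, 10
groups = np.repeat(np.arange(n_groups), n_per)
X = rng.normal(size=(n_groups * n_per, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
center_effects = rng.normal(size=n_groups)
y = X @ beta_true + center_effects[groups] + 0.5 * rng.normal(size=len(groups))

beta_hat, sigma2_hat, tau2_hat = penalized_em(X, y, groups, lam=100.0)
print(np.round(beta_hat, 2), round(sigma2_hat, 3), round(tau2_hat, 3))
```

In this sketch the lasso threshold is `sigma2 * lam` because the penalty is placed on the log-likelihood scale. Extending it to several correlated responses and richer random-effects structures, as the abstract describes, would require multivariate covariance updates that are not shown here.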

Acknowledgments

The authors would like to thank the three anonymous reviewers, the Editor, and the Associate Editor for their thorough examination of the manuscript and their valuable, constructive comments. Their input has greatly enhanced the quality of the work. The authors also thank the EPIC Italy research group (Carlotta Sacerdote, Vittorio Krogh, Domenico Palli, Salvatore Panico, Rosario Tumino, Paolo Vineis and their collaborators) for giving access to the data used in this study. Francesca Ieva acknowledges her affiliation to Human Technopole (https://humantechnopole.it/en/), where she holds the position of Associate Head at the Health Data Science Center (francesca.ieva@fht.org).

Citation


Andrea Cappozzo, Francesca Ieva, Giovanni Fiorito. "A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation." Ann. Appl. Stat. 17(4): 3257-3282, December 2023. https://doi.org/10.1214/23-AOAS1760

Information

Received: 1 April 2022; Revised: 1 February 2023; Published: December 2023
First available in Project Euclid: 30 October 2023

MathSciNet: MR4661697
Digital Object Identifier: 10.1214/23-AOAS1760

Keywords: EM algorithm, mixed-effects models, multitask learning, multivariate regression, penalized estimation, personalized medicine

Rights: Copyright © 2023 Institute of Mathematical Statistics

Vol. 17 • No. 4 • December 2023