Abstract
Family history is a major risk factor for many types of cancer. Mendelian risk prediction models translate family histories into cancer risk predictions, based on knowledge of cancer susceptibility genes. These models are widely used in clinical practice to help identify high-risk individuals. Mendelian models leverage the entire family history, but they rely on many assumptions about cancer susceptibility genes that are either unrealistic or challenging to validate, due to low mutation prevalence. Training more flexible models, such as neural networks, on large databases of pedigrees can potentially lead to accuracy gains. In this paper, we develop a framework to apply neural networks to family history data and investigate their ability to learn inherited susceptibility to cancer. While there is an extensive literature on neural networks and their state-of-the-art performance in many tasks, there is little work applying them to family history data. We propose adaptations of fully-connected neural networks and convolutional neural networks to pedigrees. In data simulated under Mendelian inheritance, we demonstrate that our proposed neural network models are able to achieve nearly optimal prediction performance. Moreover, when the observed family history includes misreported cancer diagnoses, neural networks are able to outperform the Mendelian BRCAPRO model, which embeds the correct inheritance laws. Using a large dataset of over 200,000 family histories, the Risk Service cohort, we train prediction models for future risk of breast cancer. We validate the models using data from the Cancer Genetics Network.
Funding Statement
This work was supported by the Friends of Dana-Farber Fund, NSF Award 1810829, and NSERC PGSD35023622017.
Acknowledgments
Danielle Braun and Lorenzo Trippa contributed equally. Giovanni Parmigiani and Lorenzo Trippa are also affiliated with the Department of Biostatistics at the Harvard T.H. Chan School of Public Health, and Danielle Braun is also affiliated with the Department of Data Sciences at Dana-Farber Cancer Institute. The authors thank the Editor, Associate Editor, and reviewers for their constructive comments and suggestions. The authors also thank Matthew Ploenzke for helpful suggestions.
Citation
Zoe Guan, Giovanni Parmigiani, Danielle Braun, Lorenzo Trippa. "Prediction of hereditary cancers using neural networks." Ann. Appl. Stat. 16 (1) 495-520, March 2022. https://doi.org/10.1214/21-AOAS1510