Abstract
We develop a doubly penalized constrained maximum likelihood (dPCML) method for using summary information from external studies to improve estimation efficiency for an internal study that has individual-level data, in the presence of study population heterogeneity and external information uncertainty. The dPCML method can simultaneously select and incorporate the external information that agrees with the internal study while properly accounting for the uncertainty of the external information. It allows partial information where only some but not all parameter estimates from external models are reported and/or certain parameters are known to be unequal between the internal and external studies. It can still effectively account for the external information uncertainty with only external sample sizes available instead of standard errors of parameter estimates. It covers some existing data integration methods as special cases. A detailed theoretical investigation is carried out to establish asymptotic properties of the dPCML estimator, including estimation consistency, external information selection consistency, and asymptotic normality. We also provide an algorithm for implementation and conduct comprehensive simulation studies. As an application, we build an updated model to study the risk of having high-grade prostate cancer by integrating information from two widely used risk calculators.
Acknowledgments
The authors would like to thank the Editor, Associate Editor, and two referees for their helpful comments that improved the quality of this work.
Citation
Yuqi Zhai, Peisong Han. "Integrating external summary information under population heterogeneity and information uncertainty." Electron. J. Statist. 18 (2) 5304-5329, 2024. https://doi.org/10.1214/24-EJS2327