December 2024
Background modeling for double Higgs boson production: Density ratios and optimal transport
Tudor Manole, Patrick Bryant, John Alison, Mikael Kuusela, Larry Wasserman
Ann. Appl. Stat. 18(4): 2950-2978 (December 2024). DOI: 10.1214/24-AOAS1916

Abstract

We study the problem of data-driven background estimation, arising in the search for physics signals predicted by the Standard Model at the Large Hadron Collider. Our work is motivated by the search for the production of pairs of Higgs bosons decaying into four bottom quarks. A number of other physical processes, known as background, also share the same final state. The data arising in this problem is, therefore, a mixture of unlabeled background and signal events, and the primary aim of the analysis is to determine whether the proportion of unlabeled signal events is nonzero. A challenging but necessary first step is to estimate the distribution of background events. Past work in this area has determined regions of the space of collider events where signal is unlikely to appear and where the background distribution is, therefore, identifiable. The background distribution can be estimated in these regions and extrapolated into the region of primary interest using transfer learning with a multivariate classifier. We build upon this existing approach in two ways. First, we revisit this method by developing a customized residual neural network which is tailored to the structure and symmetries of collider data. Second, we develop a new method for background estimation, based on the optimal transport problem, which relies on modeling assumptions distinct from earlier work. These two methods can serve as cross-checks for each other in particle physics analyses, due to the complementarity of their underlying assumptions. We compare their performance on simulated double Higgs boson data.

Funding Statement

This work was supported in part by NSF Grants PHY-2020295, DMS-2053804, and DMS-2310632.
TM was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a PGS D Scholarship.

Acknowledgments

We are grateful to the CMU Statistical Methods for the Physical Sciences (STAMPS) research group for insightful discussions and feedback throughout this work.

Citation

Tudor Manole, Patrick Bryant, John Alison, Mikael Kuusela, Larry Wasserman. "Background modeling for double Higgs boson production: Density ratios and optimal transport." Ann. Appl. Stat. 18(4): 2950-2978, December 2024. https://doi.org/10.1214/24-AOAS1916

Information

Received: 1 March 2023; Revised: 1 April 2024; Published: December 2024
First available in Project Euclid: 31 October 2024

Digital Object Identifier: 10.1214/24-AOAS1916

Keywords: domain adaptation, high energy physics, Large Hadron Collider, optimal transport map, residual neural network, transfer learning, Wasserstein distance

Rights: Copyright © 2024 Institute of Mathematical Statistics
