June 2022
Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation
Boya Zhang, Robert B. Gramacy, Leah R. Johnson, Kenneth A. Rose, Eric Smith
Ann. Appl. Stat. 16(2): 816-842 (June 2022). DOI: 10.1214/21-AOAS1521

Abstract

Delta smelt is an endangered fish species in the San Francisco estuary that has shown an overall population decline over the past 30 years. Researchers have developed a stochastic, agent-based simulator to virtualize the system with the goal of understanding the relative contribution of natural and anthropogenic factors that might play a role in its decline. However, the input configuration space is high dimensional, running the simulator is time-consuming, and its noisy outputs change nonlinearly in both mean and variance. Getting enough runs to effectively learn input–output dynamics requires both a nimble modeling strategy and parallel evaluation. Recent advances in heteroskedastic Gaussian process (HetGP) surrogate modeling help, but little is known about how to appropriately plan experiments for highly distributed simulation. We propose a batch sequential design scheme, generalizing one-at-a-time variance-based active learning for HetGP, as a means of keeping multicore cluster nodes fully engaged with runs. Our acquisition strategy is carefully engineered to favor selection of replicates, which boost statistical and computational efficiency when training surrogates to isolate signal from noise. Design and modeling are illustrated on a range of toy examples before embarking on a large-scale smelt simulation campaign and downstream high-fidelity input sensitivity analysis.

Funding Statement

Authors BZ and RBG gratefully acknowledge funding from a DOE LAB 17-1697 via subaward from Argonne National Laboratory for SciDAC/DOE Office of Science ASCR and High Energy Physics.
RBG recognizes partial support from NSF Grant DMS-1821258.
LRJ recognizes partial support from NSF Grant DMS/DEB-1750113.
This work was partly performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 with IM release number LLNL-JRNL-815553.

Acknowledgments

We gratefully acknowledge computing support from Virginia Tech’s Advanced Research Computing (ARC) facility. We thank Xinwei Deng, Dave Higdon and Leanna House (Virginia Tech) for valuable insights and suggestions.

Boya Zhang’s current affiliation is the Engineering Division, Lawrence Livermore National Laboratory.

Citation

Boya Zhang, Robert B. Gramacy, Leah R. Johnson, Kenneth A. Rose, Eric Smith. "Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation." Ann. Appl. Stat. 16(2): 816-842, June 2022. https://doi.org/10.1214/21-AOAS1521

Information

Received: 1 October 2020; Revised: 1 July 2021; Published: June 2022
First available in Project Euclid: 13 June 2022

MathSciNet: MR4438813
zbMATH: 1498.62298
Digital Object Identifier: 10.1214/21-AOAS1521

Keywords: Active learning, agent-based model, Gaussian process surrogate modeling, input-dependent noise, replication, sensitivity analysis

Rights: Copyright © 2022 Institute of Mathematical Statistics

Vol. 16 • No. 2 • June 2022