The Annals of Statistics

Robust Gaussian stochastic process emulation

Mengyang Gu, Xiaojing Wang, and James O. Berger

Abstract

We consider estimation of the parameters of a Gaussian Stochastic Process (GaSP), in the context of emulation (approximation) of computer models for which the outcomes are real-valued scalars. The main focus is on estimation of the GaSP parameters through various generalized maximum likelihood methods, mostly involving finding posterior modes; this is because full Bayesian analysis in computer model emulation is typically prohibitively expensive.

The posterior modes that are studied arise from objective priors, such as the reference prior. These priors have been studied in the literature for the situation of an isotropic covariance function or under the assumption of separability in the design of inputs for model runs used in the GaSP construction. In this paper, we consider more general designs (e.g., a Latin Hypercube Design) with a class of commonly used anisotropic correlation functions, which can be written as a product of isotropic correlation functions, each having an unknown range parameter and a fixed roughness parameter. We discuss properties of the objective priors and marginal likelihoods for the parameters of the GaSP and establish the posterior propriety of the GaSP parameters, but our main focus is to demonstrate that certain parameterizations result in more robust estimation of the GaSP parameters than others, and that some parameterizations that are in common use should clearly be avoided. These results are applicable to many frequently used covariance functions, for example, power exponential, Matérn, rational quadratic and spherical covariance. We also generalize the results to the GaSP model with a nugget parameter. Both theoretical and numerical evidence is presented concerning the performance of the studied procedures.
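The product-form correlation described above can be illustrated with a minimal sketch. It assumes a power exponential correlation in each input dimension, with one range parameter per input and a shared, fixed roughness parameter. The function names, the value alpha = 1.9, the toy design, and the way the nugget is added to the diagonal are all illustrative assumptions; none of them is taken from the paper or from the RobustGaSP package.

```python
import math

def power_exp_corr(d, gamma, alpha=1.9):
    """Isotropic power exponential correlation exp(-(|d| / gamma)^alpha).

    gamma is the range parameter; alpha is the roughness parameter,
    held fixed here (1.9 is an assumed value, not one from the paper).
    """
    return math.exp(-((abs(d) / gamma) ** alpha))

def product_corr(x, y, gammas, alpha=1.9):
    """Anisotropic correlation: product of one-dimensional correlations,
    one per input dimension, each with its own range parameter."""
    c = 1.0
    for x_l, y_l, gamma_l in zip(x, y, gammas):
        c *= power_exp_corr(x_l - y_l, gamma_l, alpha)
    return c

def corr_matrix(design, gammas, alpha=1.9, nugget=0.0):
    """Correlation matrix over the design points.

    The optional nugget is added to the diagonal; this is one common
    convention and is an assumption of this sketch.
    """
    n = len(design)
    return [
        [
            product_corr(design[i], design[j], gammas, alpha)
            + (nugget if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical three-point design in two input dimensions.
design = [(0.1, 0.8), (0.5, 0.2), (0.9, 0.5)]
R = corr_matrix(design, gammas=(0.5, 2.0))
R_nugget = corr_matrix(design, gammas=(0.5, 2.0), nugget=0.1)
```

A small range parameter in one dimension makes the correlation decay quickly along that input, and a large one makes the output nearly flat in that input. This is why the choice of parameterization for the range parameters matters when they are estimated.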

Article information

Source
Ann. Statist., Volume 46, Number 6A (2018), 3038-3066.

Dates
Received: October 2016
Revised: August 2017
First available in Project Euclid: 7 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1536307242

Digital Object Identifier
doi:10.1214/17-AOS1648

Mathematical Reviews number (MathSciNet)
MR3851764

Zentralblatt MATH identifier
06968608

Subjects
Primary: 62F35: Robustness and adaptive procedures
62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]

Keywords
Anisotropic covariance; emulation; Gaussian stochastic process; objective priors; posterior propriety; robust parameter estimation

Citation

Gu, Mengyang; Wang, Xiaojing; Berger, James O. Robust Gaussian stochastic process emulation. Ann. Statist. 46 (2018), no. 6A, 3038–3066. doi:10.1214/17-AOS1648. https://projecteuclid.org/euclid.aos/1536307242


Supplemental materials

  • Supplement to “Robust Gaussian stochastic process emulation”. This supplement consists of four parts: the proofs for Section 3.1, the proofs for Section 3.3, the proofs for Section 4.3, and the plot of the borehole function from Section 5.3.