Statistical Science

Modern Statistical Methods in Oceanography: A Hierarchical Perspective

Christopher K. Wikle, Ralph F. Milliff, Radu Herbei, and William B. Leeds

Processes in ocean physics, air–sea interaction and ocean biogeochemistry span enormous ranges in spatial and temporal scales, that is, from molecular to planetary and from seconds to millennia. Identifying and implementing sustainable human practices depend critically on our understanding of key aspects of ocean physics and ecology within these scale ranges. The set of all ocean data is distorted such that three- and four-dimensional (i.e., time-dependent) in situ data are very sparse, while observations of surface and upper ocean properties from space-borne platforms have become abundant in the past few decades. The precision of observations of all types varies as well. In the face of these challenges, the interface between Statistics and Oceanography has proven to be a fruitful area for research and the development of useful models. With the recognition of the key importance of identifying, quantifying and managing uncertainty in data and models of ocean processes, a hierarchical perspective has become increasingly productive. As examples, we review a heterogeneous mix of studies from our own work demonstrating Bayesian hierarchical model applications in ocean physics, air–sea interaction, ocean forecasting and ocean ecosystem models. This review is by no means exhaustive, but we have endeavored to identify hierarchical modeling work reported by others across the broad range of ocean-related topics reported in the statistical literature. We conclude by noting relevant ocean-statistics problems on the immediate research horizon, and some technical challenges they pose, for example, in terms of nonlinearity, dimensionality and computing.

Article information

Statist. Sci., Volume 28, Number 4 (2013), 466–486.

First available in Project Euclid: 3 December 2013

Keywords: Bayesian, biogeochemical, ecosystem, ocean vector winds, quadratic nonlinearity, spatio-temporal, state–space, sea surface temperature


Wikle, Christopher K.; Milliff, Ralph F.; Herbei, Radu; Leeds, William B. Modern Statistical Methods in Oceanography: A Hierarchical Perspective. Statist. Sci. 28 (2013), no. 4, 466–486. doi:10.1214/13-STS436.
