The Annals of Applied Statistics

Practical large-scale spatio-temporal modeling of particulate matter concentrations

Christopher J. Paciorek, Jeff D. Yanosky, Robin C. Puett, Francine Laden, and Helen H. Suh

Full-text: Open access


The last two decades have seen intense scientific and regulatory interest in the health effects of particulate matter (PM). Influential epidemiological studies that characterize chronic exposure of individuals rely on monitoring data that are sparse in space and time, so they often assign the same exposure to participants in large geographic areas and across time. We estimate monthly PM concentrations during 1988–2002 in a large spatial domain for use in studying health effects in the Nurses’ Health Study. We develop a conceptually simple spatio-temporal model that uses a rich set of covariates. The model is used to estimate concentrations of PM10 for the full time period and PM2.5 for a subset of the period. For the earlier part of the period, 1988–1998, few PM2.5 monitors were operating, so we develop a simple extension to the model that represents PM2.5 conditionally on PM10 model predictions. In the epidemiological analysis, model predictions of PM10 are more strongly associated with health effects than are exposure estimates from simpler approaches.

Our modeling approach supports the application by estimating both fine-scale and large-scale spatial heterogeneity and by capturing space–time interaction through monthly-varying spatial surfaces. At the same time, the model is computationally feasible, implementable with standard software, and readily understandable to the scientific audience. Despite its simplifying assumptions, the model has good predictive performance and uncertainty characterization.
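
As a rough illustration of the structure described above, a model with covariate effects and monthly-varying spatial surfaces can be written in generic geoadditive form. This is only a sketch under stated assumptions, not the specification used in the paper: the symbols $y_{i,t}$, $f_k$, $g_t$, $h$, and the error terms are notation introduced here for illustration, and the abstract does not state the response transformation, the covariates, or the form of the PM2.5 extension.

```latex
% Illustrative notation only (assumed, not taken from the paper):
%   y^{(10)}_{i,t} : (possibly transformed) PM10 at location s_i in month t
%   f_k            : smooth effect of the k-th covariate x_{k,i,t}
%   g_t            : spatial surface specific to month t (space-time interaction)
\[
  y^{(10)}_{i,t} \;=\; \mu \;+\; \sum_{k} f_k\!\left(x_{k,i,t}\right)
                 \;+\; g_t\!\left(s_i\right) \;+\; \varepsilon_{i,t}
\]
% One possible reading of "PM2.5 conditionally on PM10 model predictions":
% the PM10 prediction \hat{y}^{(10)}_{i,t} enters the PM2.5 model through
% an assumed link h, alongside its own covariate effects \tilde{f}_k.
\[
  y^{(2.5)}_{i,t} \;=\; h\!\left(\hat{y}^{(10)}_{i,t}\right)
                  \;+\; \sum_{k} \tilde{f}_k\!\left(x_{k,i,t}\right)
                  \;+\; \tilde{\varepsilon}_{i,t}
\]
```

In this reading, the month-specific surfaces $g_t$ carry the space–time interaction, while the covariate terms account for fine-scale spatial heterogeneity.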

Article information

Ann. Appl. Stat. Volume 3, Number 1 (2009), 370–397.

First available in Project Euclid: 16 April 2009

Digital Object Identifier

doi:10.1214/08-AOAS204

Keywords

Additive model; air pollution; epidemiology; geoadditive model; smoothing; kriging; backfitting; stochastic EM


Paciorek, Christopher J.; Yanosky, Jeff D.; Puett, Robin C.; Laden, Francine; Suh, Helen H. Practical large-scale spatio-temporal modeling of particulate matter concentrations. Ann. Appl. Stat. 3 (2009), no. 1, 370–397. doi:10.1214/08-AOAS204.



