Nitrogen dioxide (NO₂) is a primary constituent of traffic-related air pollution and has well-established harmful environmental and human-health impacts. Knowledge of the spatiotemporal distribution of NO₂ is critical for exposure and risk assessment. A common approach for assessing air pollution exposure is linear regression involving spatially referenced covariates, known as land-use regression (LUR). We develop a scalable approach for simultaneous variable selection and estimation of LUR models with spatiotemporally correlated errors, by combining a general-Vecchia Gaussian-process approximation with a penalty on the LUR coefficients. In comparisons with existing methods on simulated data, our approach resulted in higher model-selection specificity and sensitivity and in better prediction in terms of calibration and sharpness, for a wide range of relevant settings. In our spatiotemporal analysis of daily, US-wide, ground-level NO₂ data, our approach was more accurate, and produced a sparser and more interpretable model. Our daily predictions elucidate spatiotemporal patterns of NO₂ concentrations across the United States, including significant variations between cities and intra-urban variation. Thus, our predictions will be useful for epidemiological and risk-assessment studies seeking daily, national-scale predictions, and they can be used in acute-outcome health-risk assessments.
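To make the idea of penalized regression with correlated errors concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a purely spatial exponential covariance with known parameters, an exact Cholesky factorization in place of the general-Vecchia approximation, and an L1 (lasso) penalty solved by coordinate descent; the abstract does not specify the penalty form. All function names and parameter values (`exp_cov`, `lasso_cd`, `penalized_gls`, `range_`, `nugget`, `lam`) are hypothetical.

```python
import numpy as np

def exp_cov(coords, variance=1.0, range_=0.3, nugget=0.1):
    """Exponential covariance matrix plus a nugget (assumed, known parameters)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return variance * np.exp(-d / range_) + nugget * np.eye(len(coords))

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]           # remove coordinate j from the fit
            rho = X[:, j] @ resid
            beta[j] = soft_threshold(rho, n * lam) / col_sq[j]
            resid -= X[:, j] * beta[j]           # add updated coordinate back
    return beta

def penalized_gls(X, y, cov, lam):
    """Whiten X and y with the Cholesky factor of cov, then fit the lasso."""
    L = np.linalg.cholesky(cov)
    Xw = np.linalg.solve(L, X)
    yw = np.linalg.solve(L, y)
    return lasso_cd(Xw, yw, lam)

# Synthetic demonstration: 10 candidate covariates, only the first two matter.
rng = np.random.default_rng(0)
n, p = 200, 10
coords = rng.uniform(size=(n, 2))
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
cov = exp_cov(coords)
errors = np.linalg.cholesky(cov) @ rng.normal(size=n)   # spatially correlated noise
y = X @ beta_true + errors

beta_hat = penalized_gls(X, y, cov, lam=0.1)
print(np.round(beta_hat, 2))
```

Whitening by the Cholesky factor turns the correlated-error problem into an ordinary penalized least-squares problem. The exact factorization costs on the order of n³ operations, which is the bottleneck that a sparse approximation such as Vecchia is meant to avoid at national, daily scale.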
Messier’s research was partially conducted while at Oregon State University, Department of Environmental and Molecular Toxicology, and supported by NIEHS K99 ES029523. Messier is currently supported by NIH institutes NIEHS/NTP and NIMHD as an intramural investigator. Katzfuss’ research was partially supported by National Science Foundation (NSF) Grants DMS-1654083 and DMS-1953005.
The authors would like to thank Shahzad Gani, Jianhua Huang, Irina Gaynanova, Anirban Bhattacharya and Joe Guinness for helpful comments and suggestions.
Simulations were run on computing resources at the Oregon State University Center for Genome Research and Biocomputing and the NIEHS/NTP Office of Data Science computing cluster.
"Scalable penalized spatiotemporal land-use regression for ground-level nitrogen dioxide." Ann. Appl. Stat. 15 (2) 688 - 710, June 2021. https://doi.org/10.1214/20-AOAS1422