## The Annals of Applied Statistics

- Ann. Appl. Stat.
- Volume 11, Number 3 (2017), 1617-1648.

### Dynamic mixtures of factor analyzers to characterize multivariate air pollutant exposures

Antonello Maruotti, Jan Bulla, Francesco Lagona, Marco Picone, and Francesca Martella

#### Abstract

The assessment of pollution exposure is based on the analysis of a multivariate time series that includes the concentrations of several pollutants as well as the measurements of multiple atmospheric variables. It typically requires methods of dimensionality reduction that are capable of identifying potentially dangerous combinations of pollutants and simultaneously segmenting exposure periods according to air quality conditions. When the data are high-dimensional, however, efficient dimensionality reduction is challenging because of the formidable structure of cross-correlations that arise from the dynamic interaction between weather conditions and natural/anthropogenic pollution sources. In order to assess pollution exposure in an urban area while taking the above-mentioned difficulties into account, we have developed a class of parsimonious hidden Markov models. In a multivariate time series setting, this approach performs temporal segmentation and dimensionality reduction simultaneously. We specifically approximate the distribution of multiple pollutant concentrations by mixtures of factor analysis models, whose parameters evolve according to a latent Markov chain. Covariates are included as predictors of the chain transition probabilities. Parameter constraints on the factorial component of the model are exploited to tune the flexibility of dimensionality reduction. In order to estimate the model parameters efficiently, we have proposed a novel three-step Alternating Expected Conditional Maximization (AECM) algorithm, which is also assessed in a simulation study. In the case study, the proposed methods could (1) describe the exposure to pollution in terms of a few latent regimes, (2) associate these regimes with specific combinations of pollutant concentration levels as well as distinct correlation structures between concentrations, and (3) capture the influence of weather conditions on transitions between regimes.
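The model structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not include the AECM estimation step. It only generates data from a hidden Markov chain whose states each carry their own factor analysis model. Several choices are assumptions made for illustration, since the abstract does not state them: transitions follow a multinomial logit in the covariates, factors and noise are Gaussian, the noise covariance is diagonal, and the initial state is drawn uniformly. All function and parameter names (`transition_probs`, `implied_covariance`, `simulate`, `beta`, `loadings`, `psi`) are hypothetical.

```python
import math
import random


def transition_probs(beta_row, x_t):
    """Softmax over destination states; beta_row holds one coefficient
    vector per destination state (assumed multinomial-logit form)."""
    logits = [sum(b * v for b, v in zip(coef, x_t)) for coef in beta_row]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def implied_covariance(loadings_k, psi_k):
    """State-specific covariance Lambda_k Lambda_k^T + diag(psi_k)."""
    p = len(loadings_k)
    return [
        [
            sum(a * b for a, b in zip(loadings_k[i], loadings_k[j]))
            + (psi_k[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]


def simulate(T, mu, loadings, psi, beta, x, rng):
    """Draw T observations y_t = mu_k + Lambda_k z_t + e_t, where k is the
    current hidden state and covariates x_t drive the state transitions."""
    K = len(mu)
    p = len(mu[0])
    q = len(loadings[0][0])
    states, observations = [], []
    state = rng.randrange(K)  # uniform initial state (an assumption)
    for t in range(T):
        if t > 0:
            probs = transition_probs(beta[state], x[t])
            state = rng.choices(range(K), weights=probs)[0]
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]  # latent factors
        y = [
            mu[state][i]
            + sum(loadings[state][i][j] * z[j] for j in range(q))
            + rng.gauss(0.0, math.sqrt(psi[state][i]))  # idiosyncratic noise
            for i in range(p)
        ]
        states.append(state)
        observations.append(y)
    return states, observations
```

With `K` states, `p` pollutants and `q < p` factors, each regime's covariance has the low-rank-plus-diagonal form returned by `implied_covariance`, which is where the dimensionality reduction comes from. The sequence returned in `states` corresponds to the temporal segmentation into regimes.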

#### Article information

**Source**

Ann. Appl. Stat., Volume 11, Number 3 (2017), 1617-1648.

**Dates**

Received: November 2016

Revised: March 2017

First available in Project Euclid: 5 October 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aoas/1507168842

**Digital Object Identifier**

doi:10.1214/17-AOAS1049

**Mathematical Reviews number (MathSciNet)**

MR3709572

**Zentralblatt MATH identifier**

1380.62265

**Keywords**

Hidden Markov models; AECM algorithm; dimensionality reduction; three-step algorithm

#### Citation

Maruotti, Antonello; Bulla, Jan; Lagona, Francesco; Picone, Marco; Martella, Francesca. Dynamic mixtures of factor analyzers to characterize multivariate air pollutant exposures. Ann. Appl. Stat. 11 (2017), no. 3, 1617-1648. doi:10.1214/17-AOAS1049. https://projecteuclid.org/euclid.aoas/1507168842