December 2024 A latent variable mixture model for composition-on-composition regression with application to chemical recycling
Nicholas Rios, Lingzhou Xue, Xiang Zhan
Ann. Appl. Stat. 18(4): 3253-3273 (December 2024). DOI: 10.1214/24-AOAS1935

Abstract

It is quite common to encounter compositional data in a regression framework in data analysis. When both responses and predictors are compositional, most existing models rely on a family of log-ratio based transformations to move the analysis from the simplex to the reals. This often makes the interpretation of the model more complex. A transformation-free regression model was recently developed, but it only allows for a single compositional predictor. However, many datasets include multiple compositional predictors of interest. Motivated by an application to hydrothermal liquefaction (HTL) data, a novel extension of this transformation-free regression model is provided that allows for two (or more) compositional predictors to be used via a latent variable mixture. A modified expectation-maximization algorithm is proposed to estimate model parameters, which are shown to have natural interpretations. Conformal inference is used to obtain prediction limits on the compositional response. The resulting methodology is applied to the HTL dataset. Extensions to multiple predictors are discussed.
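As a rough illustration of what a transformation-free fit can look like, the sketch below regresses one composition on another without any log-ratio transformation. It is not the authors' algorithm. It assumes, beyond anything stated in the abstract, that the single-predictor model takes the form E[y | x] = Bᵀx, where each row of B lies on the simplex, and that B is estimated by an EM-style multiplicative update that reduces the Kullback–Leibler divergence between observed and fitted compositions. The function names `kl_divergence` and `fit_single_predictor` are hypothetical, and the latent variable mixture over several predictors is not attempted here.

```python
import numpy as np


def kl_divergence(Y, M, eps=1e-12):
    """Total KL divergence sum_i KL(y_i || m_i), treating 0 * log(0) as 0."""
    Y = np.asarray(Y, dtype=float)
    M = np.asarray(M, dtype=float)
    mask = Y > 0
    return float(np.sum(Y[mask] * np.log(Y[mask] / np.maximum(M[mask], eps))))


def fit_single_predictor(X, Y, n_iter=500, eps=1e-12):
    """EM-style fit of a row-stochastic B with fitted values X @ B.

    ASSUMPTION: the model form E[y | x] = B^T x is an illustrative guess
    at a transformation-free regression, not taken from the abstract.

    X: (n, p) array of predictor compositions (rows sum to 1).
    Y: (n, q) array of response compositions (rows sum to 1).
    Returns B with shape (p, q); each row is nonnegative and sums to 1.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p, q = X.shape[1], Y.shape[1]
    B = np.full((p, q), 1.0 / q)  # start every row at the uniform composition
    for _ in range(n_iter):
        fitted = X @ B                       # current fitted compositions, (n, q)
        ratio = Y / np.maximum(fitted, eps)  # observed over fitted, elementwise
        B = B * (X.T @ ratio)                # E-step weights folded into M-step
        B = B / B.sum(axis=1, keepdims=True)  # renormalize rows onto the simplex
    return B
```

Because every row of B is itself a composition, row j can be read as the expected response composition when the predictor puts all of its mass on component j, which is the kind of direct interpretation that log-ratio coefficients do not offer.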

Funding Statement

Nicholas Rios and Lingzhou Xue were supported in part by the National Science Foundation grants (DMS-2210775, CCF-2007823, DMS-1953189) and the National Institute of General Medical Sciences grant (1R01GM152812).
Xiang Zhan was supported in part by the National Natural Science Foundation of China (grant no. 12371287).

Acknowledgments

The authors would like to thank the referees, the Associate Editor, and the Editor for constructive comments that improved this article. Xiang Zhan is the corresponding author.

Citation

Nicholas Rios, Lingzhou Xue, Xiang Zhan. "A latent variable mixture model for composition-on-composition regression with application to chemical recycling." Ann. Appl. Stat. 18(4): 3253-3273, December 2024. https://doi.org/10.1214/24-AOAS1935

Information

Received: 1 August 2023; Revised: 1 June 2024; Published: December 2024
First available in Project Euclid: 31 October 2024

Digital Object Identifier: 10.1214/24-AOAS1935

Keywords: Compositional data, EM-algorithm, Kullback–Leibler distance, transformation-free

Rights: Copyright © 2024 Institute of Mathematical Statistics

Vol.18 • No. 4 • December 2024