The Annals of Applied Statistics

Compositional mediation analysis for microbiome studies

Michael B. Sohn and Hongzhe Li

Motivated by recent advances in causal mediation analysis and problems in the analysis of microbiome data, we consider the setting where the effect of a treatment on an outcome is transmitted through perturbing the microbial communities or compositional mediators. The compositional and high-dimensional nature of such mediators makes the standard mediation analysis not directly applicable to our setting. We propose a sparse compositional mediation model that can be used to estimate the causal direct and indirect (or mediation) effects utilizing the algebra for compositional data in the simplex space. We also propose tests of total and component-wise mediation effects. We conduct extensive simulation studies to assess the performance of the proposed method and apply the method to a real microbiome dataset to investigate an effect of fat intake on body mass index mediated through the gut microbiome.
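The abstract refers to "the algebra for compositional data in the simplex space" without spelling it out. As a minimal sketch, assuming this means the standard Aitchison operations (closure, perturbation, and power transformation), the following shows how a treatment could shift a baseline composition while keeping it on the simplex. The function names, the three-part composition, and the numeric values are hypothetical illustrations, not the authors' model or data.

```python
def closure(x):
    """Rescale positive parts so they sum to 1 (a point on the simplex)."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])


def power(x, a):
    """Power transformation: raise each part to the scalar a, then close."""
    return closure([v ** a for v in x])


# Hypothetical 3-taxon example (values are assumptions for illustration).
baseline = closure([5.0, 3.0, 2.0])  # baseline composition
effect = closure([2.0, 1.0, 1.0])    # compositional effect per unit of treatment
treatment = 1.5                      # treatment level

# Simplex analogue of "intercept + slope * treatment".
shifted = perturb(baseline, power(effect, treatment))
print(shifted)
```

Perturbation plays the role of addition and the power transformation the role of scalar multiplication, so the result always remains a valid composition. Only the first taxon gains relative abundance here, and the ratio between the second and third parts is unchanged because the effect treats them equally.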

Article information

Ann. Appl. Stat., Volume 13, Number 1 (2019), 661-681.

Received: October 2016
Revised: June 2018
First available in Project Euclid: 10 April 2019

Keywords: Compositional algebra; 16S sequencing; causal mediation effect; simplex space


Sohn, Michael B.; Li, Hongzhe. Compositional mediation analysis for microbiome studies. Ann. Appl. Stat. 13 (2019), no. 1, 661–681. doi:10.1214/18-AOAS1210.



Supplemental materials

  • Supplement to “Compositional mediation analysis for microbiome studies”. The online Supplemental Materials include proofs of Theorem 1 and Proposition 1, a detailed computational algorithm for the covariance matrix of composition parameters, variance calculation for the indirect effects, an extension of the model to allow for interactions between a treatment and mediators, and a method for sensitivity analysis.