The Annals of Applied Statistics

A new multivariate measurement error model with zero-inflated dietary data, and its application to dietary assessment

Saijuan Zhang, Douglas Midthune, Patricia M. Guenther, Susan M. Krebs-Smith, Victor Kipnis, Kevin W. Dodd, Dennis W. Buckman, Janet A. Tooze, Laurence Freedman, and Raymond J. Carroll

Full-text: Open access


In the United States the preferred method of obtaining dietary intake data is the 24-hour dietary recall, yet the measure of most interest is usual or long-term average daily intake, which is impossible to measure. Thus, usual dietary intake is assessed with considerable measurement error. Also, diet represents numerous foods, nutrients and other components, each of which has distinctive attributes. Sometimes, it is useful to examine intake of these components separately, but increasingly nutritionists are interested in exploring them collectively to capture overall dietary patterns. Consumption of these components varies widely: some are consumed by almost everyone on every day, while others are episodically consumed so that 24-hour recall data are zero-inflated. In addition, they are often correlated with each other. Finally, it is often preferable to analyze the amount of a dietary component relative to the amount of energy (calories) in a diet because dietary recommendations often vary with energy level. The quest to understand overall dietary patterns of usual intake has to this point reached a standstill. There are no statistical methods or models available to model such complex multivariate data with their measurement error and zero inflation. This paper proposes the first such model and the first workable method for fitting it. After describing the model, we use survey-weighted MCMC computations to fit the model, with uncertainty estimation coming from balanced repeated replication. The methodology is illustrated through an application to estimating the population distribution of the Healthy Eating Index-2005 (HEI-2005), a multi-component dietary quality index involving ratios of interrelated dietary components to energy, among children aged 2–8 in the United States. We pose a number of interesting questions about the HEI-2005 and provide answers that were not previously within the realm of possibility, and we indicate ways that our approach can be used to answer other questions of importance to nutritional science and public health.

Article information

Ann. Appl. Stat., Volume 5, Number 2B (2011), 1456-1487.

First available in Project Euclid: 13 July 2011

Keywords: Bayesian methods; dietary assessment; latent variables; measurement error; mixed models; nutritional epidemiology; nutritional surveillance; zero-inflated data


Zhang, Saijuan; Midthune, Douglas; Guenther, Patricia M.; Krebs-Smith, Susan M.; Kipnis, Victor; Dodd, Kevin W.; Buckman, Dennis W.; Tooze, Janet A.; Freedman, Laurence; Carroll, Raymond J. A new multivariate measurement error model with zero-inflated dietary data, and its application to dietary assessment. Ann. Appl. Stat. 5 (2011), no. 2B, 1456–1487. doi:10.1214/10-AOAS446.



