The Annals of Applied Statistics

Multivariate Bayesian semiparametric models for authentication of food and beverages

Luis Gutiérrez and Fernando A. Quintana

Full-text: Open access


Food and beverage authentication is the process by which foods or beverages are verified as complying with their label descriptions, for example, verifying whether the denomination of origin of a bottle of olive oil is correct or whether the variety of a certain bottle of wine matches its label description. The common way to approach an authentication process is to measure a number of attributes on samples of food and then use these as input for a classification problem. Our motivation stems from data consisting of measurements of nine chemical compounds known as anthocyanins, obtained from samples of Chilean red wines of the grape varieties Cabernet Sauvignon, Merlot and Carménère. We consider a model-based approach to authentication through a semiparametric multivariate hierarchical linear mixed model for the mean responses, with covariance matrices that are specific to the classification categories. Specifically, we propose a model of the ANOVA-DDP type, which takes advantage of the fact that the available covariates are discrete in nature. The results suggest that the model performs well compared with parametric alternatives. This is also corroborated by application to simulated data.
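To make the classification framing concrete, the following is a minimal sketch of a fully parametric baseline: a Gaussian classifier with class-specific mean vectors and covariance matrices, fitted to simulated data. It is not the ANOVA-DDP model proposed in the paper. The function names, the simulated nine-dimensional measurements, and the class means and scales are all assumptions made for illustration only.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate a mean vector, covariance matrix and prior for each class."""
    params = {}
    for label in np.unique(y):
        Xk = X[y == label]
        params[int(label)] = {
            "mean": Xk.mean(axis=0),
            "cov": np.cov(Xk, rowvar=False),      # class-specific covariance
            "prior": Xk.shape[0] / X.shape[0],    # empirical class frequency
        }
    return params

def log_gaussian_density(X, mean, cov):
    """Log density of a multivariate normal, evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of each row from the class mean.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def posterior_probabilities(X, params):
    """Class membership probabilities obtained from Bayes' rule."""
    labels = sorted(params)
    log_post = np.column_stack([
        np.log(params[k]["prior"])
        + log_gaussian_density(X, params[k]["mean"], params[k]["cov"])
        for k in labels
    ])
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return labels, post / post.sum(axis=1, keepdims=True)

# Simulated stand-in for nine measurements on three classes.
# These numbers are invented for illustration; they are not wine data.
rng = np.random.default_rng(0)
n_per_class, n_features = 60, 9
class_means = [0.0, 3.0, 6.0]    # hypothetical class means
class_scales = [1.0, 1.5, 0.5]   # hypothetical class-specific spreads
X = np.vstack([
    m + s * rng.standard_normal((n_per_class, n_features))
    for m, s in zip(class_means, class_scales)
])
y = np.repeat(np.arange(3), n_per_class)

params = fit_gaussian_classes(X, y)
labels, post = posterior_probabilities(X, params)
predicted = np.array(labels)[post.argmax(axis=1)]
accuracy = float((predicted == y).mean())  # in-sample, so optimistic
print(f"in-sample accuracy: {accuracy:.3f}")
```

Each row of `post` holds the estimated probability that a sample belongs to each class, and a sample is assigned to the class with the largest probability. The accuracy here is computed on the same data used for fitting, so it overstates out-of-sample performance; a held-out or cross-validated comparison would be needed to judge a classifier fairly.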

Article information

Ann. Appl. Stat., Volume 5, Number 4 (2011), 2385-2402.

First available in Project Euclid: 20 December 2011

Digital Object Identifier: doi:10.1214/11-AOAS492

Keywords

Classification; dependent Dirichlet process; wines


Gutiérrez, Luis; Quintana, Fernando A. Multivariate Bayesian semiparametric models for authentication of food and beverages. Ann. Appl. Stat. 5 (2011), no. 4, 2385–2402. doi:10.1214/11-AOAS492.


