## The Annals of Applied Statistics

### Standardization of multivariate Gaussian mixture models and background adjustment of PET images in brain oncology

#### Abstract

In brain oncology, it is routine to evaluate the progress or remission of the disease based on the differences between a pre-treatment and a post-treatment Positron Emission Tomography (PET) scan. Background adjustment is necessary to reduce confounding by tissue-dependent changes not related to the disease. When modeling the voxel intensities for the two scans as a bivariate Gaussian mixture, background adjustment translates into standardizing the mixture at each voxel, while tumor lesions present themselves as outliers to be detected. In this paper, we address the question of how to standardize the mixture to a standard multivariate normal distribution, so that the outliers (i.e., tumor lesions) can be detected using a statistical test. We show theoretically and numerically that the tail distribution of the standardized scores is favorably close to standard normal in a wide range of scenarios while being conservative at the tails, validating voxelwise hypothesis testing based on standardized scores. To address standardization in spatially heterogeneous image data, we propose a spatial and robust multivariate expectation-maximization (EM) algorithm, where prior class membership probabilities are provided by transformation of spatial probability template maps and the estimates of the class means and covariances are robust to outliers. Simulations in both univariate and bivariate cases suggest that standardized scores with soft assignment have tail probabilities that are either very close to or more conservative than standard normal. The proposed methods are applied to a real data set from a PET phantom experiment, yet they are generic and can be used in other contexts.
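As a rough illustration of what standardizing a Gaussian mixture with soft assignment could look like in the univariate case, the sketch below averages the per-class z-scores of an observation, weighting each by the posterior probability that the observation belongs to that class. This is one plausible reading of the abstract, not the authors' implementation: the function name `soft_standardized_score`, the weighting formula, and the example parameters are all assumptions made for illustration, and the spatial priors and robust estimation described above are not modeled.

```python
import math


def soft_standardized_score(y, weights, means, sds):
    """Posterior-weighted average of per-class z-scores (assumed form).

    weights: class probabilities, all strictly positive and summing to 1
    means, sds: per-class Gaussian means and standard deviations
    """
    # Log of (class weight x Gaussian density), dropping the shared
    # 1/sqrt(2*pi) constant, which cancels in the normalization.
    log_terms = [
        math.log(w) - math.log(s) - 0.5 * ((y - m) / s) ** 2
        for w, m, s in zip(weights, means, sds)
    ]
    # Subtract the maximum before exponentiating to avoid underflow
    # for observations far from every class.
    top = max(log_terms)
    unnormalized = [math.exp(t - top) for t in log_terms]
    total = sum(unnormalized)
    posteriors = [u / total for u in unnormalized]

    # Soft assignment: mix the per-class z-scores by posterior probability.
    return sum(p * (y - m) / s for p, m, s in zip(posteriors, means, sds))


if __name__ == "__main__":
    # Hypothetical two-class background: 60% N(0, 1) and 40% N(5, 2^2).
    weights, means, sds = [0.6, 0.4], [0.0, 5.0], [1.0, 2.0]
    for y in (0.0, 5.0, 20.0):
        z = soft_standardized_score(y, weights, means, sds)
        print(f"y = {y:5.1f}  ->  score = {z:.3f}")
```

Under this reading, an observation near the center of a background class gets a score near zero, while an observation far from every class keeps a large score and would be flagged by comparing it against standard normal tail probabilities.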

#### Article information

Source
Ann. Appl. Stat., Volume 12, Number 4 (2018), 2197-2227.

Dates
Revised: January 2018
First available in Project Euclid: 13 November 2018

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1542078042

Digital Object Identifier
doi:10.1214/18-AOAS1149

Mathematical Reviews number (MathSciNet)
MR3875698

#### Citation

Li, Meng; Schwartzman, Armin. Standardization of multivariate Gaussian mixture models and background adjustment of PET images in brain oncology. Ann. Appl. Stat. 12 (2018), no. 4, 2197–2227. doi:10.1214/18-AOAS1149. https://projecteuclid.org/euclid.aoas/1542078042

#### References

• Ashburner, J. (2012). SPM: A history. NeuroImage 62 791–800.
• Ashburner, J. and Friston, K. J. (2005). Unified segmentation. NeuroImage 26 839–851.
• Bai, B., Bading, J. and Conti, P. S. (2013). Tumor quantification in clinical positron emission tomography. Theranostics 3 787–801.
• Besag, J. (1986). On the statistical analysis of dirty pictures. J. Roy. Statist. Soc. Ser. B 48 259–302.
• Borghammer, P., Aanerud, J. and Gjedde, A. (2009). Data-driven intensity normalization of PET group comparison studies is superior to global mean normalization. NeuroImage 46 981–988.
• Campbell, N. A. (1984). Mixture models and atypical values. Math. Geol. 16 465–477.
• Chen, J. L., Gunn, S. R., Nixon, M. S. and Gunn, R. N. (2001). Markov random field models for segmentation of PET images. In Biennial International Conference on Information Processing in Medical Imaging 468–474. Springer, Berlin.
• Dasgupta, A., Hopcroft, J., Kleinberg, J. and Sandler, M. (2005). On learning mixtures of heavy-tailed distributions. In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05) 491–500. IEEE, New York.
• Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1–38.
• Devlin, S. J., Gnanadesikan, R. and Kettenring, J. R. (1981). Robust estimation of dispersion matrices and principal components. J. Amer. Statist. Assoc. 76 354–362.
• Figueiredo, M. A. T. and Jain, A. K. (2002). Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 24 381–396.
• Guo, M., Yap, J. T., Van den Abbeele, A. D., Lin, N. U. and Schwartzman, A. (2014). Voxelwise single-subject analysis of imaging metabolic response to therapy in neuro-oncology. Stat 3 172–186.
• Gupta, M. R. and Chen, Y. (2011). Theory and Use of the EM Algorithm. Now Publishers Inc.
• Hanson, T. E. (2006). Inference for mixtures of finite Polya tree models. J. Amer. Statist. Assoc. 101 1548–1565.
• Hoffman, E. J., Cutler, P. D., Guerrero, T. M., Digby, W. M. and Mazziotta, J. C. (1991). Assessment of accuracy of PET utilizing a 3-D phantom to simulate the activity distribution of [18F] fluorodeoxyglucose uptake in the human brain. J. Cereb. Blood Flow Metab. 11 A17–A25.
• Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Stat. 35 73–101.
• Leahy, R. M. and Qi, J. (2000). Statistical approaches in quantitative positron emission tomography. Stat. Comput. 10 147–165.
• Lee, Y.-Y., Choi, C. H., Kim, C. J., Kang, H., Kim, T.-J., Lee, J.-W., Lee, J.-H., Bae, D.-S. and Kim, B.-G. (2009). The prognostic significance of the SUVmax (maximum standardized uptake value for F-18 fluorodeoxyglucose) of the cervical tumor in PET imaging for early cervical cancer: Preliminary results. Gynecol. Oncol. 115 65–68.
• Li, M. and Schwartzman, A. (2018). Supplement to “Standardization of multivariate Gaussian mixture models and background adjustment of PET images in brain oncology.” DOI:10.1214/18-AOAS1149SUPP.
• Lin, T. I., Lee, J. C. and Yen, S. Y. (2007). Finite mixture modelling using the skew normal distribution. Statist. Sinica 17 909–927.
• Lo, Y., Mendell, N. R. and Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika 88 767–778.
• Maronna, R. A. (1976). Robust $M$-estimators of multivariate location and scatter. Ann. Statist. 4 51–67.
• Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006). Robust Statistics: Theory and Methods. Wiley, Chichester.
• McLachlan, G. J. and Basford, K. E. (1988). Mixture Models: Inference and Applications to Clustering. Statistics: Textbooks and Monographs 84. Dekker, New York.
• McLachlan, G. J. and Krishnan, T. (2008). The EM Algorithm and Extensions, 2nd ed. Wiley, Hoboken, NJ.
• McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley-Interscience, New York.
• Nguyen, T. M. and Wu, Q. M. J. (2012). Gaussian-mixture-model-based spatial neighborhood relationships for pixel labeling problem. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 42 193–202.
• O’Sullivan, F., Muzi, M., Mankoff, D. A., Eary, J. F., Spence, A. M. and Krohn, K. A. (2014). Voxel-level mapping of tracer kinetics in PET studies: A statistical approach emphasizing tissue life tables. Ann. Appl. Stat. 8 1065–1094.
• Peel, D. and McLachlan, G. J. (2000). Robust mixture modelling using the t distribution. Stat. Comput. 10 339–348.
• Qin, Y. and Priebe, C. E. (2013). Maximum ${\mathrm L}q$-likelihood estimation via the expectation-maximization algorithm: A robust estimation of mixture models. J. Amer. Statist. Assoc. 108 914–928.
• Qin, L., Schwartzman, A., McCall, K., Kachouie, N. N. and Yap, J. T. (2017). Method for detecting voxelwise changes in fluorodeoxyglucose-positron emission tomography brain images via background adjustment in cancer clinical trials. J. Med. Imag. 4 024006.
• Redner, R. A. and Walker, H. F. (1984). Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26 195–239.
• Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components. J. Roy. Statist. Soc. Ser. B 59 731–792.
• Sanjay-Gopal, S. and Hebert, T. (1998). Bayesian pixel classification using spatially variant finite mixtures and the generalized EM algorithm. IEEE Trans. Image Process. 7 1014–1028.
• Soffientini, C. D., De Bernardi, E., Zito, F., Castellani, M. and Baselli, G. (2016). Background based Gaussian mixture model lesion segmentation in PET. Med. Phys. 43 2662–2675.
• Soret, M., Bacharach, S. L. and Buvat, I. (2007). Partial-volume effect in PET tumor imaging. J. Nucl. Med. 48 932–945.
• Stephens, M. (2000). Bayesian analysis of mixture models with an unknown number of components—An alternative to reversible jump methods. Ann. Statist. 28 40–74.
• Takeda, A., Yokosuka, N., Ohashi, T., Kunieda, E., Fujii, H., Aoki, Y., Sanuki, N., Koike, N. and Ozawa, Y. (2011). The maximum standardized uptake value (SUVmax) on FDG-PET is a strong predictor of local recurrence for localized non-small-cell lung cancer after stereotactic body radiotherapy (SBRT). Radiother. Oncol. 101 291–297.
• Valk, P. E., Bailey, D. L., Townsend, D. W. and Maisey, M. N. (2003). Positron Emission Tomography: Basic Science and Clinical Practice. Springer, London.
• Vehtari, A. and Ojanen, J. (2012). A survey of Bayesian predictive methods for model assessment, selection and comparison. Stat. Surv. 6 142–228.
• Venturini, S., Dominici, F. and Parmigiani, G. (2008). Gamma shape mixtures for heavy-tailed distributions. Ann. Appl. Stat. 2 756–776.
• Vlassis, N. and Likas, A. (1999). A kurtosis-based dynamic approach to Gaussian mixture modeling. IEEE Trans. Syst. Man Cybern., Part A, Syst. Humans 29 393–399.
• Wahl, R. L., Jacene, H., Kasamon, Y. and Lodge, M. A. (2009). From RECIST to PERCIST: Evolving considerations for PET response criteria in solid tumors. J. Nucl. Med. 50 122S–150S.
• Young, H., Baum, R., Cremerius, U., Herholz, K., Hoekstra, O., Lammertsma, A. A., Pruim, J., Price, P. et al. (1999). Measurement of clinical and subclinical tumour response using [18F]-fluorodeoxyglucose and positron emission tomography: Review and 1999 EORTC recommendations. Eur. J. Cancer 35 1773–1782.
• Zasadny, K. R. and Wahl, R. L. (1993). Standardized uptake values of normal tissues at PET with 2-[fluorine-18]-fluoro-2-deoxy-D-glucose: Variations with body weight and a method for correction. Radiology 189 847–850.
• Zhang, J., Modestino, J. W. and Langan, D. A. (1994). Maximum-likelihood parameter estimation for unsupervised stochastic model-based image segmentation. IEEE Trans. Image Process. 3 404–420.

#### Supplemental materials

• Supplement: Additional material. The Supplementary material contains: (A) proofs of all theorems and lemmas in the main paper; (B) a simulation study to compare the proposed robust EM with the multivariate $t$ mixtures method [Peel and McLachlan (2000)] in a nonspatial setting; (C) additional simulation studies of the proposed RB-SGMM approach when lesions have smaller sizes and are noncircular.