Electronic Journal of Statistics

Supervised multiway factorization

Eric F. Lock and Gen Li

Full-text: Open access


We describe SupCP, a probabilistic PARAFAC/CANDECOMP (CP) factorization for multiway (i.e., tensor) data that incorporates auxiliary covariates. SupCP generalizes the supervised singular value decomposition (SupSVD) for vector-valued observations to observations that take the form of a matrix or higher-order array. Such data are increasingly encountered in biomedical research and other fields. We use a novel likelihood-based latent variable representation of the CP factorization, in which the latent variables are informed by additional covariates. We give conditions for identifiability and develop an EM algorithm for simultaneous estimation of all model parameters. SupCP can be used for dimension reduction, capturing latent structures that are more accurate and interpretable due to covariate supervision. Moreover, SupCP specifies a full probability distribution for a multiway data observation with given covariate values, which can be used for predictive modeling. We conduct comprehensive simulations to evaluate the SupCP algorithm. We apply it to a facial image database with facial descriptors (e.g., smiling / not smiling) as covariates, and to a study of amino acid fluorescence. Software is available at https://github.com/lockEF/SupCP.
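
As a rough illustration of the kind of model the abstract describes, the sketch below simulates three-way data from a covariate-informed CP structure and computes the mean of a new observation given its covariates. It is not the authors' software (see the repository linked above) and it does not implement the EM algorithm. The model form assumed here (sample scores U = YB + F, data X equal to the CP product of U with two loading matrices plus independent Gaussian noise), the function names `simulate_supcp` and `conditional_mean`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def simulate_supcp(n=50, d1=8, d2=6, q=3, r=2, sigma_e=0.1, sigma_f=0.5, seed=0):
    """Simulate n observations, each a d1 x d2 matrix, from an assumed
    covariate-informed rank-r CP model (illustrative only)."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(size=(n, q))             # covariates, one row per sample
    B = rng.normal(size=(q, r))             # maps covariates to latent scores
    F = sigma_f * rng.normal(size=(n, r))   # score variation not explained by Y
    U = Y @ B + F                           # latent sample scores
    V1 = rng.normal(size=(d1, r))           # loadings for the first data mode
    V2 = rng.normal(size=(d2, r))           # loadings for the second data mode
    # CP structure: X[i, j, k] = sum_r U[i, r] * V1[j, r] * V2[k, r] + noise
    signal = np.einsum("ir,jr,kr->ijk", U, V1, V2)
    X = signal + sigma_e * rng.normal(size=(n, d1, d2))
    return X, Y, B, U, V1, V2


def conditional_mean(Y_new, B, V1, V2):
    """Mean of each observation given its covariates under the assumed model
    (the latent term F and the noise both have mean zero)."""
    return np.einsum("ir,jr,kr->ijk", Y_new @ B, V1, V2)


if __name__ == "__main__":
    X, Y, B, U, V1, V2 = simulate_supcp()
    X_hat = conditional_mean(Y, B, V1, V2)
    print(X.shape, X_hat.shape)
```

In this sketch the covariates enter only through the sample scores, so setting `sigma_f` and `sigma_e` to zero makes the data coincide exactly with the covariate-driven mean.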

Article information

Electron. J. Statist., Volume 12, Number 1 (2018), 1150-1180.

Received: April 2017
First available in Project Euclid: 27 March 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1522116042

Digital Object Identifier
doi:10.1214/18-EJS1421


Keywords: Faces in the wild; dimension reduction; latent variables; PARAFAC/CANDECOMP; singular value decomposition; tensors

Creative Commons Attribution 4.0 International License.


Lock, Eric F.; Li, Gen. Supervised multiway factorization. Electron. J. Statist. 12 (2018), no. 1, 1150--1180. doi:10.1214/18-EJS1421. https://projecteuclid.org/euclid.ejs/1522116042


