## The Annals of Statistics

- Ann. Statist.
- Volume 45, Number 1 (2017), 1-38.

### Tensor decompositions and sparse log-linear models

James E. Johndrow, Anirban Bhattacharya, and David B. Dunson

#### Abstract

Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
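As a rough illustration of the reduced rank factorization mentioned in the abstract, the sketch below builds the probability mass function of a small latent class model in the standard PARAFAC form, $\pi_{c_1 \cdots c_p} = \sum_{h=1}^{k} \nu_h \prod_{j=1}^{p} \lambda^{(j)}_{h c_j}$. The function name `parafac_pmf`, the argument names `nu` and `lam`, and all numeric values are assumptions chosen for illustration; this is not the collapsed Tucker decomposition proposed in the paper, nor the authors' code.

```python
from itertools import product


def parafac_pmf(nu, lam):
    """Joint pmf of p categorical variables under a latent class model.

    nu:  list of k latent class weights that sum to 1 (hypothetical name).
    lam: lam[h][j][c] = P(y_j = c | latent class h) (hypothetical name).
    Returns a dict mapping each cell (c_1, ..., c_p) to its probability.
    """
    k = len(nu)
    p = len(lam[0])
    levels = [len(lam[0][j]) for j in range(p)]
    pmf = {}
    for cell in product(*(range(d) for d in levels)):
        total = 0.0
        for h in range(k):
            # Within a class the variables are independent, so each
            # class contributes one rank-one term to the tensor.
            term = nu[h]
            for j, c in enumerate(cell):
                term *= lam[h][j][c]
            total += term
        pmf[cell] = total
    return pmf


# Illustrative values only: k = 2 classes, p = 3 binary variables.
nu = [0.3, 0.7]
lam = [
    [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]],  # class 0
    [[0.2, 0.8], [0.3, 0.7], [0.5, 0.5]],  # class 1
]
pmf = parafac_pmf(nu, lam)
```

Because the tensor is a sum of `k` nonnegative rank-one terms, its nonnegative rank is at most `k`. The model uses `k - 1 + k * sum(d_j - 1)` free parameters (here 7), while a saturated log-linear model needs one fewer than the number of cells (also 7 for this $2^3$ table). The reduction only becomes visible as `p` grows.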

#### Article information

**Source**

Ann. Statist. Volume 45, Number 1 (2017), 1-38.

**Dates**

Received: April 2014

Revised: November 2015

First available in Project Euclid: 21 February 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1487667616

**Digital Object Identifier**

doi:10.1214/15-AOS1414

**Zentralblatt MATH identifier**

1367.62180

**Subjects**

Primary: 62F15: Bayesian inference

**Keywords**

Bayesian; categorical data; contingency table; latent class analysis; graphical model; high-dimensional; low rank; Parafac; Tucker; sparsity

#### Citation

Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B. Tensor decompositions and sparse log-linear models. Ann. Statist. 45 (2017), no. 1, 1-38. doi:10.1214/15-AOS1414. https://projecteuclid.org/euclid.aos/1487667616

#### Supplemental materials

- Supplement to: “Tensor decompositions and sparse log-linear models”. We provide a supplement with three parts. The first part provides a proof of Remark 3.4 and a constructive proof of a bound on nonnegative rank for $d^{2}$ tensors corresponding to sparse log-linear models. The second part provides an MCMC algorithm for posterior computation in c-Tucker models, and the third part provides supplementary figures and tables for Section 5.
  Digital Object Identifier: doi:10.1214/15-AOS1414SUPP