February 2022 Tensor clustering with planted structures: Statistical optimality and computational limits
Yuetian Luo, Anru R. Zhang
Ann. Statist. 50(1): 584-613 (February 2022). DOI: 10.1214/21-AOS2123

Abstract

This paper studies the statistical and computational limits of high-order clustering with planted structures. We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC), and study the methods and theory for testing whether a cluster exists (detection) and identifying the support of the cluster (recovery).

Specifically, we identify the sharp boundaries of the signal-to-noise ratio for which CHC and ROHC detection/recovery are statistically possible. We also develop tight computational thresholds: when the signal-to-noise ratio is below these thresholds, we prove that polynomial-time algorithms cannot solve these problems under the computational hardness conjectures of hypergraphic planted clique (HPC) detection and hypergraphic planted dense subgraph (HPDS) recovery. We also propose polynomial-time tensor algorithms that achieve reliable detection and recovery when the signal-to-noise ratio is above these thresholds. Both sparsity and tensor structures yield computational barriers in high-order tensor clustering. The interplay between them results in significant differences between high-order tensor clustering and matrix clustering in the literature in terms of statistical and computational phase transition diagrams, algorithmic approaches, hardness conjectures, and proof techniques. To the best of our knowledge, we are the first to give a thorough characterization of the statistical and computational trade-off for such a double computational-barrier problem. Finally, we provide evidence for the computational hardness conjectures of HPC detection (via the low-degree polynomial and Metropolis methods) and HPDS recovery (via the low-degree polynomial method).

Funding Statement

This work was supported in part by NSF Grant CAREER-1944904, NSF Grants DMS-1811868 and DMS-2023239, NIH Grant R01 GM131399, and Wisconsin Alumni Research Foundation (WARF).

Acknowledgment

We would like to thank Guy Bresler for the helpful discussions. We also thank the Editor, Associate Editor, and two anonymous referees for their helpful suggestions, which helped improve the presentation and quality of this paper.

Citation

Yuetian Luo. Anru R. Zhang. "Tensor clustering with planted structures: Statistical optimality and computational limits." Ann. Statist. 50 (1) 584 - 613, February 2022. https://doi.org/10.1214/21-AOS2123

Information

Received: 1 August 2020; Revised: 1 May 2021; Published: February 2022
First available in Project Euclid: 16 February 2022

MathSciNet: MR4382029
zbMATH: 1486.62187
Digital Object Identifier: 10.1214/21-AOS2123

Subjects:
Primary: 62H15
Secondary: 62C20

Keywords: Average-case complexity, high-order clustering, hypergraphic planted clique, hypergraphic planted dense subgraph, statistical-computational phase transition

Rights: Copyright © 2022 Institute of Mathematical Statistics

JOURNAL ARTICLE
30 PAGES
