Abstract
We study tensor-on-tensor regression, where the goal is to connect tensor responses to tensor covariates through a low-Tucker-rank parameter tensor/matrix without prior knowledge of its intrinsic rank. We propose the Riemannian gradient descent (RGD) and Riemannian Gauss–Newton (RGN) methods and cope with the challenge of unknown rank by studying the effect of rank over-parameterization. We provide the first convergence guarantee for general tensor-on-tensor regression by showing that RGD and RGN converge linearly and quadratically, respectively, to a statistically optimal estimate in both rank correctly-parameterized and over-parameterized settings. Our theory reveals an intriguing phenomenon: Riemannian optimization methods naturally adapt to over-parameterization without modifications to their implementation. We also prove a statistical-computational gap in scalar-on-tensor regression by a direct low-degree polynomial argument. Our theory demonstrates a “blessing of statistical-computational gap” phenomenon: in a wide range of scenarios in tensor-on-tensor regression for tensors of order three or higher, the computationally required sample size matches the sample size needed under moderate rank over-parameterization when one restricts to computationally feasible estimators, while there are no such benefits in the matrix settings. This shows that moderate rank over-parameterization is essentially “cost-free” in terms of sample size in tensor-on-tensor regression of order three or higher. Finally, we conduct simulation studies to show the advantages of our proposed methods and to corroborate our theoretical findings.
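To make the setting concrete, the following self-contained NumPy sketch illustrates low-Tucker-rank scalar-on-tensor regression solved by gradient descent with a truncated higher-order SVD (HOSVD) retraction. This is a simplified stand-in for the Riemannian methods analyzed in the paper, not the authors' implementation; the function names, the chosen ranks, and the step size are illustrative assumptions.

import numpy as np

def hosvd_truncate(T, ranks):
    # Leading left singular vectors of each mode-k unfolding (higher-order SVD).
    factors = []
    for mode, r in enumerate(ranks):
        unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :r])
    # Compress to the core, then expand back: a tensor of Tucker rank at most `ranks`.
    core = T
    for U in factors:
        core = np.tensordot(core, U, axes=([0], [0]))   # contracted mode moves to the back
    out = core
    for U in factors:
        out = np.tensordot(out, U, axes=([0], [1]))
    return out

def low_rank_gd(A, y, shape, ranks, step, iters=200):
    # Illustrative sketch (not the authors' RGD): gradient descent on the
    # least-squares loss, followed by an HOSVD retraction to the rank constraint.
    n = len(y)
    A_mat = A.reshape(n, -1)
    T = np.zeros(shape)
    for _ in range(iters):
        residual = A_mat @ T.ravel() - y                   # <A_i, T> - y_i
        grad = (A_mat.T @ residual).reshape(shape)         # Euclidean gradient
        T = hosvd_truncate(T - step * grad, ranks)         # retract to low Tucker rank
    return T

# Tiny synthetic usage example (noiseless scalar responses); sizes are arbitrary.
rng = np.random.default_rng(0)
p, r, n = 6, 2, 800
U_true = [rng.standard_normal((p, r)) for _ in range(3)]
core_true = rng.standard_normal((r, r, r))
T_star = np.einsum('abc,ia,jb,kc->ijk', core_true, *U_true)
A = rng.standard_normal((n, p, p, p))
y = A.reshape(n, -1) @ T_star.ravel()
T_hat = low_rank_gd(A, y, (p, p, p), (r, r, r), step=1.0 / n)
print(np.linalg.norm(T_hat - T_star) / np.linalg.norm(T_star))

Passing ranks larger than the true rank in this sketch mimics the rank over-parameterized setting studied in the paper.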
Funding Statement
The research is supported in part by NSF Grant CAREER-2203741.
Acknowledgements
The authors would like to thank Ilias Diakonikolas and Daniel Kane for helpful discussions. Diakonikolas and Kane developed a computational lower bound in the statistical query model (which further yields a low-degree polynomial computational lower bound) for low-rank scalar-on-tensor rank-one regression before this work; the proof was later incorporated into a full paper in Diakonikolas et al. (2023). However, the low-degree polynomial computational lower bound in Theorem 9 of this paper is tighter and its proof is direct and arguably simpler. We also thank the Editor, the Associate Editor and two anonymous referees for their helpful suggestions, which helped improve the presentation and quality of this paper.
Citation
Yuetian Luo, Anru R. Zhang. "Tensor-on-tensor regression: Riemannian optimization, over-parameterization, statistical-computational gap and their interplay." Ann. Statist. 52(6), 2583–2612, December 2024. https://doi.org/10.1214/24-AOS2396