February 2024 Concentration of measure bounds for matrix-variate data with missing values
Shuheng Zhou
Bernoulli 30(1): 198-226 (February 2024). DOI: 10.3150/23-BEJ1594

Abstract

We consider the following data perturbation model, where the covariates incur multiplicative errors. For two $n \times m$ random matrices $U, X$, we denote by $U \circ X$ the Hadamard or Schur product, which is defined as $(U \circ X)_{ij} = U_{ij} X_{ij}$. In this paper, we study the subgaussian matrix variate model, where we observe the matrix variate data $X$ through a random mask $U$:

$$\mathcal{X} = U \circ X \quad \text{where} \quad X = B^{1/2} Z A^{1/2},$$

where $Z$ is a random matrix with independent subgaussian entries, and $U$ is a mask matrix with either zero or positive entries, where $\mathbb{E} U_{ij} \in [0,1]$ and all entries are mutually independent. Under the assumption of independence between $U$ and $X$, we introduce componentwise unbiased estimators for estimating covariance $A$ and $B$, and prove the concentration of measure bounds in the sense of guaranteeing the restricted eigenvalue (RE) conditions to hold on the unbiased estimator for $B$, when columns of data matrix $X$ are sampled with different rates. We further develop multiple regression methods for estimating the inverse of $B$ and show statistical rate of convergence. Our results provide insight for sparse recovery for relationships among entities (samples, locations, items) when features (variables, time points, user ratings) are present in the observed data matrix $\mathcal{X}$ with heterogeneous rates. Our proof techniques can certainly be extended to other scenarios. We provide simulation evidence illuminating the theoretical predictions.
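
To make the masked model concrete, the following is a minimal simulation sketch; it is not the estimator defined in the paper, whose exact form the abstract does not give. It assumes Gaussian entries for $Z$ (one subgaussian choice), a 0/1 Bernoulli mask whose column $j$ is observed with a known rate $p_j$, and a known diagonal of $A$. Under those assumptions, $\mathbb{E}[X_{ij} X_{kj}] = B_{ik} A_{jj}$, so rescaling the off-diagonal entries of the observed Gram matrix by $\sum_j A_{jj} p_j^2$ and the diagonal entries by $\sum_j A_{jj} p_j$ gives a componentwise unbiased estimate of $B$. The function names `sqrtm_psd`, `simulate_masked`, and `estimate_B` are hypothetical.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def simulate_masked(A, B, p, rng):
    """Draw X = B^{1/2} Z A^{1/2} and a Bernoulli mask U; return (U * X, U).

    Assumption: Z has i.i.d. standard Gaussian entries and column j of U
    is kept with probability p[j], independently of X.
    """
    n, m = B.shape[0], A.shape[0]
    Z = rng.standard_normal((n, m))
    X = sqrtm_psd(B) @ Z @ sqrtm_psd(A)
    U = (rng.random((n, m)) < p).astype(float)  # p broadcasts over rows
    return U * X, U

def estimate_B(X_obs, p, a_diag):
    """Illustrative componentwise rescaling of X_obs X_obs^T.

    Assumption: the sampling rates p and diag(A) are known.
    """
    G = X_obs @ X_obs.T
    off_scale = np.sum(a_diag * p**2)  # E[U_ij U_kj] = p_j^2 for i != k
    diag_scale = np.sum(a_diag * p)    # E[U_ij^2] = p_j for a 0/1 mask
    B_hat = G / off_scale
    np.fill_diagonal(B_hat, np.diag(G) / diag_scale)
    return B_hat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 4, 300
    idx = np.arange(n)
    B = 0.5 ** np.abs(idx[:, None] - idx[None, :])  # AR(1)-type row covariance
    A = np.eye(m)                                   # column covariance
    p = rng.uniform(0.5, 1.0, size=m)               # heterogeneous column rates
    X_obs, _ = simulate_masked(A, B, p, rng)
    print(np.round(estimate_B(X_obs, p, np.diag(A)), 2))
```

With all rates equal to one the mask is trivial and the sketch reduces to $X X^T / \operatorname{tr}(A)$; with heterogeneous rates, the separate diagonal and off-diagonal scalings are what keep each entry unbiased.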

Acknowledgement

This paper is dedicated to the memory of my father. I would like to thank Tailen Hsing, Po-Ling Loh, and Mark Rudelson for helpful discussions and my family for their support. I thank Mark Rudelson for allowing me to present Theorem 3.12 in this paper. The author thanks the Editor Davy Paindaveine, an Associate Editor and two anonymous referees for their valuable comments and suggestions. An initial version of the manuscript, titled “The Tensor Quadratic Forms”, was posted as preprint arXiv:2008.03244 in August 2020.

Citation

Shuheng Zhou. "Concentration of measure bounds for matrix-variate data with missing values." Bernoulli 30 (1) 198 - 226, February 2024. https://doi.org/10.3150/23-BEJ1594

Information

Received: 1 June 2022; Published: February 2024
First available in Project Euclid: 8 November 2023

MathSciNet: MR4665575
zbMATH: 07788881
Digital Object Identifier: 10.3150/23-BEJ1594

Keywords: concentration of measure, covariance estimation, inverse covariance estimation, matrix variate data, missing values, multiple regression, restricted eigenvalue conditions, space-time model, sparse Hanson-Wright inequality, sparse quadratic forms, subgaussian concentration, subsampling

Vol.30 • No. 1 • February 2024