Abstract
This paper studies multi-task high-dimensional linear regression models where the noise among different tasks is correlated, in the moderately high-dimensional regime where the sample size n and the dimension p are of the same order. Our goal is to estimate the covariance matrix of the noise random vectors, or equivalently the correlation of the noise variables on any pair of tasks. Treating the regression coefficients as a nuisance parameter, we leverage the multi-task elastic-net and multi-task lasso estimators to estimate the nuisance. By precisely understanding the bias of the squared residual matrix and by correcting this bias, we develop a novel estimator of the noise covariance that converges in Frobenius norm at the rate n^{-1/2} when the covariates are Gaussian distributed with a known covariance matrix. This novel estimator is efficiently computable. Under suitable conditions, the proposed estimator of the noise covariance attains the same rate of convergence as the “oracle” estimator that knows the regression coefficients of the multi-task model in advance. The Frobenius error bounds obtained in this paper also illustrate the advantage of this new estimator over a method-of-moments estimator that does not attempt to estimate the nuisance. As byproducts of our techniques, we obtain estimates of the generalization error and out-of-sample error of the multi-task elastic-net and multi-task lasso estimators. Extensive simulation studies are carried out to illustrate the numerical performance of the proposed method.
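To make the setting concrete, the following is a minimal simulation sketch of a multi-task linear model with correlated noise, together with the "oracle" benchmark mentioned above. It is not the bias-corrected estimator proposed in the paper. It assumes the oracle estimator is the sample covariance of the true noise, which is one natural reading of "knows the regression coefficients in advance"; the function names, dimensions, and the noise covariance `S` are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_multitask(n, p, T, S, rng):
    """Draw (X, Y, B) from Y = X B + E, where rows of E have covariance S.

    Assumptions for illustration: identity covariate covariance and a
    coefficient matrix B with only 5 nonzero rows.
    """
    X = rng.standard_normal((n, p))          # Gaussian covariates
    B = np.zeros((p, T))
    B[:5, :] = rng.standard_normal((5, T))   # a few active features shared by all tasks
    L = np.linalg.cholesky(S)                # S = L L^T
    E = rng.standard_normal((n, T)) @ L.T    # rows of E have covariance S
    Y = X @ B + E
    return X, Y, B

def oracle_noise_covariance(X, Y, B):
    """Sample covariance of the true noise, computable only if B is known."""
    E = Y - X @ B
    return E.T @ E / X.shape[0]

rng = np.random.default_rng(0)
# Hypothetical 3-task noise covariance (positive definite).
S = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.0]])
# n and p of the same order, as in the regime described in the abstract.
X, Y, B = simulate_multitask(n=2000, p=1000, T=3, S=S, rng=rng)
S_oracle = oracle_noise_covariance(X, Y, B)
frob_err = np.linalg.norm(S_oracle - S, "fro")  # Frobenius error of the oracle
```

In practice `B` is unknown, so the oracle is only a benchmark; per the abstract, the paper's estimator replaces `B` with a multi-task lasso or elastic-net fit and corrects the resulting bias in the squared residual matrix.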
Funding Statement
Pierre C. Bellec is partially supported by the NSF Grants DMS-1811976 and DMS-1945428. Gabriel Romon is supported by a PhD scholarship from CREST.
Acknowledgements
The authors would like to thank the anonymous referees, an Associate Editor and the Editor for their constructive comments that improved the quality of this paper. The authors acknowledge the Office of Advanced Research Computing (OARC) at Rutgers, The State University of New Jersey for providing access to the Amarel cluster and associated research computing resources that have contributed to the results reported here. URL: https://oarc.rutgers.edu.
Citation
Kai Tan, Gabriel Romon, Pierre C. Bellec. "Noise covariance estimation in multi-task high-dimensional linear models." Bernoulli 30 (3) 1695-1722, August 2024. https://doi.org/10.3150/23-BEJ1644