Open Access
February 2020
Optimal prediction in the linearly transformed spiked model
Edgar Dobriban, William Leeb, Amit Singer
Ann. Statist. 48(1): 491-513 (February 2020). DOI: 10.1214/19-AOS1819


We consider the linearly transformed spiked model, where the observations $Y_{i}$ are noisy linear transforms of unobserved signals of interest $X_{i}$: \begin{equation*}Y_{i}=A_{i}X_{i}+\varepsilon_{i},\end{equation*} for $i=1,\ldots ,n$. The transform matrices $A_{i}$ are also observed. We model the unobserved signals (or regression coefficients) $X_{i}$ as vectors lying on an unknown low-dimensional space. Given only $Y_{i}$ and $A_{i}$, how should we predict or recover their values?

The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting $X_{i}$ by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions.
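To make the contrast between per-observation regression and pooling across samples concrete, here is a minimal NumPy sketch for the missing-data special case, where each $A_{i}$ is a diagonal 0/1 mask. Everything in it is an illustrative assumption rather than the paper's estimator: the rank-one signal, the Gaussian noise, the parameter values, and the function names (`simulate`, `naive_predict`, `pooled_rank_one_predict`) are hypothetical, and the pooled predictor is a plain rank-one projection of the backprojected data $A_{i}^{\top}Y_{i}$, not the optimally shrunken linear empirical Bayes predictor developed in the paper.

```python
import numpy as np


def simulate(n=2000, p=200, ell=50.0, delta=0.5, sigma=1.0, seed=0):
    """Draw (X, M, Y) from Y_i = A_i X_i + eps_i with A_i = diag(M_i).

    Assumed setup (not from the paper): rank-one signals X_i = sqrt(ell) * z_i * u,
    Bernoulli(delta) masks, and i.i.d. Gaussian noise with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(p)
    u /= np.linalg.norm(u)                      # unit-norm signal direction
    z = rng.standard_normal(n)                  # per-sample coefficients
    X = np.sqrt(ell) * z[:, None] * u[None, :]  # n x p matrix of signals
    M = rng.random((n, p)) < delta              # diagonal entries of each A_i
    Y = M * X + sigma * rng.standard_normal((n, p))
    return X, M, Y


def naive_predict(M, Y):
    """Per-sample least squares: pinv(A_i) @ Y_i for a diagonal 0/1 mask A_i.

    Observed coordinates keep their noisy value; unobserved ones are set to zero.
    """
    return M * Y


def pooled_rank_one_predict(M, Y):
    """Pool all samples to estimate one direction, then project each sample.

    A simple stand-in for "borrowing strength", not the paper's method.
    """
    B = M * Y                                   # backprojection A_i^T Y_i
    delta_hat = M.mean()                        # estimated sampling rate
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    u_hat = Vt[0]                               # top right singular vector
    scores = (B @ u_hat) / delta_hat            # rescale to undo the masking
    return scores[:, None] * u_hat[None, :]


def mse(X_hat, X):
    """Average squared prediction error per sample."""
    return float(np.mean(np.sum((X_hat - X) ** 2, axis=1)))


if __name__ == "__main__":
    X, M, Y = simulate()
    print("naive  MSE:", mse(naive_predict(M, Y), X))
    print("pooled MSE:", mse(pooled_rank_one_predict(M, Y), X))
```

Under these assumed settings the naive predictor pays for the noise in every observed coordinate and for the signal in every missing one, while the pooled predictor keeps only the noise that falls along a single estimated direction. That is the intuition behind borrowing strength across samples.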

We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods.



Edgar Dobriban, William Leeb, Amit Singer. "Optimal prediction in the linearly transformed spiked model." Ann. Statist. 48 (1) 491 - 513, February 2020.


Received: 1 September 2017; Revised: 1 January 2019; Published: February 2020
First available in Project Euclid: 17 February 2020

zbMATH: 07196548
MathSciNet: MR4065171
Digital Object Identifier: 10.1214/19-AOS1819

Primary: 62H25
Secondary: 45B05, 62H15

Keywords: high dimensional, matrix completion, missing data, principal component analysis, random matrix theory, shrinkage, spiked model

Rights: Copyright © 2020 Institute of Mathematical Statistics

