Institute of Mathematical Statistics Collections

Improved matrix uncertainty selector

Mathieu Rosenbaum and Alexandre B. Tsybakov

Full-text: Open access

Abstract

We consider the regression model with observation error in the design:

\begin{eqnarray*}
y &=& X\theta^*+\xi,\\
Z &=& X+\Xi.
\end{eqnarray*}

Here the random vector $y\in\mathbb{R}^n$ and the random $n\times p$ matrix $Z$ are observed, the $n\times p$ matrix $X$ is unknown, $\Xi$ is an $n\times p$ random noise matrix, $\xi\in\mathbb{R}^n$ is a random noise vector, and $\theta^*$ is a vector of unknown parameters to be estimated. We consider the setting where the dimension $p$ can be much larger than the sample size $n$ and $\theta^*$ is sparse. Because of the presence of the noise matrix $\Xi$, the commonly used Lasso and Dantzig selector are unstable. An alternative procedure, called the Matrix Uncertainty (MU) selector, has been proposed in Rosenbaum and Tsybakov [The Annals of Statistics 38 (2010) 2620–2651] in order to account for the noise. The properties of the MU selector have been studied in that paper for sparse $\theta^*$ under the assumption that the noise matrix $\Xi$ is deterministic and its entries are small. In this paper, we propose a modification of the MU selector for the case where $\Xi$ is a random matrix with zero-mean entries whose variances can be estimated. This is, for example, the case in the model where the entries of $X$ are missing at random. We show both theoretically and numerically that, under these conditions, the new estimator, called the Compensated MU selector, achieves better estimation accuracy than the original MU selector.
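The abstract does not detail the construction, but the name "Compensated" can be motivated by a short calculation (a sketch under additional assumptions not stated above: the entries of $\Xi$ are zero-mean, mutually independent, independent of $\xi$, and have a common variance $\sigma_j^2$ within the $j$-th column). The Gram matrix formed from the observed design $Z$ is then biased on its diagonal:

\begin{eqnarray*}
\mathbb{E}\Big[\frac{1}{n}Z^{T}Z\Big] &=& \frac{1}{n}X^{T}X+\mathrm{diag}(\sigma_1^2,\ldots,\sigma_p^2).
\end{eqnarray*}

Consequently, plugging $Z$ in place of $X$ into Lasso- or Dantzig-type residual constraints introduces an additive bias term of the form $\mathrm{diag}(\sigma_1^2,\ldots,\sigma_p^2)\,\theta^*$; subtracting an estimate of this diagonal term compensates for the bias, which is the sense in which the estimator is "compensated". The precise constraint set and its analysis are given in the paper.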

Chapter information

Source
Banerjee, M., Bunea, F., Huang, J., Koltchinskii, V., and Maathuis, M. H., eds., From Probability to Statistics and Back: High-Dimensional Models and Processes -- A Festschrift in Honor of Jon A. Wellner (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2013), 276–290

Dates
First available in Project Euclid: 8 March 2013

Permanent link to this document
https://projecteuclid.org/euclid.imsc/1362751194

Digital Object Identifier
doi:10.1214/12-IMSCOLL920

Mathematical Reviews number (MathSciNet)
MR3202640

Zentralblatt MATH identifier
1327.62410

Subjects
Primary: 62J05: Linear regression
Secondary: 62F12: Asymptotic properties of estimators

Keywords
Sparsity; MU selector; matrix uncertainty; errors-in-variables model; measurement error; restricted eigenvalue assumption; missing data

Rights
Copyright © 2010, Institute of Mathematical Statistics

Citation

Rosenbaum, Mathieu; Tsybakov, Alexandre B. Improved matrix uncertainty selector. From Probability to Statistics and Back: High-Dimensional Models and Processes -- A Festschrift in Honor of Jon A. Wellner, 276--290, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2013. doi:10.1214/12-IMSCOLL920. https://projecteuclid.org/euclid.imsc/1362751194

References

  • [1] Belloni, A. and Chernozhukov, V. (2011). High dimensional sparse econometric models: an introduction. In: Inverse Problems and High Dimensional Estimation, Stats in the Château 2009 (Alquier, P., Gautier, E. and Stoltz, G., eds.). Lecture Notes in Statistics 203 127–162. Springer, Berlin.
  • [2] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics 37 1705–1732.
  • [3] Bühlmann, P. and van de Geer, S. A. (2011). Statistics for High-Dimensional Data. Springer, New York.
  • [4] Bunea, F., Tsybakov, A. B. and Wegkamp, M. H. (2007). Aggregation for Gaussian regression. The Annals of Statistics 35 1674–1697.
  • [5] Bunea, F., Tsybakov, A. B. and Wegkamp, M. H. (2007). Sparsity oracle inequalities for the Lasso. Electronic Journal of Statistics 1 169–194.
  • [6] Candès, E. J. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$ (with discussion). The Annals of Statistics 35 2313–2404.
  • [7] Gautier, E. and Tsybakov, A. B. (2011). High-dimensional instrumental variables regression and confidence sets. arXiv:1105.2454
  • [8] Koltchinskii, V. (2009). Dantzig selector and sparsity oracle inequalities. Bernoulli 15 799–828.
  • [9] Koltchinskii, V. (2011). Oracle inequalities in empirical risk minimization and sparse recovery problems. École d’Été de Probabilités de Saint-Flour 2008. Lecture Notes in Mathematics 2033.
  • [10] Lounici, K. (2008). Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electronic Journal of Statistics 2 90–102.
  • [11] Petrov, V. V. (1995). Sums of Independent Random Variables. Oxford University Press.
  • [12] Rosenbaum, M. and Tsybakov, A. B. (2010). Sparse recovery under matrix uncertainty. The Annals of Statistics 38 2620–2651.