Open Access
April 2013
On the conditional distributions of low-dimensional projections from high-dimensional data
Hannes Leeb
Ann. Statist. 41(2): 464-483 (April 2013). DOI: 10.1214/12-AOS1081


We study the conditional distribution of low-dimensional projections from high-dimensional data, where the conditioning is on other low-dimensional projections. To fix ideas, consider a random $d$-vector $Z$ that has a Lebesgue density and that is standardized so that $\mathbb{E} Z=0$ and $\mathbb{E} ZZ'=I_{d}$. Moreover, consider two projections defined by unit-vectors $\alpha$ and $\beta$, namely a response $y=\alpha'Z$ and an explanatory variable $x=\beta'Z$. It has long been known that the conditional mean of $y$ given $x$ is approximately linear in $x$, under some regularity conditions; cf. Hall and Li [Ann. Statist. 21 (1993) 867–889]. However, a corresponding result for the conditional variance has not been available so far. We here show that the conditional variance of $y$ given $x$ is approximately constant in $x$ (again, under some regularity conditions). These results hold uniformly in $\alpha$ and for most $\beta$’s, provided only that the dimension of $Z$ is large. In that sense, we see that most linear submodels of a high-dimensional overall model are approximately correct. Our findings provide new insights in a variety of modeling scenarios. We discuss several examples, including sliced inverse regression, sliced average variance estimation, generalized linear models under potential link violation, and sparse linear modeling.
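The two claims in the abstract can be illustrated numerically. The following is a minimal simulation sketch, not the paper's method: it assumes $Z$ has i.i.d. centered exponential components (so $\mathbb{E} Z=0$ and $\mathbb{E} ZZ'=I_{d}$), draws $\beta$ at random, and builds $\alpha$ so that $\alpha'\beta=0.6$. The dimension, sample size, component distribution, bin count, and function names are all arbitrary choices made for illustration. If the approximations hold, the residual $y-(\alpha'\beta)x$ should have mean near $0$ and variance near $1-(\alpha'\beta)^2$ in every bin of $x$.

```python
import numpy as np


def simulate_projections(d=200, n=20_000, rho=0.6, seed=0):
    """Draw n copies of a standardized, skewed d-vector Z and project it.

    All settings here are illustrative assumptions, not values from the paper.
    Returns y = alpha'Z, x = beta'Z, and the inner product alpha'beta.
    """
    rng = np.random.default_rng(seed)
    # i.i.d. centered exponential entries: mean 0, variance 1, non-Gaussian.
    Z = rng.exponential(1.0, size=(n, d)) - 1.0
    # A random unit vector beta.
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)
    # A unit vector u orthogonal to beta (one Gram-Schmidt step).
    u = rng.standard_normal(d)
    u -= (u @ beta) * beta
    u /= np.linalg.norm(u)
    # A unit vector alpha with alpha'beta = rho.
    alpha = rho * beta + np.sqrt(1.0 - rho**2) * u
    return Z @ alpha, Z @ beta, float(alpha @ beta)


def binned_residual_stats(y, x, slope, n_bins=10):
    """Mean and variance of y - slope * x within equal-count bins of x."""
    resid = y - slope * x
    order = np.argsort(x)
    chunks = np.array_split(resid[order], n_bins)
    means = np.array([c.mean() for c in chunks])
    variances = np.array([c.var() for c in chunks])
    return means, variances


y, x, slope = simulate_projections()
means, variances = binned_residual_stats(y, x, slope)
print("alpha'beta          :", round(slope, 3))
print("target variance     :", round(1.0 - slope**2, 3))
print("bin residual means  :", np.round(means, 3))
print("bin residual vars   :", np.round(variances, 3))
```

Roughly flat residual means across bins correspond to an approximately linear conditional mean, and roughly equal residual variances correspond to an approximately constant conditional variance. The paper's statements are asymptotic in the dimension and hold for most $\beta$ under regularity conditions, so a single finite-sample run like this one is only suggestive.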



Hannes Leeb. "On the conditional distributions of low-dimensional projections from high-dimensional data." Ann. Statist. 41 (2) 464 - 483, April 2013.


Published: April 2013
First available in Project Euclid: 16 April 2013

zbMATH: 1360.62371
MathSciNet: MR3099110
Digital Object Identifier: 10.1214/12-AOS1081

Primary: 60F99
Secondary: 62H99

Keywords: Dimension reduction, high-dimensional models, regression, small sample size

Rights: Copyright © 2013 Institute of Mathematical Statistics


Vol. 41 • No. 2 • April 2013