Impacts of high dimensionality in finite samples

Jinchi Lv

doi:10.1214/13-AOS1149

Ann. Statist. 41(4): 2236-2262 (August 2013). DOI: 10.1214/13-AOS1149

Abstract

High-dimensional data sets are commonly collected in many contemporary applications arising in various fields of scientific research. We present two views of finite samples in high dimensions: a probabilistic one and a nonprobabilistic one. With the probabilistic view, we establish the concentration property and robust spark bound for large random design matrices generated from elliptical distributions, with the former related to the sure screening property and the latter related to sparse model identifiability. An interesting concentration phenomenon in high dimensions is revealed. With the nonprobabilistic view, we derive general bounds on dimensionality with some distance constraint on sparse models. These results provide new insights into the impacts of high dimensionality in finite samples.