The Annals of Applied Statistics

A multivariate adaptive stochastic search method for dimensionality reduction in classification

Tian Siva Tian, Gareth M. James, and Rand R. Wilcox
Source: Ann. Appl. Stat. Volume 4, Number 1 (2010), 340-365.

Abstract

High-dimensional classification has become an increasingly important problem. In this paper we propose a “Multivariate Adaptive Stochastic Search” (MASS) approach, which first reduces the dimension of the data space and then applies a standard classification method to the reduced space. One key advantage of MASS is that it automatically adapts to mimic variable-selection methods such as the Lasso, variable-combination methods such as PCA, or methods that combine the two approaches. This adaptivity allows MASS to perform well in situations where pure variable-selection or variable-combination methods fail. Another major advantage is that MASS can accurately project the data into very low-dimensional nonlinear, as well as linear, spaces. In each iteration, MASS uses a stochastic search algorithm to select a handful of optimal projection directions from a large number of random directions. We provide some theoretical justification for MASS and demonstrate its strengths in an extensive range of simulation studies and on real-world data sets, comparing it with many classical and modern classification methods.
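
The idea of keeping a few good projection directions out of many random candidates in each iteration can be illustrated with a small toy sketch. This is not the authors' MASS algorithm, whose details the abstract does not give. It is a simplified stand-in built on stated assumptions: candidate directions are random unit vectors whose coordinates are zeroed with probability `sparsity` (so they range from near variable selection to dense variable combination), each direction is scored on its own by the training accuracy of a nearest-centroid rule, and the best `d` directions survive to the next iteration. All function names, parameters, and the toy data generator are hypothetical.

```python
import math
import random


def random_direction(p, sparsity, rng):
    """Draw a unit vector in R^p; each coordinate is zeroed with probability `sparsity`."""
    v = [0.0 if rng.random() < sparsity else rng.gauss(0.0, 1.0) for _ in range(p)]
    if not any(v):  # avoid the all-zero vector
        v[rng.randrange(p)] = 1.0
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(X, direction):
    """Project every row of X onto a single direction (1-D scores)."""
    return [sum(a * x for a, x in zip(direction, row)) for row in X]


def centroid_accuracy(z, y):
    """Training accuracy of a nearest-centroid rule on 1-D scores z (assumed criterion)."""
    means = {}
    for label in set(y):
        vals = [zi for zi, yi in zip(z, y) if yi == label]
        means[label] = sum(vals) / len(vals)
    hits = sum(
        min(means, key=lambda c: abs(zi - means[c])) == yi for zi, yi in zip(z, y)
    )
    return hits / len(y)


def stochastic_search(X, y, d=2, n_candidates=50, n_iter=5, sparsity=0.7, seed=0):
    """Keep the d best-scoring directions seen so far, refreshing candidates each iteration."""
    rng = random.Random(seed)
    p = len(X[0])
    kept = []  # list of (score, direction) pairs, best first
    for _ in range(n_iter):
        fresh = [random_direction(p, sparsity, rng) for _ in range(n_candidates)]
        scored = [(centroid_accuracy(project(X, a), y), a) for a in fresh]
        # Survivors compete with the fresh candidates, so the best score never drops.
        kept = sorted(kept + scored, key=lambda t: t[0], reverse=True)[:d]
    return kept


def make_toy_data(n_per_class=40, p=10, shift=2.0, seed=1):
    """Two Gaussian classes that differ only in the mean of the first coordinate."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, -shift), (1, shift)):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(p)]
            row[0] += mean
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for score, direction in stochastic_search(X, y):
        print(round(score, 3), [round(c, 2) for c in direction])
```

In this toy setting the surviving directions tend to load heavily on the first coordinate, the only informative one. A fuller treatment would score the directions jointly and apply a standard classifier to the reduced space, as the abstract describes.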

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1273584458
Digital Object Identifier: doi:10.1214/09-AOAS284
Zentralblatt MATH identifier: 1189.62106
Mathematical Reviews number (MathSciNet): MR2758175

2013 © Institute of Mathematical Statistics
