Electronic Journal of Statistics
- Electron. J. Statist.
- Volume 4 (2010), 1055-1096.
Sparse regression with exact clustering
Abstract
This paper studies a generic sparse regression problem with a customizable sparsity pattern matrix, motivated by, but not limited to, a supervised gene clustering problem in microarray data analysis. The clustered lasso method is proposed with the l1-type penalties imposed on both the coefficients and their pairwise differences. Somewhat surprisingly, it behaves differently than the lasso or the fused lasso – the exact clustering effect expected from the l1 penalization is rarely seen in applications. An asymptotic study is performed to investigate the power and limitations of the l1-penalty in sparse regression. We propose to combine data-augmentation and weights to improve the l1 technique. To address the computational issues in high dimensions, we successfully generalize a popular iterative algorithm both in practice and in theory and propose an ‘annealing’ algorithm applicable to generic sparse regressions (including the fused/clustered lasso). Some effective accelerating techniques are further investigated to boost the convergence. The accelerated annealing (AA) algorithm, involving only matrix multiplications and thresholdings, can handle a large design matrix as well as a large sparsity pattern matrix.
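As a rough illustration of the penalty structure described above, the sketch below evaluates an l1 penalty on the coefficients plus an l1 penalty on all of their pairwise differences, added to a least-squares loss. The squared-error loss, the function names, and the two tuning parameters `lam1` and `lam2` are assumptions made for this example; the paper's exact formulation, its weighted and data-augmented variants, and the accelerated annealing algorithm are not reproduced here.

```python
from itertools import combinations


def clustered_lasso_penalty(beta, lam1, lam2):
    """l1 penalty on coefficients plus l1 penalty on pairwise differences.

    lam1 and lam2 are hypothetical tuning-parameter names for this sketch.
    """
    # Sparsity term: sum of |beta_j|
    sparsity = sum(abs(b) for b in beta)
    # Clustering term: sum of |beta_i - beta_j| over all pairs i < j
    clustering = sum(abs(bi - bj) for bi, bj in combinations(beta, 2))
    return lam1 * sparsity + lam2 * clustering


def clustered_lasso_objective(X, y, beta, lam1, lam2):
    """Half the residual sum of squares plus the penalty above.

    X is a list of rows, y a list of responses (assumed least-squares loss).
    """
    fitted = [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 0.5 * rss + clustered_lasso_penalty(beta, lam1, lam2)


if __name__ == "__main__":
    X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y = [2.0, 1.0, 0.0]
    beta = [1.0, 1.0, 0.0]
    print(clustered_lasso_objective(X, y, beta, lam1=0.5, lam2=0.1))
```

Coefficients that share an identical value contribute nothing to the pairwise-difference term, which is the sense in which such a penalty can encourage exact clustering; a coefficient equal to zero likewise contributes nothing to the sparsity term.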
Article information
Source
Electron. J. Statist. Volume 4 (2010), 1055-1096.
Dates
First available in Project Euclid: 12 October 2010
Permanent link to this document
http://projecteuclid.org/euclid.ejs/1286889184
Digital Object Identifier
doi:10.1214/10-EJS578
Mathematical Reviews number (MathSciNet)
MR2727453
Zentralblatt MATH identifier
1329.62327
Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Keywords
Sparsity; clustering; thresholding; lasso
Citation
She, Yiyuan. Sparse regression with exact clustering. Electron. J. Statist. 4 (2010), 1055-1096. doi:10.1214/10-EJS578. http://projecteuclid.org/euclid.ejs/1286889184.

