We consider the problem of estimating a sparse multi-response
regression function, with an application to expression
quantitative trait locus (eQTL) mapping, where the goal is to
discover genetic variations that influence gene-expression
levels. In particular, we investigate a shrinkage technique
capable of capturing a given hierarchical structure over the
responses, such as a hierarchical clustering tree with leaf
nodes for responses and internal nodes for clusters of related
responses at multiple granularities, and we seek to leverage this
structure to recover covariates relevant to each
hierarchically defined cluster of responses. We propose a
tree-guided group lasso, or tree lasso, for estimating
such structured sparsity under multi-response regression by
employing a novel penalty function constructed from the tree. We
describe a systematic weighting scheme for the overlapping
groups in the tree penalty such that each regression coefficient
is penalized in a balanced manner, even though overlaps among
groups cause coefficients to belong to different numbers of
groups. For efficient optimization, we
employ a smoothing proximal gradient method that was originally
developed for a general class of structured-sparsity-inducing
penalties. Using simulated and yeast data sets, we demonstrate
that our method achieves superior performance in terms of both
prediction error and recovery of true sparsity patterns,
compared with other methods for multi-response regression.
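
To make the idea of a penalty constructed from a tree concrete, the following is a minimal sketch, assuming the penalty takes the usual overlapping group-lasso form: a weighted sum, over tree nodes and covariates, of the L2 norm of the coefficients for the responses under each node. The function name, the toy tree and the uniform weights are hypothetical illustrations; the balanced weighting scheme mentioned above is not reproduced here.

```python
import numpy as np

def tree_group_penalty(B, groups, weights, lam=1.0):
    """Overlapping group-lasso penalty with groups taken from tree nodes.

    B       : (J, K) coefficient matrix, J covariates by K responses.
    groups  : list of response-index lists, one per tree node (assumed layout).
    weights : one nonnegative weight per node (placeholder values here).
    lam     : overall regularization strength.
    """
    total = 0.0
    for idx, w in zip(groups, weights):
        # For every covariate, take the L2 norm of its coefficients
        # restricted to the responses under this node, then sum over covariates.
        total += w * np.linalg.norm(B[:, idx], axis=1).sum()
    return lam * total

# Hypothetical tree over 3 responses: three leaves, one internal node, one root.
groups = [[0], [1], [2], [0, 1], [0, 1, 2]]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]  # uniform placeholders, not a balanced scheme

B = np.array([[3.0, 4.0, 0.0],   # covariate 0 affects responses 0 and 1
              [0.0, 0.0, 0.0]])  # covariate 1 is irrelevant to all responses

penalty = tree_group_penalty(B, groups, weights)
print(penalty)
```

With uniform weights, a coefficient for response 0 appears in three groups while one for response 2 appears in only two, which illustrates why a weighting scheme is needed to balance the penalization across coefficients.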