The Annals of Applied Statistics

Discussion of: Treelets—An adaptive multi-scale basis for sparse unordered data

Nicolai Meinshausen and Peter Bühlmann

Abstract

We congratulate Lee, Nadler and Wasserman (henceforth LNW) on a very interesting paper on new methodology and supporting theory. Treelets seem to tackle two important problems of modern data analysis at once. For datasets with many variables, treelets give powerful predictions even if variables are highly correlated and redundant. Perhaps more importantly, the results are intuitive to interpret and yield useful insights about relevant groups of variables.

Our comments and questions include: (i) Could the success of treelets be replicated by a combination of hierarchical clustering and PCA? (ii) When choosing a suitable basis, treelets seem to be largely an unsupervised method. Could the results be even more interpretable and powerful if treelets took a supervised response variable into account? (iii) Interpretability of the result hinges on the sparsity of the final basis. Can we expect the selected groups of variables always to be small enough to be amenable to interpretation?
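
Question (i) can be made concrete with a small sketch. The code below is neither LNW's treelet algorithm nor a procedure taken from the discussion. It is a hypothetical two-stage baseline, assuming average-linkage clustering of the variables with dissimilarity 1 - |correlation|, followed by an ordinary PCA within each cluster. The function names (`cluster_variables`, `first_pc`), the choice of linkage and dissimilarity, and the toy data are all illustrative assumptions.

```python
import math
import random


def center(col):
    """Subtract the sample mean from one variable (a list of observations)."""
    mean = sum(col) / len(col)
    return [v - mean for v in col]


def corr(a, b):
    """Sample correlation of two already-centered variables."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den


def cluster_variables(cols, n_groups):
    """Average-linkage agglomerative clustering of variables.

    Assumed dissimilarity: 1 - |correlation|. Returns sorted index groups.
    """
    p = len(cols)
    dist = [[1.0 - abs(corr(cols[i], cols[j])) for j in range(p)]
            for i in range(p)]
    groups = [[j] for j in range(p)]
    while len(groups) > n_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = sum(dist[i][j] for i in groups[a] for j in groups[b])
                d /= len(groups[a]) * len(groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]  # merge the two closest groups
        del groups[b]
    return [sorted(g) for g in groups]


def first_pc(cols, iters=200, seed=0):
    """Leading eigenvector of the sample covariance, by power iteration."""
    k, n = len(cols), len(cols[0])
    cov = [[sum(x * y for x, y in zip(cols[i], cols[j])) / (n - 1)
            for j in range(k)] for i in range(k)]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(k)]  # random start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data (assumed): two independent latent factors, three noisy copies each.
rng = random.Random(0)
n = 200
f1 = [rng.gauss(0, 1) for _ in range(n)]
f2 = [rng.gauss(0, 1) for _ in range(n)]
raw = [[f + rng.gauss(0, 0.3) for f in factor]
       for factor in (f1, f1, f1, f2, f2, f2)]
cols = [center(c) for c in raw]

groups = cluster_variables(cols, n_groups=2)
loadings = [first_pc([cols[j] for j in g]) for g in groups]
for g, v in zip(groups, loadings):
    print(g, [round(x, 2) for x in v])
```

In this baseline each component is supported on a single cluster only, so the sparsity of the resulting basis is governed entirely by the cluster sizes, which is the concern raised in question (iii). Unlike treelets, the clustering and the PCA steps here are not interleaved, so the sketch should be read as a point of comparison rather than an equivalent method.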

Article information

Source
Ann. Appl. Stat., Volume 2, Number 2 (2008), 478-481.

Dates
First available in Project Euclid: 3 July 2008

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1215118521

Digital Object Identifier
doi:10.1214/08-AOAS137C

Mathematical Reviews number (MathSciNet)
MR2524339

Zentralblatt MATH identifier
05591281

Citation

Meinshausen, Nicolai; Bühlmann, Peter. Discussion of: Treelets—An adaptive multi-scale basis for sparse unordered data. Ann. Appl. Stat. 2 (2008), no. 2, 478-481. doi:10.1214/08-AOAS137C. https://projecteuclid.org/euclid.aoas/1215118521

