The Annals of Statistics

Object oriented data analysis: Sets of trees

Haonan Wang and J. S. Marron



Object oriented data analysis is the statistical analysis of populations of complex objects. In the special case of functional data analysis, these data objects are curves, where standard Euclidean approaches, such as principal component analysis, have been very successful. Recent developments in medical image analysis motivate the statistical analysis of populations of more complex data objects which are elements of mildly non-Euclidean spaces, such as Lie groups and symmetric spaces, or of strongly non-Euclidean spaces, such as spaces of tree-structured data objects. These new contexts for object oriented data analysis create several potentially large new interfaces between mathematics and statistics. This point is illustrated through the careful development of a novel mathematical framework for statistical analysis of populations of tree-structured objects.

Article information

Ann. Statist. Volume 35, Number 5 (2007), 1849-1873.

First available in Project Euclid: 7 November 2007


Primary: 62H99: None of the above, but in this section
Secondary: 62G99: None of the above, but in this section

Keywords: Functional data analysis; nonlinear data space; object oriented data analysis; population of tree-structured objects; principal component analysis


Wang, Haonan; Marron, J. S. Object oriented data analysis: Sets of trees. Ann. Statist. 35 (2007), no. 5, 1849--1873. doi:10.1214/009053607000000217.


