The Annals of Applied Statistics

Tree models for difference and change detection in a complex environment

Yong Wang, Ilze Ziedins, Mark Holmes, and Neil Challands

Full-text: Open access


A new family of tree models is proposed, which we call “differential trees.” A differential tree model is constructed from multiple data sets and aims to detect distributional differences between them. The new methodology differs from the existing difference and change detection techniques in its nonparametric nature, model construction from multiple data sets, and applicability to high-dimensional data. Through a detailed study of an arson case in New Zealand, where an individual is known to have been laying vegetation fires within a certain time period, we illustrate how these models can help detect changes in the frequencies of event occurrences and uncover unusual clusters of events in a complex environment.
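As a rough illustration of the general idea of building a tree from more than one data set to locate where they differ, the sketch below recursively partitions a single numeric covariate and, at each node, picks the cut at which two samples are most unevenly distributed. This is not the authors' algorithm: the function names (`chi2_2x2`, `best_split`, `grow`), the Pearson chi-square criterion, the one-dimensional setting, and the stopping rule are all assumptions made for the example.

```python
def chi2_2x2(a_left, a_right, b_left, b_right):
    """Pearson chi-square statistic for a 2x2 table of counts (data set x side)."""
    n = a_left + a_right + b_left + b_right
    row_a, row_b = a_left + a_right, b_left + b_right
    col_l, col_r = a_left + b_left, a_right + b_right
    if 0 in (row_a, row_b, col_l, col_r):
        return 0.0  # degenerate table: no evidence of a difference
    return n * (a_left * b_right - a_right * b_left) ** 2 / (
        row_a * row_b * col_l * col_r
    )


def best_split(a, b):
    """Return (threshold, statistic) for the cut that best separates a from b."""
    values = sorted(set(a) | set(b))
    best = (None, 0.0)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate cut midway between adjacent values
        a_left = sum(x <= t for x in a)
        b_left = sum(x <= t for x in b)
        stat = chi2_2x2(a_left, len(a) - a_left, b_left, len(b) - b_left)
        if stat > best[1]:
            best = (t, stat)
    return best


def grow(a, b, min_stat=3.84, min_size=5):
    """Recursively partition while some cut shows a large enough difference.

    min_stat=3.84 is roughly the unadjusted 5% chi-square cutoff (1 d.f.);
    it ignores the search over cuts, so it is illustrative only.
    """
    node = {"n_a": len(a), "n_b": len(b)}
    if len(a) + len(b) < 2 * min_size:
        return node
    t, stat = best_split(a, b)
    if t is None or stat < min_stat:
        return node
    node.update(
        threshold=t,
        stat=stat,
        left=grow([x for x in a if x <= t], [x for x in b if x <= t],
                  min_stat, min_size),
        right=grow([x for x in a if x > t], [x for x in b if x > t],
                   min_stat, min_size),
    )
    return node
```

Calling `grow(a, b)` on two lists of event times, say, returns a nested dictionary whose internal nodes record the chosen threshold and statistic and whose leaves record the counts from each data set. Because the best cut is selected from many candidates, the fixed cutoff overstates significance; a practical procedure would need some form of $p$-value adjustment.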

Article information

Ann. Appl. Stat., Volume 6, Number 3 (2012), 1162-1184.

First available in Project Euclid: 31 August 2012

Keywords: Tree models; change detection; event data; $p$-value adjustment; arson; case study


Wang, Yong; Ziedins, Ilze; Holmes, Mark; Challands, Neil. Tree models for difference and change detection in a complex environment. Ann. Appl. Stat. 6 (2012), no. 3, 1162–1184. doi:10.1214/12-AOAS548.
