Open Access
October 2018
Change-point detection in multinomial data with a large number of categories
Guanghui Wang, Changliang Zou, Guosheng Yin
Ann. Statist. 46(5): 2020-2044 (October 2018). DOI: 10.1214/17-AOS1610


We consider a sequence of multinomial data for which the probabilities associated with the categories are subject to abrupt changes of unknown magnitudes at unknown locations. When the number of categories is comparable to or even larger than the number of subjects allocated to these categories, conventional methods such as the classical Pearson’s chi-squared test and the deviance test may not work well. Motivated by high-dimensional homogeneity tests, we propose a novel change-point detection procedure that allows the number of categories to tend to infinity. The null distribution of our test statistic is asymptotically normal and the test performs well with finite samples. The number of change-points is determined by minimizing a penalized objective function based on segmentation, and the locations of the change-points are estimated by minimizing the objective function with the dynamic programming algorithm. Under some mild conditions, the consistency of the estimators of multiple change-points is established. Simulation studies show that the proposed method performs satisfactorily for identifying change-points in terms of power and estimation accuracy, and it is illustrated with an analysis of a real data set.
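The segmentation step described in the abstract (choose the number of change-points with a penalty, locate them by dynamic programming) can be illustrated with a minimal sketch. It is not the authors' procedure: the paper's objective function is not given here, so the sketch swaps in the negative maximized multinomial log-likelihood as the per-segment cost, which is the kind of conventional criterion the abstract says may work poorly when categories are numerous. The names `segment_cost`, `detect_change_points`, and `penalty` are hypothetical, and the constant per-segment penalty is an assumption.

```python
import math


def segment_cost(prefix, start, end):
    """Negative maximized multinomial log-likelihood of rows start..end-1.

    ASSUMPTION: a stand-in cost, not the statistic proposed in the paper.
    """
    totals = [b - a for a, b in zip(prefix[start], prefix[end])]
    n = sum(totals)
    # Empty categories contribute nothing (0 * log 0 is taken as 0).
    return -sum(c * math.log(c / n) for c in totals if c > 0)


def detect_change_points(rows, penalty):
    """Return the sorted indices at which a new segment starts.

    rows    : list of count vectors, one per time point
    penalty : cost added per segment (hypothetical tuning constant)
    """
    T = len(rows)
    K = len(rows[0])

    # Prefix sums of counts so any segment total is one subtraction away.
    prefix = [[0] * K]
    for row in rows:
        prefix.append([p + x for p, x in zip(prefix[-1], row)])

    # best[t] = minimal penalized cost of segmenting rows 0..t-1.
    best = [0.0] + [math.inf] * T
    last = [0] * (T + 1)  # start index of the final segment ending at t
    for t in range(1, T + 1):
        for s in range(t):
            cand = best[s] + segment_cost(prefix, s, t) + penalty
            if cand < best[t]:
                best[t] = cand
                last[t] = s

    # Backtrack through the stored segment starts.
    change_points = []
    t = T
    while t > 0:
        s = last[t]
        if s > 0:
            change_points.append(s)
        t = s
    return sorted(change_points)


# Toy data: all mass on category 0, then all mass on category 1.
rows = [[10, 0, 0]] * 5 + [[0, 10, 0]] * 5
print(detect_change_points(rows, penalty=5.0))
```

The double loop costs O(T^2) segment evaluations, which is the usual price of exact dynamic programming; a larger `penalty` yields fewer estimated change-points.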



Guanghui Wang, Changliang Zou, Guosheng Yin. "Change-point detection in multinomial data with a large number of categories." Ann. Statist. 46(5): 2020-2044, October 2018.


Received: 1 December 2016; Revised: 1 July 2017; Published: October 2018
First available in Project Euclid: 17 August 2018

zbMATH: 06964324
MathSciNet: MR3845009
Digital Object Identifier: 10.1214/17-AOS1610

Primary: 62H15
Secondary: 62H12

Keywords: asymptotic normality, categorical data, high-dimensional homogeneity test, multiple change-point detection, sparse contingency table

Rights: Copyright © 2018 Institute of Mathematical Statistics

Vol.46 • No. 5 • October 2018