Abstract
Temporal text data, such as news feeds, cannot be adequately modeled by standard n-grams, which correspond to multinomial or Markov chain models. Instead, we examine the application of local n-grams to modeling time-stamped documents. We derive the asymptotic bias and variance and consider the bandwidth selection problem. Experimental results are presented on news feeds and web search query logs.
Citation
Guy Lebanon, Yang Zhao, Yanjun Zhao. "Modeling temporal text streams using the local multinomial model." Electron. J. Statist. 4, 566-584, 2010. https://doi.org/10.1214/09-EJS522
Information