Bayesian analysis of continuous-time, discrete-state-space time series is an important and challenging problem, where incomplete observation and large parameter sets call for user-defined priors based on known properties of the process. Generalized linear models have a largely unexplored potential to construct such prior distributions. We show that an important challenge with Bayesian generalized linear modelling of continuous-time Markov chains is that classical Markov chain Monte Carlo techniques are too inefficient to be practical in this setting. We address this issue using an auxiliary variable construction combined with an adaptive Hamiltonian Monte Carlo algorithm. The resulting sampling algorithm and model make it efficient, in terms of both computation and analyst’s time, to construct stochastic processes informed by prior knowledge, such as known properties of the states of the process. We demonstrate the flexibility and scalability of our framework using synthetic and real phylogenetic protein data, where a prior based on amino acid physicochemical properties is constructed to obtain accurate rate matrix estimates.
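To illustrate how a generalized linear model can induce a prior over rate matrices, the following is a minimal sketch, not the implementation used in the paper. It assumes a simple log-linear parameterization in which each off-diagonal rate is exp(<w, phi(i, j)>) and the weights w receive an independent Gaussian prior. The state property vector `PROPERTY`, the feature function `features`, and all other names are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical binary property per state (e.g. a coarse physicochemical class).
PROPERTY = [0, 0, 1, 1]


def features(i, j):
    """Hypothetical feature vector for the transition i -> j.

    Feature 0 is a bias term; feature 1 is 1 when both states share the property.
    """
    return [1.0, 1.0 if PROPERTY[i] == PROPERTY[j] else 0.0]


def glm_rate_matrix(weights, feature_fn, n_states):
    """Build a rate matrix with q[i][j] = exp(<weights, feature_fn(i, j)>) for i != j."""
    q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        for j in range(n_states):
            if i != j:
                phi = feature_fn(i, j)
                q[i][j] = math.exp(sum(w * f for w, f in zip(weights, phi)))
        # The diagonal entry is still 0 here, so this is minus the off-diagonal row sum.
        q[i][i] = -sum(q[i])
    return q


def sample_prior_weights(n_features, sigma, rng):
    """Draw weights from an assumed independent Normal(0, sigma^2) prior."""
    return [rng.gauss(0.0, sigma) for _ in range(n_features)]


rng = random.Random(0)
weights = sample_prior_weights(2, 1.0, rng)
Q = glm_rate_matrix(weights, features, len(PROPERTY))
```

Each draw of `weights` yields a valid rate matrix (positive off-diagonal rates, rows summing to zero), so a prior on a handful of weights induces a prior on all rates. Transitions with identical features share the same rate, which is how known properties of the states can tie parameters together.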
"Bayesian Analysis of Continuous Time Markov Chains with Application to Phylogenetic Modelling." Bayesian Anal. 11 (4) 1203–1237, December 2016. https://doi.org/10.1214/15-BA982