Papastamoulis Panagiotis, Furukawa Takanori, van Rhijn Norman, Bromley Michael, Bignell Elaine, Rattray Magnus
Department of Statistics, School of Information Sciences and Technology, Athens University of Economics and Business, Patision 76, 104 34 Athens, Greece.
Division of Infection, Immunity & Respiratory Medicine, Faculty of Biology, Medicine & Health, University of Manchester, Manchester, UK.
Int J Biostat. 2019 Jul 25;16(1):ijb-2018-0052. doi: 10.1515/ijb-2018-0052.
We consider the situation where a temporal process is composed of contiguous segments with differing slopes, and replicated, noise-corrupted time series measurements are observed. The unknown mean of the data-generating process is modelled as a piecewise linear function of time with an unknown number of change-points. We develop a Bayesian approach to infer the joint posterior distribution of the number and positions of the change-points as well as the unknown mean parameters. A priori, the proposed model is overparameterized in its mean parameters but, conditional on a set of change-points, only a subset of them influences the likelihood. An exponentially decreasing prior distribution on the number of change-points gives rise to a posterior distribution that concentrates on sparse representations of the underlying sequence. A Metropolis-Hastings Markov chain Monte Carlo (MCMC) sampler is constructed to approximate the posterior distribution. Our method is benchmarked using simulated data and is applied to uncover differences in the dynamics of fungal growth from imaging time course data collected from different strains. The source code is available on CRAN.
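The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is the CRAN package) and it contains no MCMC sampler. It only shows a continuous piecewise linear mean, replicated noisy observations of it, and a score that adds a Gaussian log-likelihood to an exponentially decaying penalty on the number of change-points. All function names, the Gaussian noise model, the parameterization by a starting value and per-segment slopes, and the `rate` parameter are assumptions made for illustration.

```python
import math
import random


def piecewise_linear_mean(t, change_points, start, slopes):
    """Continuous piecewise linear mean at time t (time origin at 0).

    change_points: sorted times at which the slope changes.
    slopes: one slope per segment, so len(slopes) == len(change_points) + 1.
    """
    value = start
    left = 0.0
    for cp, slope in zip(change_points, slopes):
        if t <= cp:
            return value + slope * (t - left)
        value += slope * (cp - left)  # accumulate the mean up to this change-point
        left = cp
    return value + slopes[-1] * (t - left)


def simulate_replicates(times, change_points, start, slopes, sigma, n_rep, seed=0):
    """Draw n_rep noisy copies of the mean curve (Gaussian noise is an assumption)."""
    rng = random.Random(seed)
    mean = [piecewise_linear_mean(t, change_points, start, slopes) for t in times]
    return [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(n_rep)]


def log_likelihood(data, times, change_points, start, slopes, sigma):
    """Gaussian log-likelihood summed over all replicates and time points."""
    mean = [piecewise_linear_mean(t, change_points, start, slopes) for t in times]
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for series in data:
        for y, m in zip(series, mean):
            total += const - 0.5 * ((y - m) / sigma) ** 2
    return total


def log_prior_num_change_points(k, rate=1.0):
    """Unnormalized log prior decaying exponentially in k (assumed form)."""
    return -rate * k


def log_score(data, times, change_points, start, slopes, sigma, rate=1.0):
    """Log-likelihood plus the sparsity-inducing prior term on the change-point count."""
    return (log_likelihood(data, times, change_points, start, slopes, sigma)
            + log_prior_num_change_points(len(change_points), rate))
```

For example, data simulated with one change-point can be scored under both the one-change-point configuration and a single-slope configuration. The prior term penalizes each extra change-point, so a more complex configuration is preferred only when the likelihood gain outweighs the penalty.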